Predictive AI Layers for Databases
Posted by ODSC Community, October 19, 2020
Anyone who has worked with machine learning (ML) knows that data is its fundamental ingredient. Given that a great deal of the world’s organized data already lives in databases, doesn’t it make sense to bring machine learning capabilities straight to the database itself via predictive AI layers?
Database users already have the most important skill in applied machine learning: understanding which predictive questions matter and which data is relevant to answering them.
Bringing machine learning to those who know their data best can significantly augment the capacity to solve important problems.
To do so, we have developed a concept called AI-Tables.
AI Tables
AI-Tables differ from normal tables in that they generate predictions when queried and return those predictions as if they were data stored in the table. Simply put, an AI-Table lets you use a machine learning model as if it were a normal database table, with a query that in plain SQL looks like this:
SELECT <predicted_variable> FROM <ML_model> WHERE <conditions>
Automated Machine Learning and AI Tables
Automated machine learning (AutoML) simplifies the complex machine learning process, from data acquisition to making a prediction. An AutoML platform abstracts away all the steps in between.
AI-Tables use the power of AutoML to let users train and test neural-network-based machine learning models with the SQL knowledge they already have.
The AutoML engine behind AI-Tables is powered by MindsDB's open-source, PyTorch-based platform.
On top of that, it has explainability capabilities that give users insight into the accuracy score of their predictions and let them evaluate what those predictions depend on. For example, users can estimate how adding or removing certain data would affect the effectiveness of the prediction. This can be done through database metadata queries or through a graphical user interface.
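One generic way to estimate how much a prediction depends on a column is ablation: score a model with and without that column and compare the error. The sketch below is a toy illustration in Python, not MindsDB's explainability method. The sample rows, the group-mean predictor, and the helper names loo_mae and column_impact are all made up for this example.

```python
def loo_mae(rows, features, target):
    """Leave-one-out mean absolute error of a simple group-mean predictor."""
    total_error = 0.0
    for i, row in enumerate(rows):
        others = rows[:i] + rows[i + 1:]
        key = tuple(row[f] for f in features)
        # Predict with the mean target of the other rows sharing the same feature values.
        matching = [r[target] for r in others if tuple(r[f] for f in features) == key]
        # Fall back to the mean of all other rows if no other row matches.
        pool = matching or [r[target] for r in others]
        total_error += abs(row[target] - sum(pool) / len(pool))
    return total_error / len(rows)


def column_impact(rows, features, target):
    """Increase in error when each feature is removed (larger means more important)."""
    baseline = loo_mae(rows, features, target)
    return {
        f: loo_mae(rows, [g for g in features if g != f], target) - baseline
        for f in features
    }


# Made-up sample rows, for illustration only.
rows = [
    {"transmission": "automatic", "fuel_type": "diesel", "price": 21000},
    {"transmission": "automatic", "fuel_type": "diesel", "price": 21000},
    {"transmission": "automatic", "fuel_type": "petrol", "price": 19000},
    {"transmission": "automatic", "fuel_type": "petrol", "price": 19000},
    {"transmission": "manual", "fuel_type": "diesel", "price": 13000},
    {"transmission": "manual", "fuel_type": "diesel", "price": 13000},
    {"transmission": "manual", "fuel_type": "petrol", "price": 11000},
    {"transmission": "manual", "fuel_type": "petrol", "price": 11000},
]

impact = column_impact(rows, ["transmission", "fuel_type"], "price")
print(impact)
```

In this made-up data, removing transmission raises the error far more than removing fuel_type, which is the kind of signal a user could use to decide which data is worth collecting.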
Users who want control over feature engineering for their ML models will also be able to bring their own models to MindsDB AI-Tables.
How predictive AI layers work
The whole solution consists of two important parts:
- The Machine Learning models are exposed as database tables (AI-Tables) that can be queried with SELECT statements.
- The ML model generation and training are done through a simple INSERT statement.
The following diagram illustrates this process:
Resource-intensive Machine Learning tasks such as model training run on a separate MindsDB server instance, so database performance is not affected.
To make this idea concrete, let us walk through an example.
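The two-part flow (train from a query, then ask for predictions by stating conditions) can be mimicked in a few lines of Python. This is a toy sketch for intuition only and not how MindsDB is implemented: it assumes an in-memory SQLite table, a single numeric feature, and an ordinary least-squares line in place of a neural network. The table name training_data, the predictor name demo_model, and the functions train_predictor and query_predictor are hypothetical.

```python
import sqlite3

# Toy registry that maps a predictor name to its fitted parameters.
PREDICTORS = {}


def train_predictor(conn, name, predict, feature, select_data_query):
    """Fit predict ~ slope * feature + intercept on the rows the query returns."""
    cur = conn.execute(select_data_query)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    xi, yi = columns.index(feature), columns.index(predict)
    xs = [float(r[xi]) for r in rows]
    ys = [float(r[yi]) for r in rows]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    PREDICTORS[name] = {
        "predict": predict,
        "feature": feature,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


def query_predictor(name, **conditions):
    """Return a prediction for the given conditions, like a one-row result set."""
    model = PREDICTORS[name]
    x = float(conditions[model["feature"]])
    return {model["predict"]: model["slope"] * x + model["intercept"]}


# Made-up historical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE training_data (mileage REAL, price REAL)")
conn.executemany(
    "INSERT INTO training_data VALUES (?, ?)",
    [(10000, 20000), (20000, 18000), (30000, 16000), (40000, 14000)],
)

# Step 1: "train" from a query. Step 2: query the model by stating conditions.
train_predictor(conn, "demo_model", "price", "mileage", "SELECT * FROM training_data")
prediction = query_predictor("demo_model", mileage=25000)
print(prediction)
```

The point of the sketch is the shape of the interaction: the training step takes a name, a target column, and a data query, and the prediction step takes only conditions, so the caller never touches the model itself.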
The Example of Predictive AI Layers
Imagine that you run a website that has been selling used cars for the past two years, and you want to estimate the right price for a car.
The data is stored in your database in a table called used_cars_data, where you keep a record of every car you have sold so far, with information such as price, transmission, mileage, fuel type, road tax, mpg (miles per gallon), and engine size.
Since you have historical data, you know that you could use Machine Learning to solve this problem. Wouldn’t it be nice if you could simply tell your database server to do and manage the Machine Learning parts for you?
At MindsDB we think so too! AI-Tables baked directly into your database are here to do exactly that.
For instance, with a single INSERT statement you can create a machine learning model (predictor) trained to predict ‘price’ from the data in the table used_cars_data and publish it as an AI-Table called ‘used_cars_model’:
INSERT INTO mindsdb.predictors(name, predict, select_data_query) VALUES ('used_cars_model', 'price', 'SELECT * FROM used_cars_data');
After that you can get price predictions by querying the generated ‘used_cars_model’ AI-Table, as follows:
SELECT price, confidence FROM mindsdb.used_cars_model WHERE model = "a6" AND mileage = 36203 AND transmission = "automatic" AND fueltype = "diesel" AND mpg = 64.2 AND enginesize = 2 AND year = 2016 AND tax = 20;
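If an application issues such queries with user-supplied values, it is safer to pass the values as parameters than to paste them into the SQL text. The Python sketch below builds a query of the same shape with placeholders. It assumes a database driver that uses %s placeholders (as common MySQL drivers for Python do), and the helper build_prediction_query is hypothetical, not part of MindsDB.

```python
import re

# Table and column names cannot be passed as parameters, so validate them strictly.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_prediction_query(ai_table, targets, conditions):
    """Return (sql, params) for a SELECT against an AI-Table in the mindsdb schema."""
    for name in [ai_table, *targets, *conditions]:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
    where = " AND ".join(f"{column} = %s" for column in conditions)
    sql = f"SELECT {', '.join(targets)} FROM mindsdb.{ai_table} WHERE {where}"
    return sql, tuple(conditions.values())


sql, params = build_prediction_query(
    "used_cars_model",
    ["price", "confidence"],
    {"model": "a6", "mileage": 36203, "transmission": "automatic"},
)
print(sql)
print(params)
```

The resulting pair would then be handed to the driver's execute call, for example cursor.execute(sql, params), which takes care of quoting the values.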
As you can see, with AI-Tables we aim to reduce Machine Learning mechanics to simple SQL queries, so that you can focus on the important part: thinking about what predictions you need and what data you want your ML model to learn from to make them.
How to explore AI Tables
AI-Tables currently work with the following databases, and the list is constantly growing:
We invite you to join the talk by MindsDB CEO Jorge Torres at the ODSC West 2020 conference, “What if We Could Use Machine Learning Models as Database Tables?”, to learn more about how AI-Tables work, see the demo, and ask questions.
You can also visit the MindsDB GitHub page, read the documentation, and ask questions on the community forum to learn more about predictive AI layers.
Jorge Torres is the Co-founder & CEO of MindsDB. He is also a visiting scholar at UC Berkeley researching machine learning automation and explainability. Prior to founding MindsDB, he worked for a number of data-intensive start-ups, most recently working with Aneesh Chopra (the first CTO in the US government) building data systems that analyze billions of patient records and lead to the highest savings for millions of patients. He started his work on scaling solutions using machine learning in early 2008 while working as the first full-time engineer at Couchsurfing, where he helped grow the company from a few thousand users to a few million. Jorge holds degrees in electrical engineering & computer science, including a master’s degree in computer systems (with a focus on applied Machine Learning) from the Australian National University.
