You want to be working in machine learning and artificial intelligence, but you don’t have the talent yet. You’re telling your board members that you’re using AI when you’re really just doing some basic data analysis. You feel like everyone is working in ML and you’re here just trying to find even one person that can do the job. Too late. Facebook hired them out from under you.
Just kidding, but I get the anxiety. We’re on the edge of changing the very way we think about work, and talent able to understand the concepts and build what needs to be built is in short supply. Even if you could find that one unicorn, the tasks are too big for one person to do in a day.
AutoML could change all that. Last year DataRobot raised $54 million in a Series C round, bringing its total funding to $111 million. While opinions are mixed about whether we’re on the edge of democratization or a colossal anticlimax, it’s clear the interest is there. Let’s take a look at AutoML, the hype, and the skepticism. Here’s what you need to know.
It’s Not Automated Development
Automated machine learning isn’t going to get rid of your need for data scientists and developers. Instead, it automates the process of algorithm selection, iterative modeling, hyperparameter tuning, and model assessment.
Think of it like this. You’ve got a dataset ready to go. You identify the labels and push a button to start modeling. You don’t have to choose which algorithm is appropriate. You don’t have to build the framework. You get an optimized model with little expertise needed.
It’s not going to frame your question or build your benchmarks. It will let you focus on your business problem instead of getting lost in a technical process you aren’t familiar with. You upload your dataset, handle the labeling, and then the rest is handled behind the scenes.
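To make the "push a button" idea concrete, here is a toy sketch of the loop an AutoML tool runs for you: try several algorithms, try several hyperparameter settings for each, assess every candidate with cross-validation, and keep the winner. Everything in it is an assumption made for illustration: the synthetic dataset, the two candidate algorithms, the tiny search space, and names such as `auto_select` are hypothetical and not taken from DataRobot or any other product. Real tools search far larger spaces with smarter strategies.

```python
import random
import statistics


def make_data(n=200, seed=0):
    """Toy 1-D dataset: label is 1 when x > 0.5, with about 10% of labels flipped."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:  # label noise
            y = 1 - y
        data.append((x, y))
    return data


def fit_majority(train):
    """Baseline: always predict the most common training label."""
    majority = int(2 * sum(y for _, y in train) >= len(train))
    return lambda x: majority


def fit_knn(train, k):
    """k-nearest neighbours on a single feature; k is the hyperparameter."""
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        return int(2 * sum(y for _, y in nearest) > k)
    return predict


def cross_val_accuracy(fit, data, folds=5):
    """Model assessment: mean accuracy over interleaved folds."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        predict = fit(train)
        scores.append(sum(predict(x) == y for x, y in test) / len(test))
    return statistics.mean(scores)


# Hypothetical search space: algorithm name -> (fit function, settings to try).
SEARCH_SPACE = {
    "majority": (fit_majority, [{}]),
    "knn": (fit_knn, [{"k": 1}, {"k": 5}, {"k": 15}]),
}


def auto_select(data):
    """Try every algorithm/hyperparameter pair and return (best, all_results)."""
    results = []
    for name, (fit_fn, grid) in SEARCH_SPACE.items():
        for params in grid:
            fit = lambda train, f=fit_fn, p=params: f(train, **p)
            results.append((cross_val_accuracy(fit, data), name, params))
    best = max(results, key=lambda r: r[0])
    return best, results


if __name__ == "__main__":
    best, results = auto_select(make_data())
    for score, name, params in results:
        print(f"{name:8s} {params!s:10s} accuracy={score:.3f}")
    print("selected:", best[1], best[2])
```

Notice what the loop does not do: it never decides what the label should be, whether accuracy is the right metric, or whether the data answers your business question. Those choices stay with you.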
You Should Consider Transparency
Having access to the code doesn’t guarantee you’ll ask the right questions of your algorithms, but as in other areas, automation can deflect responsibility. The process is certainly automated, but you could be in danger of using models that simply confirm your desired outcome.
Automation requires a continued dedication to asking the right (sometimes difficult) questions of our models. Guideposts should punctuate any automated pipeline to ensure you’re still handling your data responsibly and using it for what it actually shows. Without those safeguards in place, you and your team could end up abdicating responsibility for your data to a process you don’t understand.
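One form a guidepost can take is an automatic sanity check that runs before anyone accepts a model. The sketch below compares a model's held-out accuracy against a "predict the most common label" baseline and flags two situations worth a human look. The function name `guidepost_checks` and both thresholds are assumptions chosen for illustration, not an established standard. Treat it as a starting point for your own questions and not as a complete audit.

```python
from collections import Counter


def guidepost_checks(y_true, y_pred, min_lift=0.05, leak_threshold=0.99):
    """Compare held-out predictions to a majority-class baseline.

    min_lift and leak_threshold are illustrative defaults (assumptions);
    pick values that make sense for your own problem.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Accuracy you would get by always guessing the most common label.
    baseline = Counter(y_true).most_common(1)[0][1] / n

    warnings = []
    if accuracy - baseline < min_lift:
        warnings.append("Model barely beats always guessing the most common label.")
    if accuracy >= leak_threshold:
        warnings.append("Accuracy is suspiciously high; check for leaked target data.")

    return {"accuracy": accuracy, "baseline": baseline, "warnings": warnings}


if __name__ == "__main__":
    # Nine positives, one negative; the model predicts positive every time.
    report = guidepost_checks([1] * 9 + [0], [1] * 10)
    print(report["accuracy"], report["baseline"])
    for message in report["warnings"]:
        print("WARNING:", message)
```

In the usage example the model scores 90% accuracy, which sounds impressive until the check points out that always guessing "positive" scores exactly the same.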
Scale, scale, scale. You don’t want to implement a process without an eye for growth. With AutoML, your process evolves as your company does (in theory). Some tools support exporting a fully trained model to a different platform (expanding into apps, perhaps?). Developers can integrate it without having to take a crash course in data science infrastructure.
Some AutoML tools can export models as Docker containers, which is good news for your DevOps team. They can deploy the models at scale and host the containers in scalable clusters, in Kubernetes for example. AutoML is flexible and portable. You’re using customized data within a preprocessed pipeline.
It’s Hype… For Now
Despite the fundraising slam dunk for DataRobot, we haven’t perfected AutoML just yet. The big names are working on it, Google Cloud AutoML for example, but they’re focused more on improving their deep learning architecture. When that work matures, it could transform how we view the entire process of automation.
For now, we’ll call it “democratization lite.” It’s going to take a few mundane tasks off the hands of your data scientist or help a business analyst move into the role more fully without full-scale training in data science algorithms. It’s not going to automate innovation or test new hypotheses for you. It’s not a magic bullet that will turn around your failing data science department.
The hype around AutoML could be really good news, however, as companies begin to perfect the process or offer it “as-a-service.” For companies struggling to find the talent and resources needed to complete and maintain an ML pipeline, it could be just the tweak to take some pressure off.