7 Top Machine Learning Operations (MLOps) Startups and the Problem They Solve
Posted by Sheamus McGovern, March 10, 2022
Machine Learning Operations (MLOps) is a very hot space within the rapidly growing AI market. The MLOps market alone is expected to grow to almost $4 billion by 2025. Given the already crowded space for AI and MLOps startups, we took a look at some of the top MLOps startups and asked a question: what problem does each startup solve?
#1 Weights & Biases – New Dev Tools for Machine Learning Developers
Stage: Series C Startup | Total Raised 200M
Top Investors: Gaingels, Insight Partners, Coatue, Bond, Hack VC
Github Link: https://github.com/wandb | Website: https://www.wandb.ai
Founded by Lukas Biewald and Chris Van Pelt, Weights & Biases is a developer-first MLOps platform that offers performance visualization tools for machine learning. It helps companies turn deep learning research projects into deployed software by helping teams track their models, visualize model performance, and automate the training and improvement of models.
Problems They Solve
Developers have long had a suite of tools to build and deploy code such as code editors and deployment tools like Jenkins. Weights & Biases is building a similar stack for machine learning practitioners that includes editing and visualization, experiment tracking, and model management. This matters because there is a fundamental difference between software development and machine learning development workflows.
Image Credit: wandb.ai
#2 Iguazio – MLOps Pipeline Automation and Feature Engineering
Stage: Late Stage | Total Raised 72M
Top Investors: Dell Technologies, Pitango VC, Robert Bosch VC, JVP, Samsung Ventures
Github Link: https://github.com/mlrun/mlrun | Website: www.iguazio.com
Founded in 2014, this Tel Aviv-based MLOps startup was one of the first entries in the MLOps space. Their data science platform is focused on the ML workflow and allows teams to develop, deploy, and manage AI applications. It also lets teams run ML models in real time and deploy them anywhere, including multi-cloud, on-premises if needed, and even on edge devices.
Problems They Solve
MLOps involves a complex workflow. The Iguazio Data Science Platform comes with the most common open source machine learning tools pre-installed, including the pipelining essential for MLOps and an integrated feature store. The platform is focused on enabling Python-first data scientists and machine learning engineers, with an emphasis on Kubernetes for cloud and multi-cloud support.
Image credit : www.iguazio.com
#3 Iterative.ai – Git-Based Tools for Data, Models, and Pipelines
Stage: Series A | Total Raised 25M
Top Investors: 468 Capital, True Ventures, Afore Capital
Github Link: https://github.com/iterative | Website: www.Iterative.ai
Iterative.ai is an MLOps platform that develops lifecycle management for datasets and machine learning models. Iterative.ai brings engineering practices to data science and machine learning. It maintains a code repository with data files, machine learning model files, and model metrics, and it keeps track of machine learning experiments so teams can share knowledge about ideas. The company was founded in 2018 and is headquartered in San Francisco, California.
Problems They Solve
Only recently have ML practitioners started to pay attention to versioning. Data and model versioning, machine learning (ML) testing, and ML environment versioning are all key elements of the new MLOps workflow. Iterative’s open source approach provides what is essentially Git for data, models, and pipelines, and layers collaborative tools on top of all these elements. Data version control and continuous machine learning, which mirror practices common in software engineering, are now becoming essential to data science and machine learning.
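To make the "Git for data" idea more concrete, here is a minimal sketch of content-addressed data versioning in Python. It is an illustration only and not how Iterative's tools are implemented; the `snapshot` and `restore` functions and the in-memory `cache` are hypothetical names chosen for this example.

```python
import hashlib


def snapshot(data: bytes, cache: dict) -> str:
    """Store data under its content hash and return the hash as a version id."""
    version = hashlib.sha256(data).hexdigest()
    cache[version] = data  # identical content always maps to the same key
    return version


def restore(version: str, cache: dict) -> bytes:
    """Return the exact bytes recorded for a version id."""
    return cache[version]


# Hypothetical in-memory store; a real tool would use disk or cloud storage.
cache = {}
v1 = snapshot(b"id,label\n1,cat\n", cache)
v2 = snapshot(b"id,label\n1,cat\n2,dog\n", cache)

# Only the short version id needs to be committed alongside the code.
original = restore(v1, cache)
```

The design choice worth noting is that the version id is derived from the content itself, so re-adding unchanged data creates no new version and the small id can live in an ordinary Git commit.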
#4 Arthur AI – Model Monitoring and Counterfactual Explanations
Stage: Series B | Total Raised 60.3M
Top Investors: Acrew Capital, Greycroft, Index Ventures, Work-Bench, Homebrew
Founded in 2018, New York-based Arthur AI is a platform that monitors the performance of machine learning models and provides insights on optimization, explainability, and bias detection and mitigation. It also supports custom metrics, intelligent alerts, data drift detection, and counterfactual explanations.
Problems They Solve
Model performance slippage is an ongoing problem in machine learning production environments. Companies are realizing that, unlike regular software, machine learning systems need robust risk mitigation processes in place. Alert systems that identify bias and model drift are part of that mitigation, along with other features. Also included are counterfactual explanations, an important emerging technique that provides what-if scenario analysis for model interpretability.
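As a rough illustration of the what-if analysis behind counterfactual explanations, the sketch below searches for the smallest change to one input that flips a model's decision. The toy `approve` model, its weights, and the `income_counterfactual` helper are all hypothetical and are not taken from Arthur AI's product.

```python
def approve(income: float, debt: float) -> bool:
    """Toy credit model with made-up weights: approve when the score is non-negative."""
    return 0.5 * income - 1.0 * debt - 10.0 >= 0


def income_counterfactual(income, debt, step=1.0, max_steps=1000):
    """Return the smallest income at or above the current one that gets approved.

    Income is raised in `step` increments; None means no approval was
    found within `max_steps` increments.
    """
    for k in range(max_steps + 1):
        candidate = income + k * step
        if approve(candidate, debt):
            return candidate
    return None


# "What if this rejected applicant earned more?"
needed = income_counterfactual(income=30.0, debt=10.0)
```

Production techniques typically search over many features at once and try to keep the suggested change small and plausible; this single-feature search only shows the basic idea.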
#5 Arrikto – Automated Machine Learning Operations (AutoMLOps)
Stage: Series A | Total Raised 15M
Top Investors: Unusual Ventures, Odyssey Ventures L.P.
Github Link: https://github.com/arrikto | Website: www.arrikto.com/
Arrikto’s MLOps platform allows data science and machine learning operations teams to collaborate to continuously build, train, deploy, and serve machine learning models with DevOps efficiency. Arrikto wraps Kubeflow to allow you to build on a single node locally and then deploy to a multi-node production environment. Their automated ML workflow helps data scientists create Kubeflow pipelines for MLOps by tagging cells in Jupyter Notebooks to define pipeline steps, hyperparameter tuning, GPU usage, and metrics tracking. These tags can then be used to automatically create pipeline components and the KFP DSL code (a decorator for Python functions that returns a pipeline).
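The idea of a decorator that turns a Python function into a pipeline can be sketched in plain Python. The example below is a simplified illustration and deliberately does not use the Kubeflow Pipelines SDK; the `Pipeline` class, the `pipeline` decorator, and the `load` and `train` steps are hypothetical.

```python
class Pipeline:
    """Hypothetical container holding an ordered list of step functions."""

    def __init__(self, name):
        self.name = name
        self.steps = []

    def run(self, value):
        # Feed the output of each step into the next one.
        for step in self.steps:
            value = step(value)
        return value


def pipeline(name):
    """Hypothetical decorator: replaces a step-listing function with a Pipeline."""
    def wrap(define_steps):
        built = Pipeline(name)
        built.steps = list(define_steps())
        return built
    return wrap


def load(n):
    return list(range(n))


def train(rows):
    return sum(rows) / len(rows)


@pipeline("demo")
def demo_pipeline():
    return [load, train]


# demo_pipeline is now a Pipeline object, not a function.
result = demo_pipeline.run(4)
```

In a notebook-driven workflow, the tagged cells would play the role of `load` and `train`, and a tool would generate the decorated function for you.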
Problems They Solve
The open-source projects Kubeflow and MLflow dominate the MLOps landscape, and Arrikto is squarely in the Kubeflow camp as an enterprise solution. However, the real solution is to treat data like code and allow software engineers or machine learning engineers to push changes from their local environments, such as a Jupyter notebook, into production. This is an interesting data-first approach to MLOps. Additionally, the ability to migrate containerized workflows that include apps plus data across machines or clusters eases the burden of orchestrating pure Kubernetes workflows.
Image Credits: Arrikto
#6 Seldon – Open Source Explainability, Interpretability, and Drift Detection
Stage: Series A | Total Raised 13M
Top Investors: Cambridge Innovation Capital, Playfair Capital, Amadeus Capital Partners
Github Link: https://github.com/SeldonIO | Website: www.seldon.io
Seldon is a machine learning deployment platform built around several open-source projects that cover model serving, explanations, and monitoring. Their most popular open-source projects are Alibi, Alibi-Detect, and Seldon Core, an MLOps framework to package, deploy, monitor, and manage thousands of production machine learning models. Alibi is a set of algorithms for explaining machine learning models, and Alibi-Detect is a library for outlier, adversarial, and drift detection.
Problems They Solve
Explainability is a well-understood but somewhat underserved problem, with SHAP being perhaps the best-known approach. Alibi expands on SHAP (also included) with a set of model explanations that include anchor explanations for images, integrated gradients for text, counterfactual examples (including reinforcement learning), and local effects models. Its sister project Alibi-Detect includes 10 outlier detection models in total, including isolation forest, AE, VAE, AEGMM, and Seq2Seq. The drift detection suite includes another dozen models, among them statistical methods like Kolmogorov-Smirnov and Maximum Mean Discrepancy.
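To show what a statistical drift check does, here is a minimal pure-Python sketch of the two-sample Kolmogorov-Smirnov statistic. It does not use Alibi-Detect; the `ks_statistic` function, the sample values, and the 0.5 threshold are assumptions made for illustration, and real detectors usually turn the statistic into a p-value instead of comparing it to a fixed cutoff.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # The largest gap between two step functions occurs at a data point.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


training_feature = [0.1, 0.2, 0.3, 0.4, 0.5]
production_feature = [1.1, 1.2, 1.3, 1.4, 1.5]  # same shape, shifted upward

statistic = ks_statistic(training_feature, production_feature)
drift_suspected = statistic > 0.5  # hypothetical threshold for this sketch
```

A statistic near 0 means the two samples look alike, while a value near 1 means their distributions barely overlap.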
#7 Metaplane – Data Observability
Stage: Seed | Total Raised Unknown
Top Investors: Y Combinator, Flybridge
Data teams can use the Metaplane observability platform to save engineering time and increase confidence in data by understanding when things break, what went wrong, and how to fix it. This is done by automatically monitoring modern data stacks that include warehouses and BI dashboards, identifying normal behavior (e.g. lineage, volumes, distributions, freshness), then alerting the various stakeholders when issues occur.
Problems They Solve
There is a strong movement in AI toward a data-first rather than model-first approach. The reasons are obvious. Data bugs are often more prevalent and more expensive in data-driven platforms than code bugs. However, codebase tests are generally not robust enough to catch every inconsistency in data flows. Metaplane’s suite of built-in tests with configurable tolerance lifts that burden from developers.
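A data test with a configurable tolerance can be sketched in a few lines of Python. This is a generic illustration, not Metaplane's implementation; the `is_anomalous` function, the default tolerance of 3 standard deviations, and the sample row counts are assumptions.

```python
from statistics import mean, pstdev


def is_anomalous(history, latest, tolerance=3.0):
    """Flag `latest` when it lies more than `tolerance` standard deviations from the historical mean."""
    center = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # With a perfectly constant history, any change counts as anomalous.
        return latest != center
    return abs(latest - center) / spread > tolerance


# Hypothetical daily row counts for a warehouse table.
daily_row_counts = [1000, 1020, 980, 1010, 990]

normal_day = is_anomalous(daily_row_counts, 1005)
broken_load = is_anomalous(daily_row_counts, 400)
```

Raising `tolerance` makes the check quieter, while lowering it catches smaller deviations at the cost of more false alarms.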
Learn more about MLOps Startups at ODSC East 2022
All of these MLOps startups will be represented at ODSC East 2022 as part of the MLOps track, and the AI Expo & Demo Hall. The AI Expo & Demo Hall is free to attend! If you’re interested in checking out the MLOps track, then register here for 50% off all ticket types. Here are a few standout sessions in the track:
Computational Survival Analysis: Allen Downey, Professor, Olin College, and author of Think Python, Think Bayes, Think Stats
Machine Learning for A/B Testing: Alex Peysakhovich, Senior Research Scientist, Facebook AI Research
Few-shot learning: Isha Chaturvedi, Principal Data Scientist, Capital One
Orchestrating Reproducible AutoML Experiments using PyCaret, W&B, and Prefect: Anish Shah, ML Engineer, Weights & Biases
Intro to Deep Learning using Keras and TensorFlow: Julia Lintern, Data Science Instructor, Metis
Self-Supervised and Unsupervised Learning for Conversational AI and NLP: Chandra Khatri, Chief Scientist and Head of AI, Got It AI
Community Data: What to Measure and Why?: Cali Dolfi, Data Scientist, Red Hat
Introduction to interpretability in machine learning: Andras Zsom, Assistant Professor, Brown University
Understanding and optimizing parallelism in NumPy-based programs: Ralf Gommers, Co-Director, Quansight Labs
Rapid Data Prototyping: Exploring COVID-19 Time-Series Data: Eric Salituro, Senior Software Engineer, Zendesk
AI for climate action and agriculture using PyTorch: Engine Isabelle Tingzon, Machine Learning Researcher, Thinking Machines Data Science