A Tale of Two Cultures: Integrating Data Science and MLOps to Build Successful ML Products

Editor’s note: Thomas Loeber is a speaker for ODSC East this April 23-25. Be sure to check out his talk, “Integrating Data Science and MLOps: How to structure a collaboration and handoff process,” there!

When excitement about data science became widespread about 10 years ago, it spurred a lot of proof-of-concept ideas. However, most of these stayed confined to Jupyter notebooks and never made it into production. There are multiple reasons why productionizing ML models has been much harder than initially expected, but this blog post focuses on one that has not been explored in as much depth. To create business value, we have to marry two very different approaches: the ML lifecycle starts out on the exploratory data science side, but we eventually have to transition to an engineering-driven approach to achieve the quality attributes typically expected of production systems, such as availability, reliability, scalability, and security. What it takes to do good work in data science is thus fundamentally opposed to what it takes to do good work in MLOps, giving rise to different best practices, skill sets, and even mentalities (ways of thinking about problems) on each side. As a result, a central challenge in creating successful ML products is finding a good process for making these two cultures work well together.

The challenge: Different needs, resulting in different best practices

In both data science and MLOps, we strive for agility: We want a tight feedback loop – which allows us to learn quickly what works (and what we even want and need) – and we want to be able to easily make changes to our product to incorporate these lessons. Crucially, however, the way to achieve agility varies widely between the two domains: 

In data science, agility comes from an exploratory and iterative notebook workflow, which allows for rapid experimentation and hypothesis testing. This approach is highly flexible and fosters creativity, enabling data scientists to quickly adapt to new insights and iterate on their models. During this process, it is okay for data scientists to rely on quick hacks, as the goal at this stage is primarily to get something to work once, rather than to make it work reliably and repeatedly.

In contrast, MLOps requires a more rigorous engineering approach to ensure that models are robust, scalable, and maintainable in a production environment. The way we achieve this in MLOps is very different from the interactive notebook workflow in data science. Instead, we draw on engineering principles such as prioritizing clean code, clean architecture, continuous integration (including automated testing), and observability. These practices are essential to ensure the codebase stays (relatively) easy to understand, modify, and deploy, thereby facilitating rapid iteration and adaptation to changing requirements, even as the size of the application grows.
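
To make the contrast concrete, here is a minimal sketch of what moving a notebook-style hack toward tested, production-style code might look like. The helper `standardize` and both tests are hypothetical illustrations rather than code from any particular project; the point is only that edge cases a notebook can ignore (such as a constant feature) become explicit and are checked automatically.

```python
from statistics import mean, pstdev


def standardize(values: list[float]) -> list[float]:
    """Scale values to zero mean and unit (population) standard deviation.

    Hypothetical helper: in a notebook this might be an inline one-liner
    that silently divides by zero on a constant feature.
    """
    if not values:
        raise ValueError("values must not be empty")
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Constant feature: return zeros instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def test_standardize_handles_constant_feature() -> None:
    # An edge case that a one-off notebook run would likely never hit.
    assert standardize([3.0, 3.0, 3.0]) == [0.0, 0.0, 0.0]


def test_standardize_centers_values() -> None:
    scaled = standardize([1.0, 2.0, 3.0])
    assert abs(mean(scaled)) < 1e-9
```

Tests written in this style can be picked up by a test runner and executed on every commit as part of continuous integration, which is one way the codebase stays safe to modify as it grows.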

 

The way forward: Establishing a collaboration and handoff process

The first step in addressing this challenge is simply to recognize the unique needs and talents of both data scientists and ML engineers. From this vantage point, it is easier to avoid the pitfall of imposing a single set of best practices. We don’t want to boil both sides down to the lowest common denominator or, even worse, have one side force its quality standards on the other (e.g., because the team’s manager or tech lead is partial). Likewise, we need to keep our expectations realistic about how much each side can know about the other’s domain. For example, expecting data scientists to have more than a superficial knowledge of Kubernetes is probably unrealistic for the majority of companies. Such failures to leverage a proper division of labor not only make it hard to hire and retain scarce talent, but also distract each side from its key competencies. Instead, we need to foster separate quality standards that meet the specific needs of each domain. You can find my own attempt at defining the respective best practices for data science and MLOps here and here.

Since the two cultures are here to stay, the next step is to find a good process both for handing off work from data scientists to MLOps engineers and for establishing close collaboration between them throughout the ML lifecycle. The latter point may sound somewhat counterintuitive: given the inherent gap between the two sides, it is tempting to conclude that the best solution is simply to keep them separate. Similar to how software development and operations were commonly handled before the DevOps revolution, we may conclude that we should leave each side to its own devices, so that data scientists just “throw the models they trained over the wall” and MLOps engineers worry about how to deploy them. However, as we learned from the DevOps revolution, we can achieve enormous efficiency gains if we align both sides more closely and make sure they pull in the same direction.

The same is true for ML products: Precisely because of the gap between the two cultures, it is even more important to ensure that what each side does is compatible with the needs of the other. The earlier we discover any such incompatibility, the cheaper it is to fix. For example, if data scientists fail to take important deployment requirements of MLOps into account, they may produce a model that cannot be used in production, requiring expensive rework (in the worst case, training the model again using different frameworks or data sources).
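
One lightweight way to surface such incompatibilities early is to write the deployment requirements down in a machine-checkable form and validate candidate models against them before much effort goes into training. The following is a minimal sketch under stated assumptions: the classes `DeploymentRequirements` and `ModelCandidate`, their fields, and the function `find_incompatibilities` are all hypothetical names chosen for illustration, and a real team would agree on its own set of constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentRequirements:
    """Constraints stated by the MLOps side (all fields are hypothetical)."""
    supported_frameworks: frozenset[str]
    max_model_size_mb: float
    available_features: frozenset[str]  # features that exist at serving time


@dataclass(frozen=True)
class ModelCandidate:
    """Metadata a data scientist records for a model they plan to train."""
    framework: str
    model_size_mb: float
    required_features: frozenset[str]


def find_incompatibilities(
    model: ModelCandidate, reqs: DeploymentRequirements
) -> list[str]:
    """Return human-readable issues; an empty list means no known conflict."""
    issues: list[str] = []
    if model.framework not in reqs.supported_frameworks:
        issues.append(f"framework '{model.framework}' is not supported in production")
    if model.model_size_mb > reqs.max_model_size_mb:
        issues.append(
            f"model size {model.model_size_mb} MB exceeds limit of "
            f"{reqs.max_model_size_mb} MB"
        )
    missing = model.required_features - reqs.available_features
    if missing:
        issues.append(f"features unavailable at serving time: {sorted(missing)}")
    return issues


if __name__ == "__main__":
    reqs = DeploymentRequirements(
        supported_frameworks=frozenset({"onnx", "scikit-learn"}),
        max_model_size_mb=200.0,
        available_features=frozenset({"age", "tenure"}),
    )
    candidate = ModelCandidate(
        framework="custom-research-lib",
        model_size_mb=850.0,
        required_features=frozenset({"age", "income"}),
    )
    for issue in find_incompatibilities(candidate, reqs):
        print(issue)
```

A check like this does not replace conversation between the two sides; it merely records what they agreed on so that a mismatch shows up at the start of an experiment rather than at deployment time.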

The best way to achieve sufficient collaboration is to establish cross-functional teams that own an entire value stream, combining data scientists and MLOps engineers into a single team. Communication is generally much easier within a team than between different teams. In addition, a shared team aligns the incentives of data scientists and MLOps engineers, making it easier for them to pull in the same direction and work towards a shared goal. This advantage applies not only to the dev teams but also to the team’s leadership (in particular, having a single product owner in charge of the end-to-end ML product). [1]

There is no doubt that combining data science and MLOps into a single team is a daunting undertaking for many organizations, as the two sides may currently be part of different departments (e.g., data scientists may be part of various business teams, whereas MLOps may be part of engineering). Reorganizing these historically grown team structures can thus be very contentious, as it entails a shift in the balance of power among departments. Nevertheless, it has become clear that the capability to productionize ML is not only hard to achieve but also vital for businesses to thrive in today’s economic climate. Such a reorganization therefore deserves the executive support necessary to overcome any vested interests. As we have seen with the DevOps transformation, such a large change in team structure may initially seem impossible, but it can be accomplished if we decide it is important enough to make it a priority (though it may still take some companies years to complete).

Conclusion

In this blog post, we have seen that to leverage the full potential of ML, we need to solve the challenge of integrating the opposing cultures of data science and MLOps into a cohesive team. In my upcoming talk at ODSC East, I will go into more detail on how to structure the handoff of the main artifacts (code, models, and data) and at which specific points in the lifecycle collaboration matters most.

About the Author/ODSC East 2024 Speaker:

Thomas is a machine learning engineer at a business and technology consulting firm, Logic20/20, where he helps companies productionize ML models by adopting MLOps practices. He initially came from the statistics and data science side, but has also worked in software and data engineering, searching for lessons from these more mature disciplines for how to create maintainable and scalable software systems. Now, Thomas is passionate about integrating these diverse insights to build robust ML solutions.

Cover image by Geralt on Pixabay.

[1] Endnote: Note, however, that this recommendation to combine data scientists and ML engineers into a single team only applies to applied ML use cases. We may still have dedicated platform teams composed entirely of engineers that build internal MLOps tooling. These teams should still encompass the entire value stream (e.g., both front-end and back-end engineers), but that value stream may not encompass data scientists. To take another example, if a platform team builds tooling for data scientists, the team may consist of ML and software engineers, but the product owner would ideally be a data science manager.
