Agile Data Science Insights from ODSC East 2018

If there’s one thing your data scientists want you to know about agile, it’s that they don’t want to be forced into unnecessary meetings and micromanagement. The top-down, heavy-handed ‘agile’ practices at many organizations have led to widespread skepticism and even outright rebuke from some data scientists. That is ironic, given that agile refers to lightweight methodologies founded by self-described organizational anarchists on values such as ‘individuals and interactions over processes and tools’.

In his ODSC East 2018 talk ‘Key Questions to Ask When Managing Data Science Projects’, Jeffrey Saltz, an Associate Professor at Syracuse University, discussed some of the reasons why data science teams struggle with agile methodologies like Scrum.

The key reason for the struggle is estimation. When a project begins with Exploratory Data Analysis (EDA), you can’t accurately estimate how long that phase will take, or even know what future work it may reveal. For ambiguous data science projects, or for data sets that are not well understood or structured, the fixed-length sprint prescribed by the Scrum methodology, often two weeks, can therefore be problematic. Professor Saltz recommended the Kanban agile method as an approach that tends to work better than Scrum for data science projects. CRISP-DM and the related ASUM-DM methodology are also potentially useful for data science projects. Saltz stated that, generally, the teams he works with use a hybrid approach based on their unique needs.

In his talk ‘Agile Data Science 2.0’, Russell Jurney gave advice based on his experience as principal consultant at Data Syndrome. He recommended that application development be a big part of what data science teams do, so that the work becomes visible to users and stakeholders. This ties into the agile concept of showcasing and demoing work products. Jurney also talked about shipping the “broken stuff”, which can enable the exchange of valuable feedback if management has the right agile mindset and the team feels safe enough. Most data scientists will understandably be uncomfortable with sharing broken or non-working work. Management will therefore have to work hard to show that it can be trusted.

Other speakers at ODSC East 2018 talked about having deployed more agile methods in recent years. As you deploy agile methodologies, work to avoid negativity by association. For example, be careful not to roll out a new agile methodology at the same time as a layoff or resource rebalancing effort.

Some key ideas in agile methodologies, such as Shu-Ha-Ri, lead to inflexibility in practice. Shu-Ha-Ri boils down to the idea that you need to follow a methodology exactly as designed by its originators until you have practiced it long enough to know how to adapt it. The problem with implementing agile this way is that it’s nearly impossible to get everyone to buy in to spending that much time doing things in a way that doesn’t work, just because the creator of an agile methodology designed it that way. Nobody’s got time for that.

The key is to empower your data scientists, inspire them, give them impactful work to do, bring them food and water, and get out of their way. Don’t put too much focus on the buzzwords and prescribed meetings of your agile methodology. If you get back to basics on what the agile mindset was intended to be, you will avoid the pitfalls and leverage the best of what made agile so popular in the first place.

Heather Domin

Heather Domin is Senior Technical Program Manager, Data Science and Cognitive Solutions at IBM. She regularly advises managers and executives on agile, data science, metrics, and management systems. She leads with insights from her technical experience in engineering and data analytics, and has published articles on emergent technology and processes. Certified in agile and traditional project management methodologies (PMP, PMI-ACP, CSM), she currently leads large-scale global programs for IBM’s Enterprise Operations & Services division.

Open Data Science - Your News Source for AI, Machine Learning & more