Are Successful Data Scientists Hired or Trained?

Editor’s note: Jennifer is a speaker for ODSC West 2019 this November in San Francisco! Be sure to check out her talk, “Successful Enterprise Analytics Starts with Literacy” at this upcoming event! This talk asks: are successful data scientists hired or trained?

The data science valley of despair is real. Time after time, leaders who are well versed in case studies and industry research extolling the returns of data-driven insights seek to innovate their business, only to land in a hole of frustration and write-offs. It may be more accurate to call it a crater of despair, given that Gartner predicts 85% of data science projects will fail (2018). What do the 15% of successful data science projects have in common? A lot, including careful consideration of whether the data scientists working on a given project were hired immediately after graduating from an analytics program or were existing employees upskilled in-house.

On the surface, the question may seem inane. Now that leaders can hire a candidate right out of school with a bachelor’s or master’s degree in data science who is well versed in the latest and greatest tools and techniques, why would they bear the cost and time delay to train one instead? Assuming you had the time and ability to replicate a world-class data science program, wouldn’t doing so be at best inefficient and at worst ineffective?

It depends on your domain or, more specifically, on your data’s complexity and lineage.

Formal data science education delivered by universities, MOOCs, and other means can effectively cover only two of the three interdisciplinary skills required to be successful in the role: statistics and computer science. The third, domain knowledge, cannot be taught en masse because it isn’t consistent across industries, or even across companies. No institution can teach the intricacies of your data. There will be a knowledge gap. The question is, how wide? Crater? Valley? Or navigable pass?

Data is a language, and every company, if not every business unit, speaks its own dialect. As with the spoken word, these differences came about organically, and they vary or evolve based on the group’s needs. Remember life before “bling”? The same is true of “channel partner.” These dialects become especially confusing for general terms that don’t conform to a common taxonomic definition. For example, IT’s “customer” is likely an employee, whereas Sales’ “customer” is typically an individual with purchasing power, who may be different from the “end user” whom your company’s external contact center refers to as the “customer.”

Restated, domain knowledge is the learned skill of communicating fluently in a group’s data dialect. Its component parts are general business acumen, vertical knowledge, and an understanding of data lineage. For example, a data scientist in people analytics requires foundational knowledge of the business, of human resources, and of the inner workings of the company’s HR tools and processes that create the data they work with. Those processes and other inputs to the dataset are crucial. A data scientist can’t create meaningful insights before they understand what the data is saying today. Is it telling a story? Is it, or are subsets of it, too polluted to use today? Are some data points proxies for or inputs to others? The more complex your business processes and associated data lineage, the longer your data dialect will take to learn.

For digital-native companies whose data collection is automated and whose dialects are intuitive (e.g., a “click” is a “click”), domain knowledge can be developed much more quickly than at large, longstanding companies that have undergone transformations, acquisitions, or divestitures.

If you hire a data scientist, how long will it take them to learn your data dialect? And can you provide air cover for them to do so before applying pressure to produce “insights”? Would it be faster or more effective to upskill someone (e.g., a business analyst or developer) in the areas of statistics and computer science in which they aren’t already well versed?

The real question is: what makes the most sense for your project(s)? Hiring data scientists? Developing successful data scientists? Or would a team composed of both types help you avoid the data science crater of despair?

Jennifer Redmon

Jennifer Redmon joined Cisco in 2009 and serves as its Chief Data Evangelist. Her organization enables an insight-driven culture through globally-scaled data products, services, and community enablement. In response to the shortage of data and analytical talent in the marketplace, her team has upskilled over 3,000 employees to date in the areas of data science, artificial intelligence, data storytelling, and data engineering. By hosting virtual and physical events including AI/data science competitions and symposiums as well as always-on collaboration platforms, her organization interconnects and fosters a thriving federated community of practitioners who drive innovation across functions and geographies. Jennifer holds an international MBA from Duke University with a concentration in Strategy and Bachelor’s in Economics and Art History from UC Davis.