Data scientists shouldn’t be hailed as unicorns — they’re more like jackalopes, Jesús Rogel-Salazar said in the final moments of ODSC Europe 2018.
Rogel-Salazar presented a basic introduction to data science and its practical business applications during the conference’s final timeslot. He said exaggerated perceptions of data science as “magic” lead to less practical applications of data in business strategy, so he offered better strategies for newcomers to the field.
Defining data science
Data science is not one simple thing: its definition depends on what a user employs it for, Rogel-Salazar said.
For some people, its applications are more technical; for others, more managerial; and there’s an entire spectrum in between and beyond. Rogel-Salazar’s working definition for the aim of data science is “to help customer-focused organizations accelerate the impact of their data on their customers and their business.” Data science should not be the star of the show; rather, it is ancillary to business operations and helps inform business decisions, he said.
He also quoted Hilary Mason, who defined data science as “the combination of analytics and the development of new algorithms… you may have to invent something, but it’s okay if you can answer a question just by counting. The key is making the effort to ask the questions.”
Neither of these definitions, he was sure to note, includes anything about a model or algorithm.
Data scientists as jackalopes
Many outsiders to data science imagine that mystical unicorn programmers turn large sets of numbers into usable information, as if by magic.
But in reality, Rogel-Salazar said, data scientists can answer many business operations questions with simple statistics, probability, and even just counting.
What’s important, he said, is to start with a specific question, and then find the best way to answer it. Instead, businesses often point to a dataset and ask data scientists to tell them something they don’t know. In those cases, Rogel-Salazar said there are too many unknowns to determine the right path to take with the data.
“Tell me what it is you would like to achieve, and then we can assess whether the data is suitable for what you’d like to do, rather than say ‘I’ve got this data, use some fancy model just for the sake of it,’” he said.
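To make the "just counting" point concrete, here is a minimal sketch in Python. The question ("which category generates the most support tickets?"), the `tickets` data, and the `count_by_category` helper are all hypothetical, invented for illustration; none of them come from the talk.

```python
from collections import Counter

# Hypothetical support tickets as (ticket_id, category) pairs.
# The data is invented purely to illustrate the idea.
tickets = [
    (1, "billing"),
    (2, "shipping"),
    (3, "billing"),
    (4, "login"),
    (5, "billing"),
    (6, "shipping"),
]


def count_by_category(rows):
    """Count how many tickets fall into each category."""
    return Counter(category for _, category in rows)


# Specific question: which category generates the most tickets?
counts = count_by_category(tickets)
top_category, top_count = counts.most_common(1)[0]
share = top_count / len(tickets)

print(f"{top_category}: {top_count} of {len(tickets)} tickets ({share:.0%})")
```

No model is involved here: once the question is specific, a tally and a proportion are enough to answer it.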
When people write job descriptions seeking data scientists, Rogel-Salazar said it’s like they’re hunting for unicorns, too: they want someone with the technical skills who is curious, has a lot of questions, and is a great speaker and writer.
Instead, Rogel-Salazar compared data scientists to jackalopes because jackalopes are “mythical, but with a hint of reality.”
“Being a data scientist requires people and technical skills, that is true, but it’s difficult to encompass one that has all of those things,” he said. “Data science is therefore a team sport. Get the cottontail rabbit from one bit, the antelope horns from another one and make do with a team, not just one person.”
Rogel-Salazar’s recommendations for data scientists
Near the end of his talk, Rogel-Salazar laid out advice for his audience:
- Make sure to have lists of requirements for your projects
- Know what’s needed from your model, if you build one
- Dedicate time to communication and mutual understanding with those requesting your model
- Data preparation is almost always necessary, so brace yourself
- Start with a simple model, like a regression, and figure out where to go from there
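As an illustration of the last recommendation, the following is a minimal sketch of a one-variable linear regression fitted by ordinary least squares, using only the Python standard library. The `ad_spend` and `sales` figures and the function names `fit_line` and `r_squared` are assumptions made for this example, not something shown in the talk.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y values are identical; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical monthly figures, invented for illustration.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(ad_spend, sales)
fit_quality = r_squared(ad_spend, sales, slope, intercept)
print(f"sales ~ {slope:.2f} * ad_spend + {intercept:.2f} (R^2 = {fit_quality:.3f})")
```

A baseline like this gives something to compare against: a more complex model is only worth its cost if it clearly improves on the simple fit for the question being asked.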
An organization almost always generates more data than it knows about, or than a data scientist can even imagine. All of these datasets can provide businesses with actionable insights. Rogel-Salazar challenged his audience to extract the full value from all of their data sources, to combine data from multiple sources for greater insight, and to open their companies’ minds to other possibilities in the data science realm.