Although people have been doing statistics with programming for a while, we are still in the early stages of data science and machine learning engineering as formal roles in the industry. So far, we have a well-established list of hard technical skills that one can learn to become proficient in this space: Python, SQL, data analysis and visualization, feature engineering, modeling, productionizing models, and so on. Most are also aware of other skills that don’t easily make it onto job boards: communication, project management, and the like. These invisible skills often get labeled as simply “soft skills.”
The idea is that to be an expert you just need to perform all of those hard and soft skills at a high level for long enough. I’ve spent a lot of time closely observing and learning from high-performing data scientists, and honestly, their excellence goes far beyond these skills. Writing clean, efficient code quickly or being a phenomenal communicator will take you very far in this field, but I’ve noticed a set of patterns that dramatically distinguishes the greatest from the rest.
These patterns are largely invisible, to the extent that sometimes even the person themselves can’t articulate what they do differently from their peers. Over time, I’ve seen those who exhibit most or all of these skills in spades go on to see an astounding amount of success. From designing effective machine learning systems that work at scale to building relationships with business partners so strong that people trust your word, Data Scientists who embody these invisible skills have no competition.
Calling these concepts “skills” is almost a disservice, because they are more like personality frameworks that take far longer to master than traditional skills do. Here’s the list:
- Foxes and Hedgehogs
- Antifragile
- Insatiable Curiosity
- Quick to Relinquish Ego
- Storytelling
- Generously Collaborative
An important disclaimer: I am not claiming that you’re unqualified if you don’t have all of these “invisible skills” in spades. High proficiency in Python, statistics, and communication is still, obviously, incredibly valuable. I’m only claiming that these are the characteristics I’ve seen the strongest practitioners apply religiously, and the excellence gap their results create is hard to miss.
Foxes and Hedgehogs
A “fox” is typically a person who moves at a quick pace across a breadth of concepts and domains at shallow depth. Although they aren’t experts in any one domain, they exhibit a large amount of range in their knowledge. They are comfortable with nuance, as they live in a world of probabilities.
A “hedgehog,” in contrast, is the personification of depth. They know one thing incredibly well and are typically more specific in what they aim to learn. Combining a specialized skillset with a focus on the big picture, they can reduce every problem to one organizing principle.
Each personality type has moments where it really shines and I’ve seen the best Data Scientists effectively dance between the two depending on what the team needs.
At the beginning of projects, when little is known about the data or the strategy, it’s best to be an adept fox. You should be able to traverse a large volume of information quickly and distill it into concrete signal and noise. If you go too deep into a narrow solution, you’re likely to miss crucial information that could completely kill your project later on.
After a reasonable amount of information has been distilled, it’s usually more useful to be a hedgehog. You should have an idea of what the problem is, what strategies could alleviate or solve it, why those solutions would work, and how to build them. This requires a depth most foxes don’t have (and shouldn’t, as it’s not their purpose).
Balancing breadth and depth while dealing with the constraints of reality — costs, time, compute, people, etc. — is only done effectively by experts.
Antifragile

Antifragility is the evolution of resiliency. Resiliency is the ability to bounce back from failures; antifragility is the ability to grow specifically because of them. A resilient system can quickly come back online after an outage, but an antifragile system becomes stronger and better because of the outage.
This concept was coined by one of my favorite authors, Nassim Taleb, in his book “Antifragile: Things That Gain From Disorder”. Once I learned of it, I started to recognize it again and again in expert Data Scientists. When you think about it, it makes a lot of sense: years of applied machine learning projects expose you to numerous failures while still requiring you to make decisions grounded in statistical theory. That judgment keeps sharpening over time until you reach a point where you know not only where the pitfalls are, but also which pitfalls will yield the right information to propel you forward.
This is a hard skill to learn if you aren’t exposed to failures often. You need to develop judgment about what is likely to lead to a failure so you can choose which failures you are willing to take on for the knowledge and growth they return. Combine this with having to lead a project and team through the process together, and it becomes a really difficult yet imperative skill to master.
Insatiable Curiosity

In a world with an abundance of information and energy, intellectual curiosity is what differentiates you from the rest. It’s easy to feel like an expert given all the free information out there, but the outputs reveal the divergence between reading about producing quality work and actually producing it.
One way this plays out in the data science field is people feeling they’re performing at a high level just by blindly following this four-step process:
- import sklearn
- model.fit(X_train, y_train)
- model.predict(X_test)
- tune with GridSearchCV
There isn’t anything inherently wrong with any of these steps, but lacking curiosity about why certain models perform better than others for your use case (in terms of speed, performance, etc.) is really detrimental. Anyone can read one page of the sklearn docs and figure those four steps out, but it takes expertise to know that you can still overfit even when using GridSearchCV, how to alleviate it, why certain models should be chosen or not, whether you even have a modeling problem or a data problem (hint: it’s usually a data problem), and so much more.
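To make the GridSearchCV point concrete, here is a minimal sketch contrasting the search’s own optimistic score with a nested cross-validation estimate. The dataset, model, and parameter grid are illustrative assumptions on my part, not from the discussion above:

```python
# Sketch: GridSearchCV's internal best_score_ can be optimistic because every
# fold was "seen" during hyperparameter selection; nested cross-validation
# evaluates the whole tuning procedure on folds it never tuned on.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

# A synthetic classification dataset stands in for real project data.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

param_grid = {"max_depth": [2, 4, 8], "n_estimators": [25, 50]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)

# Inner loop only: the score used to pick hyperparameters.
search.fit(X, y)
print(f"tuned (potentially optimistic) CV score: {search.best_score_:.3f}")

# Outer loop: the full search is re-run inside each outer fold.
nested_scores = cross_val_score(search, X, y, cv=5)
print(f"nested (honest) CV score: {nested_scores.mean():.3f}")
```

Because the outer folds never influence hyperparameter selection, the nested score is the fairer estimate of how the whole tuning pipeline generalizes; when the two numbers diverge noticeably, the inner search has overfit to its folds.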
Intellectual curiosity is the +1% improvement experts make for themselves which translates to incredibly large advances over time. They take the extra 5 minutes to read an article that contradicts their beliefs, don’t take the easy route in implementing things they don’t understand or can’t explain, and want to be builders before sellers.
This ability is also the source of “lifelong learners,” and it keeps refueling the energy needed to refine your craft and level up your abilities. This field has too many concepts and technologies changing faster than any one person can keep up with, so just to tread water you need a trait that keeps your energy levels high.
Quick to Relinquish Ego
This can be a tricky balance, especially when you know a concept deeply, but it is imperative because success usually doesn’t come down to theory. We’ve all seen the effects of a person holding too tightly to their knowledge and refusing to see a different point of view. More often than not, this leads to ineffective solutions and relationships.
Even with the vast knowledge expert Data Scientists have, I have watched them actively invite being convinced otherwise. They don’t strive to be the smartest in the room; they want knowledge to be shared as equally and openly as possible. This happens through meaningful debate and a pursuit of the truth. In practice, that can mean they are slow to reveal what they know and what their opinion is: experts want to hear others’ thoughts before providing theirs. It is fascinating that I’ve seen inexperienced practitioners tout their beliefs far more loudly and proudly than experienced ones do, even when those beliefs aren’t true.
Experts realize this is a people craft, meaning that theory and methodology are only one small part of a successful project. The relationship with your team and stakeholders is as important as getting your theory right. Bringing your ego to discussions never fosters a strong relationship with those around you, even if you are right and especially if you are wrong.
More often than not, success is also not about theory but about application. And the application of theory can have many right answers, where theory usually has only one. That myriad of right answers means it’s in your favor to vet a variety of opinions and make decisions rooted in evidence and sound judgment, none of which is compatible with leading with ego.
Storytelling

As I have mentioned in some of my previous stories, this is so much bigger than good data visualization. Storytelling is the art of communicating a complex idea clearly and concisely so that people are meaningfully influenced by it. It can incorporate data visualizations, but it is largely about influence, not technical ability.
Just as communication is composed of voice inflection, mannerisms, body language, and so on, yet is much larger than the sum of its parts, so is influencing. To influence, you need to build a trusting relationship first. That is not at all easy to do in a short amount of time, let alone the 30-minute meeting you may have to do it all in.
Furthermore, humor is an incredibly underrated skill here as well. Humor does wonders in bringing people closer together and is probably more needed when everyone’s remote. Being able to make people laugh reminds them that this is ultimately a human craft that starts and ends with people making decisions.
A [very] large mistake I have seen people make here is giving too much information. I rarely see people give too little information, because in that case they usually don’t have more information to give. The larger problem is when people feel they’re skilled at communicating but inundate people with every iota of information relevant to the project. Not only is this ineffective communication, it’s also not at all storytelling. Experts know how to parse signals from noise and will communicate just the signals that guide decision-makers.
I understand this is much harder to regulate, but I have seen a number of data science projects fail to launch or succeed simply because the choice of who led the conversation was poor. Not everyone has this skill; be mindful of the difference between who talks the most and who actually influences.
Generously Collaborative

Interviews should evaluate how Data Scientists contribute to and enrich communities. Being an active part of a community is a strong indicator of excellence. You can’t learn everything yourself, and the process becomes much more effective when you learn with peers inside and outside your organization. I question the knowledge of those in this field who aren’t actively learning from the people around them. Experts know how to leverage their network at a high level.
Global community aside, I have seen experts be incredibly generous with their time and knowledge. Experts usually make time to help immediately and spend the extra few minutes fully understanding what you need help with, whereas more inexperienced individuals shy away from such requests as “not in their job description”.
In one of the best Data Science interviews I’ve had, the Lead Data Scientist asked me an advanced question about vectorization that was just at the edge of my understanding. I answered, but followed with, “I kind of know this but feel like there’s a little gap in my understanding here.” He explained the correct answer like it was the easiest thing in the world and followed with, “It’s a shame if two Data Scientists are in a room together and don’t learn something from each other.”
This kind of “relentlessly helping each other” mentality makes for the strongest and healthiest data cultures. It’s not to say that someone will be able to solve all of your requests, but more that you will always have people around you to help try. When you have a field that changes at a breakneck pace, this can be one of the most comforting environments to be in.
I’m curious what other practitioners in the field feel about these mental frameworks — have you seen these or others that are hard to pinpoint but differentiate the most experienced? I’m sure there are more that I missed and would love to hear what you feel separates okay from excellent here.
About the Author: Ani Madurkar
I’m a Senior Data Scientist by day & an artist by night. I’m a person who deeply loves storytelling, and all my passions circle around this point. From Philosophy to Photography to Data Science, I enjoy crafting interesting and insightful stories.
I work for a boutique consulting firm, Fulcrum Analytics, based in New York City and build enterprise Machine Learning systems for top companies and startups. In my spare time, I read lots of books on philosophy, psychology, business strategy, statistics, and more. I love refining my craft to get as close to mastery as I can so I will often have a passion project or two I’m working on, which I try to document and share on this platform and my Twitter.
Outside of projects, reading, and writing, I’m traveling and taking pictures of landscapes. Feel free to check out my website for some of my work (more to be uploaded soon) or my Instagram if you’re curious: @animadurkar.