Editor's Note: John is a speaker for ODSC East 2022! Be sure to check out his talk, "You Too Can Be a Cybersecurity Data Scientist!" there!

I accidentally became a cybersecurity data scientist. I was in a PhD program, nearing graduation, when I realized that my niche interests in military topics were a poor match for the job market. So I adapted my skills (statistics, policy analysis, and programming) for the field of cybersecurity and became a cybersecurity data scientist, first working at an R&D lab and now at a startup.

You can intentionally become a cybersecurity data scientist. It doesn’t have to be an accident. In fact, one analysis based on 2019 U.S. job market data found that there were over thirty thousand job openings seeking candidates with both data science and cybersecurity skills. So if you’re a data scientist with even a little interest in cybersecurity and a willingness to learn, there’s probably a role out there for you.

Of course, I expect data scientists to have a number of worries about pursuing this idea. I’ve addressed four below, but a talk I’m giving at the 2022 Open Data Science Conference East will address even more and cover related topics.


Concern #1: I don’t know anything about cybersecurity.

Neither did I. The available job market evidence suggests that employers seeking cybersecurity data scientists tend to look for data scientists with some familiarity with cybersecurity rather than cybersecurity specialists with data science skills. So stop worrying and start learning about cybersecurity. For a high-level overview, I recommend P.W. Singer and Allan Friedman’s Cybersecurity and Cyberwar: What Everyone Needs to Know. If you want hands-on learning to complement your reading, check out Clarence Chio and David Freeman’s Machine Learning and Security. The curious can also read my review of several recent cybersecurity and machine learning books.

Concern #2: I’m not interested in computer viruses or malware.

That’s okay. Cybersecurity is an extremely broad field and defies a single definition. The field is not simply about the latest compromise or obscure, intricate flaws in computers, though it’s fine if you are interested in those! Cybersecurity deals with traditional technical areas such as computer security, network security, and email security, but also with topics such as disinformation and software supply chain security. If you’re interested in how computers can be misused and abused, then there’s probably a topic within cybersecurity data science for you. Those who are interested in what topics make up cybersecurity data science can read more here.

Concern #3: I don’t know how my data science skills are useful for cybersecurity.

Many cybersecurity topics boil down to using computers to gather, analyze, and make sense of data about other computers, and the users behind them, doing bad things. Data scientists are good at exactly that. They are well suited to framing a security problem as a crisp statement, a useful and surprisingly elusive skill. Determining what data is useful to collect is another core competency of a data scientist. Finally, a cybersecurity data scientist can use statistics and machine learning to build models that detect malicious behavior by users and computers.
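To make that last point concrete, here is a minimal sketch of what a simple statistical detector might look like. Everything in it is an illustrative assumption rather than anything from a real system: the function name `flag_outliers`, the made-up failed-login counts, and the choice of a modified z-score (median and median absolute deviation) with a 3.5 cutoff. Real detection work usually involves far messier data and more careful modeling.

```python
from statistics import median


def flag_outliers(counts, threshold=3.5):
    """Return the keys whose count is unusually high for the group.

    Uses a modified z-score built from the median and the median
    absolute deviation (MAD), which a single extreme value distorts
    less than a mean and standard deviation would.
    """
    values = list(counts.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median; this simple
        # sketch declines to score anything in that case.
        return []
    return sorted(
        key for key, v in counts.items()
        if 0.6745 * (v - med) / mad > threshold
    )


# Hypothetical failed-login counts per user over one day.
failed_logins = {
    "alice": 2, "bob": 3, "carol": 1,
    "dave": 4, "eve": 97, "frank": 2,
}

print(flag_outliers(failed_logins))
```

The statistics here are deliberately basic. Much of the real work lies in the skills described above: deciding that failed logins per user is the right thing to measure, and that "unusually high compared with peers" is a sensible definition of suspicious.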

Concern #4: Where are the jobs for cybersecurity data scientists?

Cybersecurity data science roles are often not actually titled “cybersecurity data scientist,” which means it might take some extra digging. These roles are often found within security teams rather than on a dedicated data science team, so it could be useful to identify companies you’re interested in and then search for jobs that involve both the keywords “cybersecurity” (or “security”) and “data science” (or “data”). Companies with valuable digital assets, such as Amazon or JPMorgan Chase, are top employers of cybersecurity data scientists. So are companies, such as Sophos or CrowdStrike, that specialize in defending other companies from attacks.

John Speed Meyers is a security data scientist at Chainguard. His interests include software supply chain security, open-source software security, and applications of data science to cybersecurity. He has a PhD in policy analysis from the Pardee RAND Graduate School.

