The backgrounds of those seeking or continuing a career in data science are varied. The one constant: you need a solid data science resume to secure your first or next career move. Use this checklist to make sure your resume is as strong as it can be.
Know the Code
Expecting a data scientist who codes only part-time to be on par with a full-time professional developer is a stretch. Thankfully, most employers temper their expectations. They do expect their data scientist to be a solid coder, however. Basic coding know-how is not difficult to acquire. If you have a deeper understanding of the main libraries and frameworks, list projects that demonstrate your coding expertise in your portfolio. For added benefit, make sure you’ve at least dabbled in major frameworks that you may not be expert in, especially some of the latest ones, like Keras. Better to say you’ve spent a day “investigating” a library or tool than to say “not yet”.
To reiterate, there are many routes to data science. Unless you graduated with a PhD in data science or a bachelor’s or master’s degree in a STEM field, chances are your math is not up to scratch. Many data science math courses combine calculus, discrete math, linear algebra, and stats into one class. A decent career data scientist should have one or more courses in each of these subjects under their belt. A great data scientist takes courses in advanced probability, Taylor series, and optimization theory, to name a few. Yes, math is hard, but immersion in the subject will pay off, and the Internet is awash with great free resources. Don’t leave your career to chance.
Data quality is paramount to good models. The consensus is that up to 90% of a data scientist’s job is cleaning data. According to Mike Stonebraker in his ODSC East 2019 keynote, 90% of the other 10% is spent fixing the data you thought you cleaned. OK, that’s a bit much, but you get the picture. Data wrangling is important, and it’s not always easy to reflect your expertise in your resume. But if you do, it will make you stand out. Bullet some challenges you’ve faced and how you tackled them. For more suggestions on experiences employers might be looking for, check out the Data Science Career Guide from Elite Data Science.
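A wrangling bullet on a resume is easier to write if you can point at the kind of cleanup you actually did. The following is a minimal sketch in Python using only the standard library; the field names (`customer`, `amount`) and the sample rows are hypothetical, invented for illustration. It shows three common chores: normalizing text, dropping rows with unusable values, and removing duplicates.

```python
def clean_records(rows):
    """Normalize text, drop rows with unusable amounts, and remove duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        # Normalize whitespace and casing so "  alice " and "Alice" match.
        name = (row.get("customer") or "").strip().title()
        raw_amount = (row.get("amount") or "").strip().replace(",", "")
        if not name:
            continue  # no customer: nothing to attribute the amount to
        try:
            amount = float(raw_amount)
        except ValueError:
            continue  # missing or malformed amount
        key = (name, amount)
        if key in seen:
            continue  # exact duplicate once normalized
        seen.add(key)
        cleaned.append({"customer": name, "amount": amount})
    return cleaned


# Hypothetical messy input, made up for this example.
raw = [
    {"customer": "  alice ", "amount": "1,200.50"},
    {"customer": "Alice", "amount": "1200.50"},  # duplicate once normalized
    {"customer": "bob", "amount": "n/a"},        # malformed amount
    {"customer": "", "amount": "10"},            # missing customer
    {"customer": "carol", "amount": "75"},
]
cleaned = clean_records(raw)
```

Of the five input rows, only the Alice and Carol records survive. On a resume, the equivalent bullet would name the problem (duplicates, malformed values) and the rule you chose to resolve it.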
Much is made of coding and modeling skills. However, data manipulation is a skill in its own right. Firing up code is time-consuming. Before you can even ingest your data for cleaning, you need to be able to source and query it. Thus SQL and SQL-like languages are key productivity tools. Going straight to the source has its benefits: you don’t have to wait on a DB admin or data lake admin, and nobody blocks your progress. The ability to source, access, query, and transform data is a key skill for most data science jobs, and applicants will be expected to demonstrate proficiency in it.
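To illustrate what querying and transforming at the source can look like, here is a small sketch that uses Python’s built-in `sqlite3` module with an in-memory database. The `orders` table, its columns, and the rows are assumptions made up for this example; a real job would more likely involve a warehouse or data lake engine, but the SQL idea is the same: aggregate and filter in the database rather than in application code.

```python
import sqlite3

# In-memory database so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("alice", "east", 120.0),
        ("bob", "east", 80.0),
        ("carol", "west", 200.0),
        ("dave", "west", 50.0),
        ("erin", "north", 30.0),
    ],
)

# Aggregate per region in SQL and keep only regions at or above a threshold.
query = """
    SELECT region, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    GROUP BY region
    HAVING SUM(amount) >= ?
    ORDER BY total DESC
"""
region_totals = conn.execute(query, (100.0,)).fetchall()
conn.close()
```

Here `region_totals` holds one tuple per qualifying region, largest total first. The threshold is passed as a query parameter rather than formatted into the SQL string, which is the safer habit to show in a portfolio.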
Those who’ve built complex systems for finance, manufacturing, aviation, etc. are aware of the high cost incurred when software fails. Data science has the added issue of reputation cost. An inadvertent privacy breach, bias, or other ethical issue can be costly. Unless your new boss is oblivious to this concern (red flag), hiring managers will expect you to be well-versed in these issues. There are plenty of resources out there, including this tutorial on fairness and this paper on Transparency and Explanation in Deep Reinforcement Learning Neural Networks. Understanding how to scrub and protect data to avoid bias or breaches is a good skill to have in your pocket.
Too Little, Too Late
Answering “What’s the biggest non-media dataset you’ve ever worked on?” with a shrug and “a few megabytes” sounds lame, and an interview is the wrong time to realize it. Despite the big data hoopla, most datasets we get to practice on don’t qualify as big data. Big data is a moving target, so let’s define it as any dataset that requires specialized tools to view, manipulate, model, and display. Working with large datasets is much more challenging: observing, cleaning, transforming, coding, modeling, validating, and displaying them requires vastly different techniques. Get some Spark and NoSQL experience, and get to work on large datasets that can be found online, like the Free Music Archive 1TB dataset and others found on GitHub.
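One of the techniques that changes with scale is that you can no longer load everything into memory and compute on it in one go. As a rough sketch of the idea, and not a substitute for Spark, the snippet below computes a column mean by streaming a CSV in small batches with the Python standard library. The column names (`track_id`, `duration`) and the tiny in-memory sample are hypothetical; with a real file you would pass an open file handle and a much larger `chunk_size`.

```python
import csv
import io
from itertools import islice


def iter_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows from a csv reader."""
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def streaming_mean(file_obj, column, chunk_size=2):
    """Compute the mean of one numeric column without loading the whole file."""
    reader = csv.DictReader(file_obj)
    count, total = 0, 0.0
    for chunk in iter_chunks(reader, chunk_size):
        # Only one chunk of rows is held in memory at a time.
        for row in chunk:
            total += float(row[column])
            count += 1
    return total / count if count else None


# Hypothetical sample standing in for a file too large to load at once.
sample = io.StringIO("track_id,duration\n1,200\n2,180\n3,240\n4,220\n5,160\n")
mean_duration = streaming_mean(sample, "duration")
```

The same pattern, keeping running totals and discarding each batch once it has been processed, is what distributed tools automate across many machines.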
What’s Your Problem?
Employers want data science to solve problems, business problems specifically. Review your resume’s experience section and list your experience by company project. Each project should include the languages, frameworks, algorithms, and models employed. Be sure to detail the problem the project sought to resolve and demonstrate how your data science expertise solved that problem. Even for greenfield projects such as “Built a recommendation engine from scratch,” include the business outcome, such as “resulted in a 23% increase in sales.” This will impress upon future employers that you grasp the business side of your work.
It’s All Relevant
Most data science university programs haven’t been around that long, so there aren’t many of us with a master’s or PhD in data science. Most of us have degrees in other disciplines. When applying for a job, you always call out which skills are relevant, so why not do the same for your education? Even if your degree isn’t technical in nature, aspects of your education are relevant to the job.
Find this relevant? A Career Lab, such as the one coming up at ODSC West, is a great way to take stock of your career, learn possible pathways, and update your resume. Our Career Lab talks and workshops will give you all sorts of insight on continuing your career in data science. While there, you will have the opportunity to meet dozens of companies currently hiring, get feedback, and better understand your learning options.