With any new skill, hobby, or career path, you likely have more questions than answers. How do I get started? What skills do I need to focus on first? What sources do I trust to learn all of this? Data science and machine learning are no different. While each field under the umbrella of data science has its own unique set of skills, there are a few basics that are universal. Here are the five skills you need to get started with data science and machine learning.
1. Linear Algebra
Time to bust out the high school and college textbooks again, because you’ll be needing algebra if you want to excel in data science. Linear algebra involves a lot of vectors and matrices, which are useful in representing large amounts of data – something you’ll see often in your life as a data scientist. Linear algebra is a core skill for deep learning, if you choose to go down that path.
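To make the "vectors and matrices" idea concrete, here is a minimal sketch in plain Python. The numbers and the `weights` vector are made up for illustration, and `matvec` is a hypothetical helper, not part of any library. In practice you would probably reach for a numerical library instead.

```python
# Each row is one observation, each column is one feature (made-up numbers).
data = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]

# A hypothetical weight vector, one weight per feature.
weights = [0.5, -1.0]


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# One score per observation: the dot product of its row with the weights.
scores = matvec(data, weights)
print(scores)
```

A single matrix-vector product scores every row of the dataset at once, which is why this notation shows up so often in machine learning.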
2. Statistics & Probability
Statistics involves the collection, analysis, interpretation, presentation, and organization of data. Sound familiar? Statistics and data science share a lot of ground, including probability, Bayesian thinking, experimental design, regression, and so on.
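As a small taste of Bayesian thinking, the sketch below applies Bayes' rule to a yes/no test. The function name `posterior` and the rates in the example are assumptions chosen purely for illustration.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Probability a hypothesis is true given a positive result (Bayes' rule)."""
    # Total probability of seeing a positive result at all.
    evidence = (true_positive_rate * prior
                + false_positive_rate * (1 - prior))
    return true_positive_rate * prior / evidence


# Made-up numbers: 1% base rate, 90% true positive rate, 5% false positive rate.
p = posterior(prior=0.01, true_positive_rate=0.90, false_positive_rate=0.05)
print(round(p, 3))
```

With these illustrative inputs, a positive result still leaves the hypothesis unlikely, because the base rate is so low. That kind of counterintuitive result is exactly what the statistics fundamentals help you reason about.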
3. Calculus
Uh oh, more math. While you may not need to go back and relearn everything about calculus from when you were 16, you do need to understand at least the core concepts. This includes limits & derivatives, gradient descent, how they apply to models like linear regression, and so on.
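Here is a rough sketch of how derivatives and gradient descent fit together: fitting a straight line by repeatedly stepping against the gradient of the mean squared error. The function `fit_line`, the learning rate, the step count, and the toy data are all assumptions for illustration, not a recommended setup.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every data point at the current w and b.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step downhill, against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should land near w=2, b=1.
w, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(round(w, 3), round(b, 3))
```

The two `grad_` lines are the calculus: each is a derivative of the loss, and the update simply moves the parameters in the direction that lowers it.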
4. Computer Science
Computer science has been around for quite some time, and many of its theories and practices have made their way over to data science. Many computer scientists make career transitions into data science, so there are plenty of parallels between the two. Core knowledge includes data structures such as lists & dictionaries and trees & graphs, along with the algorithms that operate on them.
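For a feel of how these structures work together, the sketch below stores a small graph in a dictionary of lists and walks it breadth-first. The graph and the function name `bfs_order` are made up for illustration.

```python
from collections import deque


def bfs_order(graph, start):
    """Return the nodes reachable from start in breadth-first order."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            # Only enqueue nodes we have not seen yet.
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order


# A tiny made-up graph: each key maps to the nodes it points to.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs_order(graph, "A"))
```

The dictionary gives fast lookup of each node's neighbors, the set prevents visiting a node twice, and the queue keeps the traversal in level order.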
5. A Coding Language
This is where it gets a bit fuzzy, since there is ongoing debate about which coding language is best for data science. The two most common are Python and R, each with its own strengths and weaknesses. Python is versatile and often used in computer science as well, while R is popular for data analysis. There are many libraries, frameworks, and platforms built for either R or Python, so knowing just one language won’t limit you.
Bonus Skills on How to Start Machine Learning: Communication and Business Knowledge
It’s not all numbers, charts, and graphs. The best data scientists pair their coding toolkit with soft, non-technical skills. You’ll likely be working with a variety of people, so it’s important to know how to communicate across departments – including verbal communication and data presentation – and to know enough business basics to understand what a customer or client may want.
Learn all of these core skills with the Ai+ Training Machine Learning Bootcamp
There’s a lot to learn here on how to start machine learning, and it may be difficult to know where to start. With the Machine Learning Fundamentals Bootcamp as part of Ai+ Training, you can learn all of these skills on-demand, at your own pace. Components include:
Linear Algebra for Machine Learning: This topic, Intro to Linear Algebra, is the first in the Machine Learning Foundations series. It is essential because linear algebra lies at the heart of most machine learning approaches and is especially predominant in deep learning, the branch of ML at the forefront of today’s artificial intelligence advances.
Calculus for Machine Learning: This topic, Calculus I: Limits & Derivatives, introduces the mathematical field of calculus — the study of rates of change — from the ground up. It is essential because computing derivatives via differentiation is the basis of optimizing most machine learning algorithms, including those used in deep learning.
Probability and Statistics: Probability & Information Theory introduces the mathematical fields that enable us to quantify uncertainty as well as to make predictions despite uncertainty. These fields are essential because ML algorithms are both trained by imperfect data and deployed into noisy, real-world scenarios.
Computer Science: This session, Algorithms & Data Structures, introduces the most important computer science topics for machine learning, enabling you to design and deploy computationally efficient data models.
About the instructor: Dr. Jon Krohn is the Chief Data Scientist at the machine learning company untapt. He authored the 2019 book Deep Learning Illustrated, an instant #1 bestseller that was translated into six languages. Jon is renowned for his compelling lectures, which he offers in-person at Columbia University, New York University, and the NYC Data Science Academy. Jon holds a Ph.D. in Neuroscience from Oxford and has been publishing on machine learning in leading academic journals since 2010; his papers have been cited over a thousand times.