Cultivating a data-driven mindset is the first key step to becoming better at data science (i.e. being more attuned to how data can be used to solve existing problems).
We can start with the following questions:
- What business or social challenges have you faced or are you facing? Or what KPI is your business trying to achieve?
- What data do we have? Where does the data come from? (We want to know whether the data will be consistently available, and whether we can rely on this source for up-to-date information or analysis.)
- What can we do with the data we have? What data do we still need? What kinds of analysis could we do with that additional data, and what are we limited to without it?
Once we are clear about the analysis we need to do and have the relevant data on hand, we can start improving our technical skills. For example, if I had to use Python to do a cross-tabulation, or prepare a statistical analysis of survey data, and I ran into questions about the code or the appropriate statistical tests, I would learn how to perform these tasks (by Googling :)) and be able to do them thereafter. Cliché as it may sound, practice makes perfect. The technical work involved differs from one business case to another. Hence, the more exposure we get to different kinds of data problems, the better we become.
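To make the cross-tabulation example concrete, here is a minimal sketch using only the Python standard library. The survey responses, the column meanings, and the helper names `crosstab` and `chi_square_statistic` are all made up for illustration; in practice, many people would reach for `pandas.crosstab` and `scipy.stats.chi2_contingency` instead.

```python
from collections import Counter

# Hypothetical survey responses: (age_group, preferred_channel).
# These values are invented purely for illustration.
responses = [
    ("18-34", "online"),
    ("18-34", "online"),
    ("18-34", "in-store"),
    ("35-54", "online"),
    ("35-54", "in-store"),
    ("35-54", "in-store"),
    ("55+", "in-store"),
    ("55+", "in-store"),
]


def crosstab(pairs):
    """Count how often each (row, column) combination occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return rows, cols, table


def chi_square_statistic(rows, cols, table):
    """Pearson chi-square statistic for independence of rows and columns."""
    total = sum(table[r][c] for r in rows for c in cols)
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    stat = 0.0
    for r in rows:
        for c in cols:
            # Expected count if the two variables were independent.
            expected = row_totals[r] * col_totals[c] / total
            stat += (table[r][c] - expected) ** 2 / expected
    return stat


rows, cols, table = crosstab(responses)
stat = chi_square_statistic(rows, cols, table)

# Print the cross-tabulation as a simple text table.
print("age_group".ljust(10) + "".join(c.rjust(10) for c in cols))
for r in rows:
    print(r.ljust(10) + "".join(str(table[r][c]).rjust(10) for c in cols))
print(f"chi-square statistic: {stat:.4f}")
```

With a sample this small, the expected counts are well below the usual rule of thumb of five per cell, so the statistic here only illustrates the mechanics. Choosing an appropriate test for real survey data is exactly the kind of question worth researching.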
Enjoying continuous learning and steadily improving your data science skills is important for a data scientist (or whatever your data science title or role may be). While this means that the work is challenging, it is gratifying to know that help is always around the corner: the data community on the internet.
For me, the community (to be more specific, there are actually many communities, e.g. R, Python, Tableau, Data Science for Social Good, but yes, the data science community as a whole) has been essential in helping me grow. With so many people sharing their work and use cases, I can often find solutions readily available on Stack Overflow, data science blogs, cheatsheets, forums, or YouTube videos. Quoting Isaac Newton, “If I have seen further than others, it is by standing upon the shoulders of giants.” Whenever we are unsure what to do, we can find references and perform a literature review to determine which approaches are appropriate.
There have been several occasions when I had to search through multiple websites or multiple threads on Stack Overflow to solve a problem or error that I ran into. It is not uncommon to spend a lot of time debugging. Through such practical experience, we learn what works and what doesn’t. I started a data science blog for exactly this reason: to document my learning journey, taking notes on the steps I took and the errors I ran into so that I can refer back to them should I encounter them again.
In short, to improve our data science skills, we can learn from others by attending conferences and meetups and reading data science blogs and whitepapers, and then practice on a real-world problem or dataset! So, come attend my hands-on workshop on “Journey from Data analyst to (Citizen) Data Scientist” on 16 Sep 2021, 2:55 PM IST/5:25 PM SGT as part of the ODSC APAC Conference!
About the Author/ODSC APAC 2021 Speaker on improving data science skills:
Hui Xiang Chua is a Senior Data Scientist at Dataiku, helping enterprises with data democratization and enabling them to build their own path to AI. Dataiku is a 2x Gartner Magic Quadrant Leader for Data Science and Machine-Learning Platforms (as of 2021). She has both public- and private-sector experience solving problems using data, namely over six years in the public service and two years in the media industry. She was also previously an instructor with General Assembly.