9 Ways to Level Up Your Data Science Practice
Posted by Domino Data Lab, April 24, 2017
We love reading articles with tips and best practices, and we agree with a lot of the advice we see out there (#5 on this list is great!). So, we asked the Domino team for advice to pass on to researchers and scientists searching for ways to get to that next level, and here’s what we heard:
1. Learn ways to parallelize your code.
You already know about map-reduce frameworks like Hadoop and Spark, but these might be overkill for the size of the datasets you’re working with. Consider leveraging a machine with multiple cores, or a GPU, to dramatically speed up calculations on “medium-sized” data. For parallelization on CPUs, you can use joblib in Python; in R, the ‘parallel’ and ‘foreach’ packages are great (there are many tutorials on the topic).
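To make the joblib suggestion concrete, here is a minimal sketch of spreading an independent per-item computation across CPU cores. The function names (`slow_square_root`, `parallel_roots`) and the choice of two workers are ours, purely for illustration; swap in whatever expensive step your own pipeline has.

```python
from math import sqrt

from joblib import Parallel, delayed


def slow_square_root(x):
    # Hypothetical stand-in for an expensive, independent per-item computation.
    return sqrt(x)


def parallel_roots(values, n_jobs=2):
    # Parallel dispatches each delayed call to a worker process and
    # returns the results as a list in the same order as the inputs.
    return Parallel(n_jobs=n_jobs)(delayed(slow_square_root)(v) for v in values)


if __name__ == "__main__":
    print(parallel_roots([0, 1, 4, 9, 16]))
```

Passing `n_jobs=-1` asks joblib to use all available cores. For work as tiny as this example, the overhead of starting workers outweighs any gain; the pattern pays off when each call takes a meaningful amount of time.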
Certain kinds of data science tasks can also benefit from GPUs. A lot of deep learning packages already use GPUs to massively accelerate the training process, but there’s a lot more potential there. Explore libraries like BIDMach, gpustats, PyCUDA, and gputools to accelerate the rest of your code.
2. Explore Flask alternatives for creating cool dashboards in Python.
Shiny is THE way for R users to create nifty dashboards and interactive visualizations. The de facto tool for that in Python is Flask, but Flask is a general-purpose web framework meant for much heavier lifting. The Python community has been at work on lighter options. Check out Bokeh, Plotly, Jupyter Dashboards, Pyxley, or Spyre to accelerate the delivery of your insights to your business users.
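As a taste of how little code one of these tools needs, here is a rough sketch that uses Bokeh to turn two lists into a standalone interactive HTML page, with no web server involved. The function name `build_dashboard_html`, the sample numbers, and the output filename are made up for illustration.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN


def build_dashboard_html(months, signups, title="Monthly signups"):
    # Build an interactive line chart with point markers.
    plot = figure(title=title, x_axis_label="Month", y_axis_label="Signups")
    plot.line(months, signups, line_width=2)
    plot.scatter(months, signups, size=8)
    # Render a self-contained HTML document that loads BokehJS from the CDN.
    return file_html(plot, CDN, title)


if __name__ == "__main__":
    # Illustrative data only.
    html = build_dashboard_html([1, 2, 3, 4], [120, 150, 170, 210])
    with open("dashboard.html", "w", encoding="utf-8") as handle:
        handle.write(html)
```

Opening the resulting file in a browser gives you pan, zoom, and save tools for free, and you can email it or drop it on a shared drive for business users.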
Here’s a list to help you get started.
4. Get plugged into a data science community.
Hackbright Academy and rOpenSci.
5. Compete in a Kaggle competition!
Kaggle is a stellar place to try out new tools, techniques, and technologies (like Domino!). Find a Kaggle competition you like, preferably in a field different from your own, and see what you can accomplish. Hang out on the forums, join a team, and most importantly, have fun. Find out more on their website.
6. Clean up your GitHub and social profiles.
More often than not, people will Google you as soon as they meet you, probably while you are still standing there! Take a few minutes to clean up your GitHub, add a cool header to your Twitter profile, and make that first virtual impression strong. You never know what kinds of opportunities might arise when the right person finds out that you’re the unicorn of their dreams.
7. Submit a talk at a conference or event.
Partially Derivative, Data Skeptic, and Becoming a Data Scientist. There’s some amazing content being created by data scientists for data scientists and available through your favorite podcast app.