We wrote a lot in 2021. Over 400 articles, actually. These top data science blogs and tutorials, written by the ODSC staff, ODSC event speakers, freelancers, our partners, and contributors from around the world, all help make the OpenDataScience community what it is. We want to showcase the data science industry as a whole, which means covering all topics under the umbrella of artificial intelligence, research from all types of universities, contributors of all backgrounds, and everything open source.

Ten articles got the most attention from our community, and they share no common theme. They answer different questions, showcase different tools, and represent different industries. Here they are!

1. How to Pivot and Plot Data With Pandas | Stefanie Molin

This article shows how to create a pivot table of aggregated data and use it to make a stacked bar visualization of the 2019 airline market share for the top 10 destination cities.

2. Building a Robust Data Pipeline with the “dAG Stack”: dbt, Airflow, and Great Expectations | Sam Bail

This blog post explains why data validation is crucial for data teams, provides a brief introduction to Great Expectations, and more.

3. Best Deep Learning Research of 2021 So Far | Daniel Gutierrez

2021 has been a great year for deep learning research, including topics like deep reinforcement learning, training deep neural networks, and others.

4. How Big Data Analytics are Used in the Banking Industry | Shannon Flynn

Big data analytics allows banks to examine large sets of data to find patterns in customer behavior and preferences. What can AI do?

5. Show Me the Data: 8 Awesome Time Series Sources | Sheamus McGovern

Working with time series datasets is a fantastic way to start exploring new data without collecting your own. Here are 8 sources to get started.


6. The Warmup Guide to OpenAI Gym | ODSC Team

Get started with OpenAI Gym using this free downloadable guide from ODSC. It includes key terminology, code, installation tips, and more.

7. How to Load Big Data from Snowflake Into Python | Saturn Cloud

The snowflake-connector-python package makes it fast and easy to write a Snowflake query and pull it into a pandas DataFrame.

8. Optimizing PyTorch Performance: Batch Size with PyTorch Profiler | Sabrina Smai

This tutorial demonstrates a few features of PyTorch Profiler that were released in v1.9.

9. The Rapid Evolution of the Canonical Stack for Machine Learning | Daniel Jeffries

The canonical stack for machine learning has evolved greatly in just six months. See its progress here and how you can implement it yourself.

10. Top 10 Skills for Data Engineers in 2021 | ODSC Team

From Python and SQL to Spark and Hadoop, these are the skills you need to become a data engineer in 2021.

How to Make Next Year’s Top Data Science Blogs List

That’s a pretty good list, and we’re sure next year’s top data science blogs will be just as impressive. If you want to get your work out there and have your tutorials, projects, thought leadership, and other technical know-how seen by thousands of data science enthusiasts, consider writing for OpenDataScience.com! You can learn more about the process here.

