

How to Make an Animated Gif Fit for /r/dataisbeautiful
Posted by ODSC Community, October 16, 2020. Tags: Data Visualization, Modeling, Gif, matplotlib, Pandas

A good visualization should capture the interest of the audience and make an impression. Few things capture interest more than bright colors and movement. In this post, I’m going to show you exactly how to make an animated gif, so that you can go farm some internet points on /r/dataisbeautiful, maybe.
Here’s what we’re going to make:
Step 0 – Data for making an animated gif
Before you make a graph, you’ve gotta get your hands on some data. I grabbed some business data from StatsCanada available here. The data isn’t in the best shape, so here’s a pinch of pandas to make it suck less:
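import pandas as pd
# Skip the 7 lines above the header row, then keep the rows at positions 1 through 4
df = pd.read_csv("3310027001-noSymbol.csv", skiprows=7).iloc[1:5]
df = df.rename(columns={"Business dynamics measure": "status"})
# Drop the last 13 characters of each label, leaving just the status name
df['status'] = df['status'].apply(lambda x: x[:-13])
# Reshape from one column per month to one row per status and month
df = pd.melt(df, id_vars="status", var_name="date", value_name="count")
df["date"] = pd.to_datetime(df["date"], format="%B %Y")
# Counts are strings with thousands separators; strip the commas and convert
df['count'] = df['count'].apply(lambda x: int(x.replace(",", "")))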
print(df.head())
# status date count
# 0 Active 2015-01-01 775497
# 1 Opening 2015-01-01 40213
# 2 Continuing 2015-01-01 731116
# 3 Closing 2015-01-01 30979
# 4 Active 2015-02-01 778554
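The pd.melt call does the heavy lifting here: it turns one column per month into one row per status and month. Here's a minimal sketch of that reshaping on a made-up two-month table. The numbers and the `wide` and `long` names are placeholders for illustration, not StatsCanada figures:

```python
import pandas as pd

# Hypothetical wide table: one row per status, one column per month.
wide = pd.DataFrame({
    "status": ["Active", "Opening"],
    "January 2015": ["1,000", "100"],
    "February 2015": ["1,050", "120"],
})

# Wide to long: each (status, month) pair becomes its own row.
long = pd.melt(wide, id_vars="status", var_name="date", value_name="count")
long["date"] = pd.to_datetime(long["date"], format="%B %Y")
# Counts arrive as strings with thousands separators, so strip the commas.
long["count"] = long["count"].str.replace(",", "", regex=False).astype(int)

print(long)
```

The long layout is what makes the later filtering by status and by date a one-liner each.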
Step 1 – Graph
If you want to make an animated gif, you first have to make a single frame. Which, coincidentally, is just a graph:
from matplotlib import pyplot as plt
da = df[df["status"] == "Active"]
plt.plot(da["date"], da["count"])
Step 2 – Size
The graph could be bigger, and the y-axis limits could be adjusted. No problem, that’s just two extra lines of code, one to size the figure and one to set the limits:
plt.figure(figsize=(8, 5), dpi=300)
plt.plot(da["date"], da["count"])
plt.ylim([0, da['count'].max() * 1.1])
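One thing worth knowing before the frames start piling up: figsize is in inches and dpi is dots per inch, so together they fix the pixel size of every frame. A quick check, assuming a headless backend so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (an assumption, so no window is needed)
from matplotlib import pyplot as plt

fig = plt.figure(figsize=(8, 5), dpi=300)
# Pixel size is inches times dots per inch.
width_px, height_px = (int(v) for v in fig.get_size_inches() * fig.dpi)
print(width_px, height_px)
plt.close(fig)
```

At these settings each frame is 2400 by 1500 pixels. That's fine for a single chart, but it adds up once there are dozens of frames, so lowering dpi is the easy lever if file size becomes a problem.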
Step 3 – Tick
I like to set the ticks on my graphs manually. You don’t have to, but if you want to:
ymax = int(da['count'].max() * 1.1 // 1)
plt.figure(figsize=(8, 5), dpi=300)
plt.plot(da["date"], da["count"])
plt.ylim([0, ymax])
plt.yticks(range(0, ymax, 200_000))
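To see what that range actually produces, here's a small sketch with a made-up peak value standing in for da['count'].max(). The `peak`, `ticks`, and `labels` names are just for illustration:

```python
# Hypothetical peak value, standing in for da['count'].max().
peak = 800_000

ymax = int(peak * 1.1 // 1)              # 10% headroom, floored to an integer
ticks = list(range(0, ymax, 200_000))    # range() stops before ymax
labels = ["0"] + [f"{t // 1000}K" for t in ticks[1:]]

print(ticks)
print(labels)
```

Because range() needs integer arguments, the floor-and-int step is what lets ymax double as both the axis limit and the tick boundary.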
Step 4 – Label
If someone saw our graph right now, without any context, they’d have no idea what’s going on. Let’s fix that by adding some labels:
plt.figure(figsize=(8, 5), dpi=300)
plt.plot(da["date"], da["count"])
plt.ylim([0, ymax])
plt.yticks(range(0, ymax, 200_000))
plt.title("Active Businesses in Canada (Seasonally Adjusted)")
plt.xlabel("Year")
plt.ylabel("Count")
Step 4.5 – Detour
Our graph is exclusively about “Active” businesses in Canada. Here’s what the “Opening” and “Closing” numbers look like:
dc = df[df["status"] == "Closing"]
do = df[df["status"] == "Opening"]
plt.plot(dc["date"], dc["count"], color='red', label="closing")
plt.plot(do["date"], do["count"], color='green', label="opening")
plt.legend()
Step 5 – Combine to make an animated gif
The “Opening and Closing” graph adds some interesting color to the “Active” data. Let’s combine both with some fancy-pants matplotlib:
rows = 7
figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
grid = plt.GridSpec(
    nrows=rows,
    ncols=1,
    wspace=0,
    hspace=0.5,
    figure=figure
)
main = plt.subplot(grid[:5, 0])
sub = plt.subplot(grid[5:, 0])
main.plot(da["date"], da["count"])
sub.plot(do["date"], do["count"])
sub.plot(dc["date"], dc["count"])
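The slicing on the grid is what sets the proportions: with 7 rows, grid[:5, 0] claims the top five and grid[5:, 0] the bottom two. Here's a minimal sketch that confirms the split, using a headless backend and fig.add_subplot as assumptions of mine rather than anything from the code above:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
from matplotlib import pyplot as plt

fig = plt.figure(figsize=(8, 4))
grid = plt.GridSpec(nrows=7, ncols=1, hspace=0.5, figure=fig)

main = fig.add_subplot(grid[:5, 0])   # rows 0-4: the tall panel
sub = fig.add_subplot(grid[5:, 0])    # rows 5-6: the short panel

# Each axes remembers which grid rows it occupies.
main_rows = list(main.get_subplotspec().rowspan)
sub_rows = list(sub.get_subplotspec().rowspan)
print(main_rows, sub_rows)
plt.close(fig)
```

Changing the split is just a matter of moving the slice boundary, for example grid[:4, 0] and grid[4:, 0] for a 4-to-3 layout.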
Step 6 – Colour
I’m not keen on the colors or spacing of what we have right now. To fix that, and to tidy up the axes while we’re at it, here’s what you’ll need:
figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
grid = plt.GridSpec(
    nrows=rows,
    ncols=1,
    wspace=0,
    hspace=0.75,
    figure=figure
)
main = plt.subplot(grid[:5, 0])
sub = plt.subplot(grid[5:, 0])
main.plot(da["date"], da["count"], color="purple")
sub.plot(do["date"], do["count"], color="blue")
sub.plot(dc["date"], dc["count"], color="red")
main.set_xticks([])
main.set_ylim([0, ymax])
main.set_yticks(range(0, ymax, 200_000))
main.set_yticklabels([0, "200K", "400K", "600K", "800K\nbusinesses"])
sub.set_ylim([0, 110_000])
sub.set_yticks([0, 100_000])
sub.set_yticklabels([0, "100K"])
Step 7 – Refactor
Our graph code is nearly ready to go. We just need to refactor it so that we can take an individual date and build an individual frame for that date. I’ve also added some vlines and fixed the xlims to improve legibility and ensure that the plotting space is consistent across plots:
date = pd.Timestamp("2019-08-01")
xmin = df['date'].min()
xmax = df['date'].max()
dd = df[df["date"] <= date]
dc = dd[dd["status"] == "Closing"]
do = dd[dd["status"] == "Opening"]
da = dd[dd["status"] == "Active"]
figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
grid = plt.GridSpec(
    nrows=rows,
    ncols=1,
    wspace=0,
    hspace=1.25,
    figure=figure
)
main = plt.subplot(grid[:5, 0])
sub = plt.subplot(grid[5:, 0])
main.plot(da["date"], da["count"], color="#457b9d")
main.vlines(date, ymin=0, ymax=1e20, color="#000000")
sub.plot(do["date"], do["count"], color="#a8dadc")
sub.plot(dc["date"], dc["count"], color="#e63946")
sub.vlines(date, ymin=0, ymax=1e20, color="#000000")
main.set_xlim([xmin, xmax])
main.set_xticks([])
main.set_ylim([0, ymax])
main.set_yticks(range(0, ymax, 200_000))
main.set_yticklabels([0, "200K", "400K", "600K", "800K"])
main.set_title("Active Businesses in Canada")
sub.set_xlim([xmin, xmax])
sub.set_xticks([date])
sub.set_xticklabels([date.strftime("%B '%y")])
sub.set_ylim([0, 110_000])
sub.set_yticks([0, 100_000])
sub.set_yticklabels([0, "100K"])
sub.set_title("Businesses Opening and Closing")
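The key move in the refactor is the date filter: each frame only sees the rows up to its own date, while the x-limits stay pinned to the full range. A minimal sketch of that filter and the tick label, on a made-up series (the `series` and `visible` names and the values are placeholders):

```python
import pandas as pd

# Made-up monthly series standing in for one status in df.
series = pd.DataFrame({
    "date": pd.date_range("2019-06-01", periods=4, freq="MS"),
    "count": [10, 12, 11, 13],
})

date = pd.Timestamp("2019-08-01")
visible = series[series["date"] <= date]   # everything up to the frame's date
label = date.strftime("%B '%y")            # tick label for the marker line

print(len(visible), label)
```

Stepping `date` forward one month at a time reveals one more row per frame, which is what makes the line appear to grow.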
Step 7.5 – Functionize to make an animated gif
In order to build a bunch of frames on a bunch of dates, we should wrap our code in a function:
def plot(date):
    dd = df[df["date"] <= date]
    dc = dd[dd["status"] == "Closing"]
    do = dd[dd["status"] == "Opening"]
    da = dd[dd["status"] == "Active"]
    figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
    grid = plt.GridSpec(
        nrows=rows,
        ncols=1,
        wspace=0,
        hspace=1.25,
        figure=figure
    )
    main = plt.subplot(grid[:5, 0])
    sub = plt.subplot(grid[5:, 0])
    main.plot(da["date"], da["count"], color="#457b9d")
    main.vlines(date, ymin=0, ymax=1e20, color="#000000")
    sub.plot(do["date"], do["count"], color="#a8dadc")
    sub.plot(dc["date"], dc["count"], color="#e63946")
    sub.vlines(date, ymin=0, ymax=1e20, color="#000000")
    main.set_xlim([xmin, xmax])
    main.set_xticks([])
    main.set_ylim([0, ymax])
    main.set_yticks(range(0, ymax, 200_000))
    main.set_yticklabels([0, "200K", "400K", "600K", "800K"])
    main.set_title("Active Businesses in Canada")
    sub.set_xlim([xmin, xmax])
    sub.set_xticks([date])
    sub.set_xticklabels([date.strftime("%b '%y")])
    sub.set_ylim([0, 110_000])
    sub.set_yticks([0, 100_000])
    sub.set_yticklabels([0, "100K"])
    sub.set_title("Businesses Opening and Closing")
So that we can build a frame with just one call:
plot(pd.Timestamp("2017-06-01"))
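Each call produces one figure, and an animation is ultimately just a list of such figures turned into images. As a rough sketch of that idea (not necessarily how any particular library does it; `render_frame` and the toy data are assumptions of mine), here's a figure rendered to PNG bytes in memory and then closed so that figures don't pile up:

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
from matplotlib import pyplot as plt

def render_frame(n):
    """Draw the first n points of a toy series and return the frame as PNG bytes."""
    xs = list(range(10))
    ys = [x * x for x in xs]
    fig, ax = plt.subplots(figsize=(4, 2), dpi=100)
    ax.plot(xs[:n], ys[:n])
    ax.set_xlim(0, 9)      # fixed limits keep every frame aligned
    ax.set_ylim(0, 81)
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    plt.close(fig)         # free the figure once it has been rendered
    return buffer.getvalue()

frames = [render_frame(n) for n in range(1, 11)]
print(len(frames), "frames")
```

The fixed axis limits matter for the same reason as the xlims above: if the limits moved with the data, the plot would jump around from frame to frame.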
Step 8 – import gif
To turn static frames into an animated gif, all we have to do now is to install and import the gif package:
import gif
Decorate the plot function with gif.frame:
@gif.frame
def plot(date):
    dd = df[df["date"] <= date]
    dc = dd[dd["status"] == "Closing"]
    do = dd[dd["status"] == "Opening"]
    da = dd[dd["status"] == "Active"]
    figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
    grid = plt.GridSpec(
        nrows=7,
        ncols=1,
        wspace=0,
        hspace=1.25,
        figure=figure
    )
    main = plt.subplot(grid[:5, 0])
    sub = plt.subplot(grid[5:, 0])
    main.plot(da["date"], da["count"], color="#457b9d")
    main.vlines(date, ymin=0, ymax=1e20, color="#000000")
    sub.plot(do["date"], do["count"], color="#a8dadc")
    sub.plot(dc["date"], dc["count"], color="#e63946")
    sub.vlines(date, ymin=0, ymax=1e20, color="#000000")
    main.set_xlim([xmin, xmax])
    main.set_xticks([])
    main.set_ylim([0, ymax])
    main.set_yticks(range(0, ymax, 200_000))
    main.set_yticklabels([0, "200K", "400K", "600K", "800K"])
    main.set_title("Active Businesses in Canada")
    sub.set_xlim([xmin, xmax])
    sub.set_xticks([date])
    sub.set_xticklabels([date.strftime("%b '%y")])
    sub.set_ylim([0, 110_000])
    sub.set_yticks([0, 100_000])
    sub.set_yticklabels([0, "100K"])
    sub.set_title("Businesses Opening and Closing")
Build all the frames:
dates = pd.date_range(df['date'].min(), df['date'].max(), freq="1MS")
frames = [plot(date) for date in dates]
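The "MS" in the frequency string stands for month start, so date_range hands back the first day of every month between the two endpoints. A quick sketch on a hypothetical four-month range (the endpoints here are placeholders, not the dataset's real bounds):

```python
import pandas as pd

# Hypothetical range; the post uses df['date'].min() and df['date'].max().
dates = pd.date_range("2015-01-01", "2015-04-01", freq="MS")
print([d.strftime("%Y-%m-%d") for d in dates])
```

One date in, one frame out, so the length of this range is also the number of frames in the finished animation.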
And save the animation to disk:
gif.save(frames, "businesses.gif", duration=5, unit="s", between="startend")
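Assuming between="startend" means the 5 seconds cover the whole animation rather than each frame (which is how the argument reads), the time each frame stays on screen depends on how many frames there are. A back-of-the-envelope sketch, where `per_frame_ms` is a hypothetical helper and not part of the gif package:

```python
def per_frame_ms(total_seconds, n_frames):
    """Milliseconds each frame is shown if the total is split evenly."""
    return total_seconds * 1000 / n_frames

# Hypothetical frame count; the real one is len(frames).
print(per_frame_ms(5, 60))
```

If the animation feels rushed, raising the duration is the simplest fix. These keyword arguments match the version of the gif package this post was written against; other releases may expose a different save signature, so it's worth checking the documentation for whatever version you have installed.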
Now it’s your turn to find some interesting data and turn it into a gif.