Everybody’s favorite show about bloody power struggles and dragons, Game of Thrones, is back for its seventh season. And since we’re such big GoT fans here, we just had to do a project analyzing data from the hit HBO show. You might not expect it, but the show is rife with data and has been the subject of various projects by data scientists, who, as we all know, love to combine their data powers with their hobbies and interests.
Milan Janosov of the Central European University devised a machine learning algorithm to predict the death of certain characters, a handy tool for any fan tired of being surprised by the show’s shock murders. Dr. Allen Downey, author of the popular Think Stats textbooks, conducted a Bayesian analysis of the characters’ survival rate in the show. Data scientist and biologist Shirin Glander applied social network analysis tools to analyze and visualize the family and house relationships of the characters.
The project we did is quite similar to Glander’s: we’ll be playing around with network analysis, but with data on the murderers and their victims. We constructed a giant network that maps out every murder of characters with minor, recurring, and major roles.
The data comes courtesy of Ændrew Rininsland of The Financial Times, who’s done a great job of collecting, cleaning, and formatting the data. For the purposes of this project, I had to do a whole lot of wrangling and cleaning of my own, in addition to making subjective decisions about which characters to include and what constitutes a murder. My finalized dataset contained a total of 240 murders by 79 killers. For my network graph, the data produced a total of 225 nodes and 173 edges.
The graph in this network is a directed graph, in which each edge denotes a one-way relationship. As you can imagine, that is the most suitable type of network for data in which one character murders another.
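To make the idea of a directed graph concrete, here is a minimal sketch of how killer-to-victim pairs could be loaded into networkx. The `murders` list is a tiny hand-picked sample rather than the full dataset, and the variable names are illustrative assumptions, not the code from this project.

```python
import networkx as nx

# A small hand-picked sample of (killer, victim) pairs.
# The real dataset is far larger; this is only for illustration.
murders = [
    ("Jon Snow", "Qhorin Halfhand"),
    ("Jon Snow", "Janos Slynt"),
    ("Arya Stark", "Meryn Trant"),
    ("Arya Stark", "Walder Frey"),
    ("The Hound", "Mycah"),
    ("Ramsay Bolton", "Roose Bolton"),
]

# DiGraph stores every edge with a direction: killer -> victim.
G = nx.DiGraph()
G.add_edges_from(murders)

print(G.number_of_nodes(), G.number_of_edges())

# The edge exists in one direction only, which is the point of a directed graph.
print(G.has_edge("Arya Stark", "Walder Frey"))
print(G.has_edge("Walder Frey", "Arya Stark"))
```

Each character becomes a node the first time it appears in a pair, so there is no need to add nodes separately.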
Before we get into the network analysis, let’s go over some basic stats from the deaths data.
The gold medal for kills goes to Jon Snow, with Arya Stark claiming silver and The Hound earning bronze. The majority of characters who died were minor, and the rate of death seesawed between seasons until season 6 last year, when it shot up by 125% from season 5. I guess that means the closer we get to winter, the more lives will be lost.
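Stats like these boil down to counting rows and comparing two totals. The sketch below shows one way to do that with Python’s standard library. The rows and the numbers passed to `percent_change` are made up for illustration and are not the real counts from the dataset; the column order (killer, victim, season) is also an assumption.

```python
from collections import Counter

# Hypothetical (killer, victim, season) rows, NOT the real data.
rows = [
    ("Jon Snow", "Victim A", 5),
    ("Jon Snow", "Victim B", 6),
    ("Jon Snow", "Victim C", 6),
    ("Arya Stark", "Victim D", 5),
    ("Arya Stark", "Victim E", 6),
    ("The Hound", "Victim F", 6),
]

# Kill leaderboard: count how often each killer appears.
kills_per_killer = Counter(killer for killer, _, _ in rows)
top_three = kills_per_killer.most_common(3)

# Deaths per season, used for the season-over-season comparison.
deaths_per_season = Counter(season for _, _, season in rows)


def percent_change(old, new):
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100


change = percent_change(deaths_per_season[5], deaths_per_season[6])
print(top_three)
print(change)

# With made-up totals of 40 and 90 deaths, the increase is 125%.
print(percent_change(40, 90))
```

Note that a 125% increase means the later total is 2.25 times the earlier one, not 1.25 times.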
The Game of Thrones Murder Network
Here is the Game of Thrones murder network in all its glory. I recommend zooming in or opening the image in a new window to get a better feel for the relationships in the data.
Each relationship in the network is drawn as a line, and at one end of the line a thicker line is overlaid on it. The thick part of the line is adjacent to the victim. That is networkx’s way of letting you know the direction of the relationship. You can get a better look at this in the following subgraphs.
This network is displayed in the spring layout format, but let’s see how it looks in the random layout.
A bit messy, isn’t it? See if you can untangle the network to discover interesting relationships in the data.
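For anyone who wants to try the two layouts themselves, here is a rough sketch of how the spring and random layouts can be computed and drawn with networkx and matplotlib. The sample edges, the `save_drawing` helper, and the styling values are illustrative assumptions, not the code behind the images in this post. Also note that newer networkx releases draw arrowheads for directed edges rather than the thick line segments described above.

```python
import networkx as nx

# Illustrative sample edges (killer -> victim), not the full dataset.
G = nx.DiGraph([
    ("Jon Snow", "Qhorin Halfhand"),
    ("Jon Snow", "Janos Slynt"),
    ("Arya Stark", "Meryn Trant"),
    ("Arya Stark", "Walder Frey"),
    ("The Hound", "Mycah"),
])

# Each layout function returns a dict mapping node -> (x, y) coordinates.
# A fixed seed keeps the picture the same from run to run.
pos_spring = nx.spring_layout(G, seed=42)
pos_random = nx.random_layout(G, seed=42)


def save_drawing(graph, pos, path):
    """Hypothetical helper: draw the graph at the given positions and save it."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 8))
    nx.draw_networkx(graph, pos, ax=ax, node_size=300, font_size=8, arrows=True)
    ax.set_axis_off()
    fig.savefig(path, dpi=150)
    plt.close(fig)
```

Calling `save_drawing(G, pos_spring, "spring.png")` and `save_drawing(G, pos_random, "random.png")` would write one image per layout; the file names are placeholders. The spring layout pulls connected characters together, which is why it tends to read more cleanly than scattering nodes at random.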
To dial the messiness back, let’s focus on the murder networks for the top three murderers.
Much better, right?
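Per-killer views like these can be carved out of the full graph. The sketch below ranks characters by out-degree (the number of outgoing edges, which here means distinct victims) and extracts a subgraph for each of the top three. The sample edges and the `murder_subgraph` helper are illustrative assumptions rather than the exact code used for this post.

```python
import networkx as nx

# Illustrative sample edges (killer -> victim), not the full dataset.
G = nx.DiGraph([
    ("Jon Snow", "Qhorin Halfhand"),
    ("Jon Snow", "Janos Slynt"),
    ("Jon Snow", "Alliser Thorne"),
    ("Arya Stark", "Meryn Trant"),
    ("Arya Stark", "Walder Frey"),
    ("The Hound", "Mycah"),
])

# Rank characters by out-degree, highest first, and keep the top three.
ranked = sorted(G.out_degree(), key=lambda pair: pair[1], reverse=True)
top_killers = [name for name, _ in ranked[:3]]


def murder_subgraph(graph, killer):
    """Hypothetical helper: the killer, their victims, and the edges between them."""
    return graph.subgraph([killer, *graph.successors(killer)])


subgraphs = {killer: murder_subgraph(G, killer) for killer in top_killers}

for killer, sub in subgraphs.items():
    print(killer, sub.number_of_nodes(), sub.number_of_edges())
```

Each subgraph can then be drawn on its own, which keeps the picture small enough to read the direction of every edge.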
Let us know in the comments if you see anything that catches your eye. And if you’d like to get your hands on the data that we used to create this graph, it’s posted here in this GitHub repo.
I'm a journalist turned data scientist/journalist hybrid. Looking for opportunities in data science and/or journalism. Impossibly curious and passionate about learning new things. Before completing the Metis Data Science Bootcamp, I worked as a freelance journalist in San Francisco for Vice, Salon, SF Weekly, San Francisco Magazine, and more. I've referred to myself as a 'Swiss-Army knife' journalist and have written about a variety of topics ranging from tech to music to politics. Before getting into journalism, I graduated from Occidental College with a Bachelor of Arts in Economics. I chose to do the Metis Data Science Bootcamp to pursue my goal of using data science in journalism, which inspired me to focus my final project on being able to better understand the problem of police-related violence in America. Here is the repo with my code and presentation for my final project: https://github.com/GeorgeMcIntire/metis_final_project.