Millennials are Less Likely to Divorce

Millennials are getting married later than previous generations, as I wrote about here.  But the ones who get married are no more likely to divorce during the first 10 years, and after that they might be substantially less likely to get divorced. The following figure shows estimates for the fraction of people who have not […]

Language pitch

Here’s a fun analysis that I did of the pitch (a.k.a. frequency) of various languages. Certain languages are simply pronounced with lower or higher pitch. Whether this is a feature of the language or more a cultural thing is a good question, but there are some substantial differences between languages. Hertz (or Hz, or s⁻¹), […]

Student Loans: a Subprime Time-bomb for the US Government?

Exploratory Data Analysis Visualization Project contributed by James Stebbins – Data Science Student in the NYC Data Science Academy Bootcamp. Are Student Loans a Subprime Time-bomb for the US Government? There is overwhelming concern among politicians, professionals, and students that the current student loan market may be the next soaring hot air balloon primed to […]

Visualizing Professional Tennis Upsets: ATP 2012-2014 Men’s Singles Matches

Exploratory Data Analysis Visualization Project contributed by Tyler Knutson – Data Science Student in the NYC Data Science Academy Bootcamp. Context: Men’s professional tennis is unique in that, despite the dominance of a select few competitors at the top of the ATP world rankings, upsets do occur regularly. How dominant are these top players? Consider […]

Inside Serial Killer Data: Part Two

This is the second part of a two-part series on serial killer data. To learn more about the origins of this data, check out part one here. One of the best things about this dataset is that it includes detailed information on the victims and not just the killers. The data […]

Inside Serial Killer Data: Part One

Have you wondered about serial killer data? Have you asked yourself “What do the demographics of serial killers look like?” or “Are there correlations between certain types of killing methods and motivations for killing?” Well, you’re in luck, because we’ve gotten our hands on some juicy serial killer data featuring pretty much anything you’ve ever wanted […]

Visualizing the Relationship Between Infant Mortality Rates & Resource Availability

Introduction: We know that war and civil unrest account for a significant proportion of deaths every year, but how much can mortality rates be attributed to a simple lack of basic resources and amenities, and what relationship do mortality rates have with such factors? That’s what I set out to uncover using World Bank data that […]

Beyond One-hot: an Exploration of Categorical Variables

In machine learning, data is king. The algorithms and models used to make predictions with the data are important, and very interesting, but ML is still subject to the idea of garbage-in-garbage-out. With that in mind, let’s look at a little subset of those input data: categorical variables. Categorical variables (wiki) are those that represent a […]

The Pressure Cooker: Population Density and Crime

Do Higher Population Densities Increase Crime? Crime, particularly violent crime, is always prevalent in the public consciousness. At the same time, the UN reported in 2014 that population densities and the prevalence of urban areas continue to increase, with more than half the world’s population living in urban areas for the first time in history. The […]