xda: R package for exploratory data analysis

This package contains several tools to perform initial exploratory analysis on any input dataset. It includes custom functions for plotting the data as well as performing different kinds of analyses, such as univariate, bivariate, and multivariate investigation, which is the first step of any predictive modeling pipeline. This package can be used to get a […]

Millennials are Less Likely to Divorce

Millennials are getting married later than previous generations, as I wrote about here. But the ones who get married are no more likely to divorce during the first 10 years, and after that they might be substantially less likely to get divorced. The following figure shows estimates for the fraction of people who have not […]

Language pitch

Here’s a fun analysis that I did of the pitch (a.k.a. frequency) of various languages. Certain languages are simply pronounced with lower or higher pitch. Whether this is a feature of the language or more a cultural thing is a good question, but there are some substantial differences between languages. Hertz (or Hz, or s⁻¹), […]

Student Loans: a Subprime Time-bomb for the US Government?

Exploratory Data Analysis Visualization Project contributed by James Stebbins – Data Science Student in the NYC Data Science Academy Bootcamp. Are Student Loans a Subprime Time-bomb for the US Government? There is overwhelming concern among politicians, professionals, and students that the current student loan market may be the next soaring hot air balloon primed to […]

Visualizing Professional Tennis Upsets: ATP 2012-2014 Men’s Singles Matches

Exploratory Data Analysis Visualization Project contributed by Tyler Knutson – Data Science Student in the NYC Data Science Academy Bootcamp. Context: Men’s professional tennis is unique in that despite the dominance of a select few competitors at the top of the ATP world rankings, upsets do occur regularly. How dominant are these top players? Consider […]

Inside Serial Killer Data: Part Two

This is the second part of a two-part series on serial killer data. To learn more about the origins of this data, check out part one here. One of the best things about this dataset is that it includes detailed information on the victims and not just the killers. The data […]

Inside Serial Killer Data: Part One

Have you wondered about serial killer data? Have you asked yourself “What do the demographics of serial killers look like?” or “Are there correlations between certain types of killing methods and motivations for killing?” Well, you’re in luck, because we’ve gotten our hands on some juicy serial killer data featuring pretty much anything you’ve ever wanted […]

Visualizing the Relationship Between Infant Mortality Rates & Resource Availability

Introduction: We know that war and civil unrest account for a significant proportion of deaths every year, but how much can mortality rates be attributed to a simple lack of basic resources and amenities, and what relationship do mortality rates have with such factors? That’s what I set out to uncover using WorldBank data that […]

Beyond One-hot: an Exploration of Categorical Variables

In machine learning, data is king. The algorithms and models used to make predictions with the data are important, and very interesting, but ML is still subject to the idea of garbage-in-garbage-out. With that in mind, let’s look at a little subset of those input data: categorical variables. Categorical variables (wiki) are those that represent a […]