Garbage In is Garbage Out; How Big Data Scientists Can Benefit from Human Judgment

The quality of your data determines the quality of your insights from that data. Of course, the quality of your data models and algorithms has an impact on your results as well, but in general it is garbage in, garbage out. Therefore, (Total) Data Quality Management (DQM) or Master Data Management (MDM) have been around for a very long time and ...

On Building a “Fake News” Classification Model *update

“A lie gets halfway around the world before the truth has a chance to get its pants on.” - Winston Churchill. Since the 2016 presidential election, one topic dominating political discourse is the issue of “Fake News”. A number of political pundits claim that the rise of significantly biased and/or untrue news influenced the election, ...

How the ODSC March Madness Bracket Fared Over the First Weekend

Last week we published an article about our March Madness bracket. In it we described the data science methods we used to predict the outcomes of all 63 games. Here's how our bracket did after the first two rounds and 48 games. Our bracket did pretty poorly. After a hugely promising start, in which we correctly predicted 15 out of 16 games on ...

Q&A with Andreas Mueller, ML vs DL

Editor's Note: Andreas Mueller is one of the core contributors to scikit-learn, the open source Python-based machine learning tool. Over the past few years, the buzz around scikit-learn has grown significantly. Enterprises such as Spotify, Evernote, and Booking.com each use it. With its constantly improving user interface and computing speed we don't imagine ...

The Official Open Data Science March Madness Bracket

Today is the first day of the most exciting event in sports. That's right, I'm talking about the NCAA Basketball Tourney, aka "March Madness." And we here at Open Data Science have totally caught March Madness fever and have decided to try our hand at making a bracket of predictions. Since we're in the business of data science, we'll be ...

Maps and Sets can have Quadratic-time Performance

Swift is a new programming language launched by Apple slightly over two years ago. Like C and C++, it offers ahead-of-time compilation to native code, but with many modern features. It is available on Linux and macOS. Like C++, Swift comes complete with its own data structures, such as dictionaries (key-value or associative maps) and sets. It ...

The SPHERE Challenge: Activity Recognition with Multimodal Sensor Data

This is a guest post by Niall Twomey of the SPHERE project's machine learning team. You can download the notebook here to follow along. Enjoy! Welcome to the SPHERE Challenge! We are very excited to be working with DrivenData, the ECML-PKDD conference, and the AARP Foundation on this challenge! My name is Niall Twomey, and ...

How to Save Scikit-Learn Models with Python Pickle Library

Save the trained scikit-learn models with Python Pickle. The final and most exciting phase in the journey of solving data science problems is seeing how well the trained model performs on the test dataset or in production. In some cases, the trained model's results exceed our expectations. Sometimes the trained model ...

Predict Flight Delays with Apache Spark MLLib, FlightStats, and Weather Data

Flight delays are an inconvenience. Wouldn’t it be great to predict how likely a flight is to be delayed? You could remove uncertainty and let travelers plan ahead. Usually, the weather is to blame for delays. So I’ve crafted an analytics solution based on weather data and past flight performance. This solution takes weather info from Weather ...