What Is Predictive Analytics (and Why Do You Need It)?

Try this statistic on for size: The 500 petabytes of digital healthcare data that existed in 2012 is predicted to reach 25,000 petabytes by the year 2020. That’s a 50-fold increase in just eight years! Healthcare marketers may be swimming in data, but what’s important is to […]

Amazon will make $41B this Holiday Season! Forecasting Quarterly Revenue

The holiday shopping season is in full swing! The economy is relatively strong compared to a few years back, so retail sales are probably going to be strong, especially for Amazon. Other retailers like Target and Wal-Mart are also running amazing Black Friday and holiday sales to attract customers. However, Amazon has consistently shown […]

Ad Hoc Distributed Random Forests #4

When arrays and dataframes aren’t flexible enough. TL;DR: Dask.distributed lets you submit individual tasks to the cluster. We use this ability combined with scikit-learn to train and run a distributed random forest on distributed tabular NYC Taxi data. Our machine learning model does not perform well, but we do learn how to execute ad-hoc […]

Predictive Modeling, Supervised Machine Learning, and Pattern Classification

When I was working on my next pattern classification application, I realized that it might be worthwhile to take a step back and look at the big picture of pattern classification in order to put my previous topics into context and to provide an introduction for the future topics that are going to follow. Pattern […]

Predictive analytics is not enough

The idea of predictive analytics can seem like magic: how, really, can a computer predict the future? Yet we’ve seen a lot of success based on this advanced technology in recent years, from Netflix to Amazon, Google, and more. These companies mine a massive amount of data every day for patterns, and it drives massive […]

Deutsch Credit Future Telling: part 2

To continue on this first path, it’s logical to proceed with hyperparameter tuning on the three algorithms previously mentioned in part 1. Here the Random Forest Classifier (R.F.C) pulls ahead with 77% accuracy while the other two are still around 75%. Where there were three on this road, there is now one. The next step […]

Deutsch Credit Future Telling: part 1

Classification tasks come up frequently in data science, but the hardest are those with unbalanced classes. From biology to finance, the real-life situations are numerous. Before balancing your errors, establishing a baseline with the most frequent occurrence can give you over 90% accuracy right off the bat. The question of whether it is worse to have […]

Win Customer Loyalty with Predictive Analytics

Winning your customer for life is a challenging task for organizations. How can you connect with your customer, and how can you ensure that they stay with your organization for a long time? These are questions that many organizations face. Fortunately, with the advance of big data and analytics, it has become a little bit easier for […]

Prediction Machine Designed with Spark, Kudu, and Impala

This was originally posted on the Silicon Valley Data Science blog. Why should your infrastructure maintain a linear growth pattern when your business scales up and down during the day based on natural human cycles? There is an obvious need to maintain a steady baseline infrastructure to keep the lights on for your business, but […]