How Entertainment and Social Media Giants are Using Machine Learning
Major names in social media didn’t get there by accident. In addition to their excellent products, marketing, and sales strategies, machine learning is a huge part of the backbone that makes many of their processes successful. Facebook and Twitter, among other names you’ve definitely heard of, have become the powerhouses... Read more
Exploring Scikit-Learn Further: The Bells and Whistles of Preprocessing
In my previous post, we constructed a simple cross-validated regression model using Scikit-Learn in 35 lines. It’s pretty amazing that we can perform machine learning with so little effort, but we just did the bare minimum in order to get a working model. Frankly, it didn’t even perform that well.... Read more
Three Ways Researchers are Using Data Science for Good
Data experts have long identified marginalization and narrow-minded problem solving as some of the biggest challenges facing data science. When large technology enterprises only seek solutions to problems they face within their company and their communities, it exacerbates inequalities. But companies, nonprofits, and individuals across the globe are making... Read more
What Model Should I Choose for My Data Science Project?
What to ask yourself when you’re balancing model performance, interpretability, and other costs. It might seem silly to bother doing anything other than build the best black-box machine learning model possible, as long as it gets good performance. That makes perfect sense on personal projects and Kaggle competitions. But it’s an... Read more
K-Means Clustering Applied to GIS Data
GIS can be intimidating to data scientists who haven’t tried it before, especially when it comes to analytics. On its face, mapmaking seems like a huge undertaking. Plus, esoteric lingo and strange data file encodings can create a significant barrier to entry for newbies. There’s a reason why there are experts who... Read more
Assessment Metrics for Clustering Algorithms
Assessing the quality of your model is one of the most important considerations when deploying any machine learning algorithm. For supervised learning problems, this is easy. There are already labels for every example, so the practitioner can test the model’s performance on a reserved evaluation set. We don’t have... Read more
Three Challenges for Open Data Science
There are three types of lies: lies, damned lies, and ‘big data.’ That’s the message with which Amazon machine learning director Neil Lawrence opened his ODSC Europe 2016 lecture before laying out the three largest challenges for open data science and our data-centered society. As Lawrence sees it, those challenges... Read more
Comparing Features of 4 Popular Machine Learning Platforms
Machine learning, both the term and the technology, has been important to computational applications for years. Arthur Samuel coined the term “machine learning” in 1959. Machine learning is basically a part of artificial intelligence that evolves through the fields of... Read more
The Potential of Communities and Machine Learning for Good
For any technology to be successful, it needs to move from the early adopter market segment to the majority, i.e., crossing the chasm. Up until now, machine learning has been primarily in the hype phase and adoption has been mostly driven by the early adopters and innovators. I envision that... Read more
Machine Learning for Beginners – a How-to Guide
Addressing the audience at Open Data Science Conference 2017 in Boston, Kirill Eremenko and Hadelin de Ponteves walked listeners through a broad collection of machine learning techniques, explaining the basics behind each to get users off the ground. This is a great tutorial for getting... Read more