10 Tips to Get Started with Kaggle
Kaggle is a well-known community website where data scientists compete in machine learning challenges. Competitive machine learning can be a great way to hone your skills as well as demonstrate them. In this article, I will provide 10 useful tips to get started with Kaggle and get... Read more
The Data Scientist’s Holy Grail – Labeled Data Sets
The Holy Grail for data scientists is the ability to obtain labeled data sets for the purpose of training a supervised machine learning algorithm. An algorithm’s ability to “learn” is based on training it using a labeled training set – having known response variable values that correspond to a... Read more
Mail Processing with Deep Learning: A Case Study
Businesses increasingly delegate simple, boring, and repetitive tasks to artificial intelligence. In a case study, Alexandre Hubert — lead data scientist of software company Dataiku’s U.K. operations — worked on a team of three to automate mail processing with deep learning. At ODSC Europe 2018, Hubert detailed how his team... Read more
Thomas Wiecki of Quantopian on ‘Minding the Gap’ Between Statistics and Machine Learning at ODSC Europe 2018
Key Takeaways: It’s important for data scientists to understand the so-called “gap” between statistics and machine learning, and how there actually is a lot of commonality between the two; it’s just a matter of how you look at things. PyMC3 is a very useful probabilistic programming framework for Python.... Read more
Active Learning: Your Model’s New Personal Trainer
First, some facts. Fact: active learning is not just another name for reinforcement learning; active learning is not a model; and no, active learning is not deep learning. What active learning is and why it may be an important component of your next machine learning project was the subject... Read more
Three Machine Learning Practices That Keep Your Identity Safe
As privacy concerns escalate in the age of big data, developers constantly evolve artificial intelligence and machine learning techniques to protect individuals’ identities. Machine learning systems enable businesses to more effectively identify fraud and keep user information safe. These systems gather data that can provide much more powerful insights... Read more
An Introduction to Reinforcement Learning Concepts
Individuals interested in reinforcement learning crowded into a room at ODSC Europe 2018. There, Badoo’s lead data scientist Leonardo De Marchi hosted a four-hour workshop to guide attendees through the first steps. What is reinforcement learning? Reinforcement learning is one machine learning approach. Most people know of supervised and unsupervised learning.... Read more
Client-side Web Development and Machine Learning
You might not expect client-side web development and machine learning to be in the same sentence. In this article, however, we’re going to look at how and why these two are beginning to collaborate rather successfully. There are many hidden uses for a collaboration between JavaScript and machine learning.... Read more
Crash Course: Pool-Based Sampling in Active Learning
Active learning is a class of machine learning problems where labeled data isn’t available for supervised algorithms. Let’s take the classic setup as an example. Say we have pictures of birds and want to classify them by type, but the images don’t have labels for what kind of bird... Read more
Classic Regularization Techniques in Neural Networks
Neural networks are notoriously tricky to optimize. There isn’t a way to compute a global optimum for weight parameters, so we’re left fishing around in the dark for acceptable solutions while trying to ensure we don’t overfit the data. This is a quick overview of the most popular approaches... Read more