How to Create Nonlinear Models with Data Projection
Linear models will take you far as a machine learning practitioner — much further than most rookies would expect. I previously wrote that complex approaches like neural nets are a great way to shoot yourself in the foot. That’s especially true if you lack the data to justify nonlinear... Read more
5 Essential Neural Network Algorithms
Data scientists use many different algorithms to train neural networks, and there are many variations of each. In this article, I will outline five algorithms that will give you a rounded understanding of how neural networks operate. I will start with an overview of how a neural network works,... Read more
Dimensionality Reduction with Principal Component Analysis, and a Mallet
Bigger isn’t always better. High-dimensional data can quickly become unmanageable, depending on your resource constraints and what you plan to use it for. Data with 150 columns isn’t going to get you very far if you don’t have the computing capacity to analyze it, or if you don’t even know... Read more
The Importance of Processing Data the Right Way
Training a neural network involves many choices that affect its performance. Many data scientists spend too much time thinking about learning rates, neuron structures, and epochs before making sure their data is correctly prepared. Without properly formatted data, your neural network will be useless, regardless of the... Read more
Feature Engineering with Forward and Backward Elimination
Sometimes when you fit models to test their predictive accuracy, you find that you’re dealing with too many predictors (feature variables). You can draw upon your domain knowledge, or that of an available domain expert, to reduce predictors until you only have those that will offer your model superior... Read more
AI-Identified Health Policies, Hate Speech Detection Among September Industry Research
September has been an impressive month for data science research. Here, we highlight a few innovative and explosive studies released on the arXiv research aggregator out of Cornell University Library. This research dives into some of the most important facets of data science today, including deep learning, machine learning,... Read more
Mine Like Amazon with Market Basket Analysis
Pattern mining is an incredibly simple but powerful technique for discovering co-occurrences in large datasets. The most common approach to finding those patterns is Market Basket Analysis, which is frequently cited as the method Amazon uses for its “users also purchased” feature. Of course, that’s a dramatic oversimplification.... Read more
Is Texas Hold’em Poker the Newest Game AI Has Defeated?
Experimenting with games is an excellent way to improve our understanding of deep learning. In recent years, AI has famously become the best player in a number of games, such as Go, backgammon, and chess. However, all of these are games of perfect information. For each player, nothing... Read more
How Security Agencies Use AI to Stop Crime
Crime is a social problem. It is therefore not the kind you might expect machine learning to solve, or even improve. Unlike the many problems that machine learning has already cracked, social problems tend to include more anomalies, inconsistencies, and unexplained results. However, most crime — from the perpetrator’s... Read more
Assessment Metrics for Clustering Algorithms
Assessing the quality of your model is one of the most important considerations when deploying any machine learning algorithm. For supervised learning problems, this is easy. There are already labels for every example, so the practitioner can test the model’s performance on a reserved evaluation set. We don’t have... Read more
Open Data Science - Your News Source for AI, Machine Learning & more