Optimizing Hyperparameters for Random Forest Algorithms in scikit-learn
Optimizing hyperparameters for machine learning models is a key step in making accurate predictions. Hyperparameters define characteristics of the model that can impact model accuracy and computational efficiency. They are typically set prior to fitting the model to the data. In contrast, parameters are values estimated... Read more
Transforming Skewed Data for Machine Learning
Skewed data is common in data science; skew is the degree of distortion from a normal distribution. For example, below is a plot of the house prices from Kaggle’s House Price Competition that is right skewed, meaning there are a minority of very large values. Why... Read more
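As a rough illustration of the idea in this teaser, the sketch below measures skew before and after a log transform. The price values are hypothetical (not the Kaggle data), the `skewness` helper is written only for this example, and the log transform is one common choice assumed here rather than the method the full post necessarily uses.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


# Hypothetical right-skewed prices: most are moderate, one is very large.
prices = [120_000, 135_000, 150_000, 160_000, 175_000, 190_000, 210_000, 755_000]

# log1p compresses large values more than small ones, pulling in the right tail.
log_prices = [math.log1p(p) for p in prices]

raw_skew = skewness(prices)
log_skew = skewness(log_prices)
print(f"skew before: {raw_skew:.2f}, after log1p: {log_skew:.2f}")
```

A positive skewness indicates a right tail; on this sample the value should shrink after the transform, although a single extreme value can still leave some skew behind.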
Essential Machine Learning with Linear Models in RAPIDS: Part 1 of a Series
This blog is the first in a series about regression analysis in RAPIDS, an open GPU data science platform. There are many varieties of regression techniques, and we’re working to include them all in RAPIDS. In this blog edition, I use Ordinary Least Squares (OLS) and... Read more
Using RAPIDS with PyTorch
In this post we take a look at how to use cuDF, the RAPIDS dataframe library, to do some of the preprocessing steps required to get the mortgage data in a format that PyTorch can process so that we can explore the performance of deep learning on... Read more
The Empirical Derivation of the Bayesian Formula
Deep learning has been made practical through modern computing power, but it is not the only technique benefiting from this large increase in power. Bayesian inference is an up-and-coming technique whose recent progress is also powered by the increase in computing power. We can explain the... Read more
Using Auto-sklearn for More Efficient Model Training
Applying a machine learning algorithm to any number of data-related tasks can be an enormous time saver, but the variable factors associated with creating an algorithm can be daunting. One must consider a variety of design-related decisions, and the risks surrounding the creation of an accurate... Read more
Strategies for Addressing Class Imbalance
Class imbalance is common in real-world datasets. For example, a dataset with examples of credit card fraud will often have far more records of non-fraudulent activity than of fraudulent cases. In many applications, training your model on imbalanced classes can inhibit model functionality if predictive... Read more
Building Your First Bayesian Model in R
Bayesian models offer a method for making probabilistic predictions about the state of the world. Key advantages over a frequentist framework include the ability to incorporate prior information into the analysis, estimate missing values along with parameter values, and make statements about the probability of a... Read more
The Best Machine Learning Research of 2019 So Far
The uses of machine learning are expanding rapidly. Already in 2019, significant research has been done in exploring new vistas for the use of this technology. Gathered below is a list of some of the most exciting research that has been undertaken in the realm of... Read more
Watch: Challenges and Opportunities in Applying Machine Learning
There are many opportunities in applying machine learning, whether as an individual developer or in a business. But how do you get started? This talk provides an overview that separates fact from fiction and proposes processes to find opportunities for applying ML. This includes understanding where... Read more