Modeling Classification Trees
Decision trees (DTs) are one of the most popular algorithms in machine learning: they are easy to visualize, highly interpretable, super flexible, and can be applied to both classification and regression problems. DTs predict the value of a target variable by learning simple decision rules inferred from the data...
How To Build A Spam Classifier Using Decision Tree
In the realm of Supervised Learning, there are tons of classifiers, including Logistic Regression (logit 101 and logit 102), LDA, Naive Bayes, SVM, KNN, Random Forest, Neural Networks, and many more arriving each day! The real question that all data scientists...
RAPIDS Forest Inference Library: Prediction at 100 Million Rows per Second
Random forests (RF) and gradient-boosted decision trees (GBDTs) have become workhorse models of applied machine learning. XGBoost and LightGBM, popular packages implementing GBDT models, consistently rank among the most commonly used tools by data scientists on the Kaggle platform. We see similar interest in forest-based models in industry, where...
The Complete Guide to Decision Trees (part 2)
(See part 1 here.) Now you may ask yourself: how do DTs know which features to select and how to split the data? To understand that, we need to get into some details. All DTs perform essentially the same task: they examine all the attributes of the dataset to...
The Complete Guide to Decision Trees (part 1)
Getting started with Machine Learning (ML) can be intimidating. Terms like “Gradient Descent”, “Latent Dirichlet Allocation” or “Convolutional Layer” can scare lots of people. But there are friendly ways of getting into the discipline, and I think starting with this guide to decision trees is a wise decision....