I was analytically betwixt and between a few weeks ago. Most of my Jupyter Notebook work is done in either Python or R. Indeed, I like to self-demonstrate the power of each platform by recoding R work in Python and vice versa. I must have a dozen...
Class imbalance is common in real-world datasets. For example, a credit card fraud dataset will often contain far more records of non-fraudulent activity than of fraudulent cases. In many applications, training your model on imbalanced classes can inhibit model functionality if predictive...
Logistic regression was once the most popular machine learning algorithm, but the advent of more accurate algorithms for classification such as support vector machines, random forest, and neural networks has induced some machine learning engineers to view logistic regression as obsolete. Though it may have been...
For a data scientist without an eye for design, creating visualizations from scratch might be a difficult task. But as is the case with most problems, a solution awaits thanks to Python. Those drawn to using Python for data analysis have been spoiled, as more advanced...
Let’s say you want to classify hundreds (or thousands) of documents based on their content and topics, or you wish to group together different images for some reason. Or, going a step further, let’s say you have that same data already classified but you want to challenge...
Computer vision is a huge part of the data science/AI domain. Sometimes, computer vision engineers have to deal with videos. Here, we aim to shed light on video processing – using Python, of course. This might be obvious for some, but nevertheless, video streaming is not...
For college basketball junkies like me, the season is now shifting into high gear as teams begin serious conference play. At the end of the regular season and conference tournaments, 66 D1 teams (32 league champions and 34 at large) will receive invitations to...
Key Takeaways: It’s important to describe missing data and the challenges it poses. You need to clarify the confusing terminology that further adds to the field’s complexity. You should take the time to review methods for handling missing data. You need to learn how to apply...
In my previous post, we constructed a simple cross-validated regression model using Scikit-Learn in 35 lines. It’s pretty amazing that we can perform machine learning with so little effort, but we just did the bare minimum in order to get a working model. Frankly, it didn’t even...
Scikit-Learn is one of the premier tools in the machine learning community, used by academics and industry professionals alike. At ODSC East 2019, Scikit-Learn author Andreas Mueller will host a training session to give beginners a crash course. This is your guide to scikit-learn. As one...