While you wait for that to finish, can I interest you in parallel processing?
caret has been able to utilize parallel processing for some time (before it was on CRAN in October 2007) using slightly different versions of the package. Around September of 2011, caret started using the foreach package to “harmonize” the parallel processing technologies, thanks to a super smart guy named Steve Weston.... Read more
My big obsession of 2018 so far is the sports prediction platform Throne AI. There’s no better way to describe it than as Kaggle for sports. The platform provides users with data that they use to build models to predict the outcome of sports matches. Each league on... Read more
It’s been a couple of weeks since I got accepted into the closed beta testing programme for IBM Data Science Experience (DSX), and it is about time I share my thoughts on this offering. DSX is a new product, which IBM is positioning as a new generation Data Science... Read more
In a previous article, we discussed the origin story and history of the Python deep learning library TensorFlow. It has experienced a monumental rise like nothing seen before: just two years after its debut, it holds the title of the most forked repo on GitHub.... Read more
You weren’t supposed to actually implement it, Google
Last month, I wrote a blog post warning about how, if you follow popular trends in NLP, you can easily accidentally make a classifier that is pretty racist. To demonstrate this, I included the very simple code, as a “cautionary tutorial.” The post got a fair amount... Read more
When I talk to young data science graduates, I often feel that they can train a deep learning model in 5 minutes, but have no idea where to go from there. Before and after the model training and evaluation part, there is this big grey area... Read more
The ImageNet challenges play an important role in the development of computer vision. The great success of neural nets on ImageNet has contributed to general fervor around artificial intelligence. While the applied breakthroughs are real, issues with ImageNet and modern networks indicate gaps between current practice and intelligent perception. For... Read more
In this post, I’ll tell you how to geolocate your analysis using Geopy. Geopy is a Python 2 and 3 library that provides connections to the most popular geocoding services. Why bother to geolocate your data? Because if you use latitude and longitude data, you... Read more
Time series classification with Tensorflow
Time-series data arise in many fields, including finance, signal processing, speech recognition and medicine. A standard approach to time-series problems usually requires manual engineering of features, which can then be fed into a machine learning algorithm. Engineering of features generally requires some domain knowledge of the... Read more
Julia 0.5 Highlights
To follow along with the examples in this blog post and run them live, you can go to JuliaBox, create a free login, and open the “Julia 0.5 Highlights” notebook under “What’s New in 0.5”. The notebook can also be downloaded from here. Julia 0.5 is... Read more