My big obsession of 2018 so far is the sports prediction platform Throne AI. There’s no better way to describe it than as Kaggle for sports. The platform provides users with data that they use to build models to predict the outcome of sports matches. Each league on Throne AI counts... Read more
It’s been a couple of weeks since I got accepted into the closed beta testing programme for IBM Data Science Experience (DSX), and it is about time I shared my thoughts on this offering. DSX is a new product, which IBM is positioning as a new generation Data Science development and training... Read more
In a previous article, we discussed the origin story and history of the Python deep learning library TensorFlow. It has experienced a monumental rise like nothing seen before: in just two years since its debut, it has come to hold the title of the most forked repo on GitHub. TensorFlow’s significance doesn’t... Read more
You weren’t supposed to actually implement it, Google
Last month, I wrote a blog post warning that, if you follow popular trends in NLP, it is easy to accidentally make a classifier that is pretty racist. To demonstrate this, I included the very simple code, as a “cautionary tutorial.” The post got a fair amount of reaction. Much... Read more
When I talk to young data science graduates, I often feel that they can train a deep learning model in 5 minutes, but have no idea where to go from there. Before and after the model training and evaluation part, there is this big grey area where ideas are... Read more
The ImageNet challenges play an important role in the development of computer vision. The great success of neural nets on ImageNet has contributed to general fervor around artificial intelligence. While the applied breakthroughs are real, issues with ImageNet and modern networks indicate gaps between current practice and intelligent perception. For this investigation I... Read more
In this post, I’ll tell you how to geolocate your analysis using Geopy. Geopy is a Python 2 and 3 library that provides connections to the most popular geocoding services. Why bother to geolocate your data? Because if you use latitude and longitude data, you can visualize your... Read more
Time series classification with TensorFlow
Time-series data arise in many fields including finance, signal processing, speech recognition and medicine. A standard approach to time-series problems usually requires manual engineering of features, which can then be fed into a machine learning algorithm. Engineering of features generally requires some domain knowledge of the discipline where the... Read more
Julia 0.5 Highlights
To follow along with the examples in this blog post and run them live, you can go to JuliaBox, create a free login, and open the “Julia 0.5 Highlights” notebook under “What’s New in 0.5”. The notebook can also be downloaded from here. Julia 0.5 is a pivotal release.... Read more
TensorFlow Clusters: Questions and Code
One way to think about TensorFlow is as a framework for distributed computing. I’ve suggested that TensorFlow is a distributed virtual machine. As such, it offers a lot of flexibility. TensorFlow also suggests some conventions that make writing programs for distributed computation tractable. When is there a cluster? A... Read more