Deep Learning as the apotheosis of Test-Driven Development
Even if you aren’t interested in data science, Deep Learning is an interesting programming paradigm; you can see it as “doing test-driven development with a ludicrously large number of tests, an IDE that writes most of the code, and a forgiving client.” No wonder everybody’s pouring so much money... Read more
Transfer Learning – Machine Learning’s Next Frontier
Table of contents: What is Transfer Learning? Why Transfer Learning Now? A Definition of Transfer Learning; Transfer Learning Scenarios; Applications of Transfer Learning; Learning from simulations; Adapting to new domains; Transferring knowledge across languages; Transfer Learning Methods; Using pre-trained CNN features; Learning domain-invariant representations; Making representations more similar; Confusing... Read more
Data Readiness Levels: Turning Data from Pallid to Vivid
Application of models to data is fraught. You are faced with collaborators who sometimes have a very basic understanding of the complications of collating, processing and curating data. Challenges include: poor data collection practices, missing values, inconvenient storage mechanisms, intellectual property, security and privacy. All these aspects obstruct the... Read more
Cognitive Machine Learning (1): Learning to Explain
Above is an image of the Zaamenkomst panel: one of the best remaining exemplars of rock art from the San people of Southern Africa. As soon as you see it, you are inevitably herded, like the eland in the scene, through a series of thoughts. Does it have a meaning? Why are the eland running?... Read more
Faster deep learning with GPUs and Theano
Originally posted by Manojit Nandi, Data Scientist at STEALTHbits Technologies, on the Domino data science blog. Domino recently added support for GPU instances. To celebrate this release, I will show you how to: configure the Python library Theano to use the GPU for computation; build and train neural networks... Read more
ftfy (fixes text for you) 4.4 and 5.0
ftfy is Luminoso’s open-source Unicode-fixing library for Python. Luminoso’s biggest open-source project is ConceptNet, but we also use this blog to provide updates on our other open-source projects. And among these projects, ftfy is certainly the most widely used. It solves a problem a lot of people have with... Read more
On word embeddings – Part 1
Table of contents: A brief history of word embeddings; Word embedding models; A note on language modelling; Classic neural language model; C&W model; Word2Vec; CBOW; Skip-gram. Unsupervisedly learned word embeddings have been exceptionally successful in many NLP tasks and are frequently seen as something akin to a silver bullet.... Read more
Dropout with Theano
Almost everyone working with Deep Learning will have heard a smattering about Dropout. Albeit a simple concept (introduced a couple of years ago), which sounds like a pretty obvious way to do model averaging, in turn resulting in a more generalized and regularized Neural Net, still when you actually get into... Read more
Background – How many cats does it take to identify a Cat? In this article, I cover the 12 types of AI problems, i.e. I address the question: in which scenarios should you use Artificial Intelligence (AI)? We cover this space in the Enterprise AI course. Some background: Recently, I conducted... Read more
Learning the Monty Hall problem
As Wikipedia gives it: Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No.... Read more