R, as I’ve pointed out before, has a package discovery problem. There’s a new package, colorblindr, which lets you see the impact of various sorts of colour-blindness on a colour palette, a very useful thing for designing good graphics. When it’s mentioned on Twitter, you see lots... Read more
Happy, Healthy, Hungry. Mapping San Francisco Restaurant Cleanliness
Somewhat recently, Yelp announced that it is partnering with Code for America and the City of San Francisco to develop LIVES, an open data standard which allows municipalities to publish restaurant inspection data in a standardized format. This is a step towards allowing a much much... Read more
Tendencies of Data Engineers and Scientists
A long time ago I wrote a short post on the differences between data engineers and data scientists. My reasoning back then was that a data engineer is someone who applies engineering methodologies to data problems, while a data scientist is someone who applies the scientific method... Read more
This is the second post in a two-part series that discusses healthcare predictive and propensity modeling and selecting the optimal analytics partner to support your growth and engagement efforts. The first post in this series shares five best practices in healthcare propensity modeling. In our last post, we... Read more
Which customers are more likely to respond to banks’ marketing campaigns?
A quick demonstration on business consulting with data science. Audience: The intended audience for this blog post is marketers who have read the earlier post on the 5-step data science consulting framework, and are keen to learn more about the actual implementation of such projects. We will be using the caret package in... Read more
This is the first post in a series of three articles in which we will discuss tips and guidelines for successful data science implementations. This post goes over the things you should worry about before writing the first line of code. A high level data... Read more
Professor John Kelleher discusses recurrent neural networks and conversational AI
Voice translate assistants like Google Home, Siri, Alexa and other similar platforms are now commonplace. However, for the most part, these devices are limited to question-and-answer exchanges and are not conversational. The next big focus for machine translation is dialog systems that go beyond... Read more
Text Analysis in Excel: Real world use-cases
Last month, we launched an Excel add-in, a solution for using ParallelDots NLP APIs to do text analysis on unstructured data without writing a single line of code. The Excel add-in is very easy to use and provides a convenient, yet effective solution for your text analysis... Read more
This is a presentation given for Data Science DC on Tuesday November 14, 2017. PDF slides; PPTX slides. Further resources up front: A Brief Survey of Deep Reinforcement Learning (paper); Karpathy’s Pong from Pixels (blog post); Reinforcement Learning: An Introduction (textbook); David Silver’s course (videos and slides); Deep Reinforcement Learning Bootcamp (videos, slides, and labs); OpenAI gym / baselines (software)... Read more
Making a machine learning model usually takes a lot of crying, pain, feature engineering, suffering, training, debugging, validation, desperation, testing and a little bit of agony due to the infinite pain. After all that, we deploy the model and use it to make predictions for future data. We can run our... Read more