Practical Naive Bayes — Classification of Amazon Reviews
If you search the internet for how to apply Naive Bayes classification to text, you’ll find a ton of articles that talk about the intuition behind the algorithm, maybe some lecture slides about the math and notation behind it, and a bunch of articles I’m... Read more
In my last post, I did some natural language processing and sentiment analysis for Jane Austen’s most well-known novel, Pride and Prejudice. It was just so much fun that I wanted to extend some of that work and compare across her body of writing. I decided... Read more
Our research in 2016: personal scientific highlights
The year 2016 was productive for science in my team. Here are some personal highlights: bridging artificial intelligence tools to human cognition, markers of neuropsychiatric conditions from brain activity at rest, algorithmic speedups for matrix factorization on huge datasets… Artificial-intelligence convolutional networks map well the human visual... Read more
How hard can it be to compute a conversion rate? Take the total number of users who converted and divide it by the total number of users. Done. Except… it’s a lot more complicated when you have any sort of significant time lag. Prelude: a story. Fresh out... Read more
Introduction: In this post, we’ll summarize many of the new and important developments in the field of computer vision and convolutional neural networks. We’ll look at some of the most important papers that have been... Read more
Predicting Resignation in the Military
In the 2015 hackathon organized by Singapore’s Ministry of Defence, one of the tasks was to predict resignation rates in the military, using anonymized data on 23,000 personnel which included their age, military rank, years in service, as well as performance indicators such as salary increments and... Read more
36 Questions to Ask Your Chatbot
Teaching the Art of Conversation to Chatbots Sure, you’ve asked Siri for a nearby restaurant. Maybe you’ve had M make a reservation for you. You may have even kicked back a beer with Untappd. But how well do you really know the bots in your life? Every day,... Read more
A Stochastic Gradient Descent Implementation in Clojure
Description of the problem: Gradient Descent is an algorithm that finds local extremum points of a real-valued function of several variables. As such, it is a go-to algorithm for many optimization problems that appear in the context of machine learning. I wrote an implementation optimizing Linear Regression and Logistic Regression cost functions... Read more
Machine Learning: An In-Depth Guide – Data Selection, Preparation, and Modeling
Articles: Overview, goals, learning types, and algorithms; Data selection, preparation, and modeling; Model evaluation, validation, complexity, and improvement; Model performance and error analysis; Unsupervised learning, related fields, and machine learning in practice. Introduction: Welcome to the second article in a five-part series about machine learning. In... Read more
Seven Python Kernels from Kaggle You Need to See Right Now
The ability to post and share kernels is probably my favorite thing about Kaggle. Learning from other users’ kernels has often provided inspiration for a number of my own projects. I also appreciate the attention to detail and descriptions provided by some users in their code... Read more