SIR model with deSolve & ggplot2
This is my first post ever, and in 2017! Since I am a recent graduate and currently unemployed, my hope is to upload some interesting material on using R on a weekly basis, making this an informative and motivational blog to share my interests and mini-projects in R. Now on to this... Read more
Group-By Modeling in R Made Easy
There are several aspects of the R language that make it hard to learn, and repeating a model for groups in a data set used to be one of them. Here I briefly describe R’s built-in approach, show a much easier one, then refer you to a new approach described... Read more
On Machine Learning and Programming Languages
This article was co-written by Mike Innes (Julia Computing), David Barber (UCL), Tim Besard (UGent), James Bradbury (Salesforce Research), Valentin Churavy (MIT), Simon Danisch (MIT), Alan Edelman (MIT), Stefan Karpinski (Julia Computing), Jon Malmaud (MIT), Jarrett Revels (MIT), Viral Shah (Julia Computing), Pontus Stenetorp (UCL) and Deniz Yuret (Koç... Read more
Word2Vec – the world of word vectors
Have you ever wondered how a chatbot can learn about the meaning of words in a text? Does this sound interesting? Well, in this blog we will describe a very powerful method, Word2Vec, that maps words to numbers (vectors) in order to easily capture and distinguish their meaning. We will briefly describe how Word2Vec works without going... Read more
Exploratory Data Analysis of Tropical Storms in R
The disastrous impact of recent hurricanes, Harvey and Irma, generated a large influx of data within the online community. I was curious about the history of hurricanes and tropical storms so I found a data set on data.world and started some basic Exploratory... Read more
An example of web scraping with R: Online Food Blogs
In this blog post I will discuss web scraping using R. As an example, I will consider scraping data from online food blogs to construct a data set of recipes. This data set contains ingredients, a short description, nutritional information and user ratings. Then, I will provide a simple... Read more
Custom Level Coding in vtreat
One of the services that the R package vtreat provides is level coding (what we sometimes call impact coding): converting the levels of a categorical variable to a meaningful and concise single numeric variable, rather than coding them as indicator variables (AKA “one-hot encoding”). Level coding can be computationally and statistically preferable to one-hot encoding for... Read more
How to Perform the Principal Component Analysis in R
Implementing Principal Component Analysis (PCA) in R. "Give me six hours to chop down a tree and I will spend the first four sharpening the axe." (Abraham Lincoln) The above Abraham Lincoln quote is highly relevant to machine learning too. When it comes to modeling different... Read more
Seeking Guidance in Choosing and Evaluating R Packages
At useR!2017 in Brussels last month, I contributed to an organized session focused on navigating the 11,000+ packages on CRAN. My collaborators on this session and I recently put together an overall summary of the session and our goals, and now I’d like to talk more about the specific issue of learning... Read more
matmul() is eating software
Last week Zak Stone from Google Brain gave a talk at South Park Commons where he wove together a bunch of threads that are shaping future machine learning progress: TensorFlow, XLA, Cloud TPUs, TFX, and TensorFlow Lite; he also hinted at even more exciting stuff not quite ready for public consumption. (Fun... Read more