R
Balancing Interpretability and Predictive Power with Cubist Models in R
Machine learning models are powerful tools that excel at prediction. In many business applications, that predictive power is quite beneficial. With any application of a machine learning model, the process of choosing a model involves determining which model performs best across a... Read more
Using Keras and TensorFlow in R
Keras and TensorFlow are two very powerful packages that are normally accessed via Python. Because the packages were developed for Python, they may seem out of reach for R users. However, this is not the case, as the Keras and TensorFlow packages may be set... Read more
What is “Tidy Data”?
I would like to write a bit on the meaning and history of the phrase “tidy data.” Hadley Wickham has been promoting the term “tidy data.” For example, in an eponymous paper, he wrote: In tidy data: Each variable... Read more
Discovering 135 Nights of Sleep with Data, Anomaly Detection, and Time Series
In this article, I look at data from 135 nights of sleep and use anomaly detection and time series data to understand the results. Three things are certain in life: death, taxes, and sleeping. Here, we’ll talk about the last. Every night*, we humans, after a long day of... Read more
Using an Embedding Matrix on Tabular Data in R
How would you represent a categorical feature with hundreds of levels in a model? A first approach may be to create a one-hot encoded matrix representing each level of the feature. The result would be a large and sparse matrix where the majority of the... Read more
ODSC West 2019 Talks and Workshops to Expand and Apply R Skills
At this point, most of us know the basics of using and deploying R—maybe you took a class on it, maybe you participated in a hackathon. That’s all important (and we have tracks for getting started with Python if you’re not there yet), but once you have those baseline... Read more
R-Related Talks Coming to ODSC West 2019
R is one of the most commonly used languages within data science, and its applications are always expanding. From the traditional use of data or predictive analysis, all the way to machine or deep learning, the uses of R will continue to grow and we’ll have to do everything we... Read more
Data-Driven Exploration of the R User Community Worldwide
Authors: Benaiah Ubah, Claudia Vitolo, and Rick Pack. R is a programming language and environment for statistical computing and data visualization. An important component of the R ecosystem is its powerful user community, which has continued to expand around the world over the years. In a previous blog post we... Read more
Timing the Same Algorithm in R, Python, and C++
While developing the RcppDynProg R package, I took a little extra time to port the core algorithm from C++ to both R and Python. This means I can time the exact same algorithm implemented nearly identically in each of these three languages, so I can extract some comparative “apples to apples” timings. Please read on for a... Read more
Where is Data Science Heading? Watching R’s Most Popular Packages May Have the Answer
Working as both a journalist and data scientist, I’m in a unique position to report on new tools of the profession as well as use them. I’m always seeking out trends surrounding the arrival of said tools because I feel they speak closely to the evolution of the field.... Read more