New Version of ggplot2
I just received a note from Hadley Wickham that a new version of ggplot2 is scheduled to be submitted to CRAN on June 25. Here’s what choroplethr users need to know about this new version of ggplot2. Choroplethr Update Required: The new version of ggplot2 introduces bugs into choroplethr.... Read more
rqdatatable: rquery Powered by data.table
rquery is an R package for specifying data transforms using piped Codd-style operators. It has already shown great performance on PostgreSQL and Apache Spark. rqdatatable is a new package that supplies a screaming fast implementation of the rquery system in-memory using the data.table package. rquery is already one of the fastest and most teachable (due to deliberate conformity to Codd’s influential work) tools to wrangle data on databases and... Read more
WVPlots now at version 1.0.0 on CRAN!
Nina Zumel and I have been working on packaging our favorite graphing techniques in a more reusable way that emphasizes the analysis task at hand over the steps needed to produce a good visualization. We are excited to announce that WVPlots is now at version 1.0.0 on CRAN! The idea is: we sacrifice some of... Read more
wrapr 1.4.1 now up on CRAN
wrapr 1.4.1 is now available on CRAN. wrapr is a really neat R package for organizing, meta-programming, and debugging R code. This update generalizes the dot-pipe feature’s dot S3 features. Please give it a try! wrapr is an R package that supplies powerful tools for writing and debugging R code. Introduction: Primary wrapr services include: let() (let block), %.>% (dot arrow pipe), build_frame()/draw_frame()... Read more
Exploratory Data Analysis and Data Preparation with ‘funModeling’
funModeling quick-start: This package contains a set of functions related to exploratory data analysis, data preparation, and model performance. It is used by people coming from business, research, and teaching (professors and students). funModeling is intimately related to the Data Science Live Book -Open Source- (2017) in the sense that most of... Read more


R, Tools & Languages. Posted by Thomas Lumley, May 16, 2018

I’m working on an R package for mixed models under complex sampling.  It’s here. At the moment, it only tries to fit two-level linear mixed models to two-stage samples – for example, if you sample schools then students within schools, and want a model with school-level random effects. Also, it’s... Read more
R Tip: Use Slices
R tip: use slices. R has a very powerful array slicing ability that allows for some very slick data processing. Suppose we have a data.frame “d”, and for every row where d$n_observations < 5 we wish to “NA-out” some other columns (mark them as not yet reliably available). Using slicing techniques this can be done quite quickly... Read more
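The slicing idea in this teaser can be sketched in base R as follows. The excerpt is truncated, so this is only a minimal illustration under stated assumptions: the example data and the column names `mean_value` and `sd_value` are hypothetical; only `d` and `n_observations` come from the excerpt.

```r
# Hypothetical example data: mean_value and sd_value are assumed column names,
# not taken from the original post.
d <- data.frame(
  n_observations = c(2, 10, 4, 7),
  mean_value     = c(1.5, 2.5, 3.5, 4.5),
  sd_value       = c(0.1, 0.2, 0.3, 0.4)
)

# Logical row selector: TRUE where there are too few observations.
unreliable <- d$n_observations < 5

# Slice by rows and columns at once, and assign NA into the selected cells.
d[unreliable, c("mean_value", "sd_value")] <- NA

print(d)
```

A single indexed assignment handles all selected rows and columns together, so no explicit loop over rows is needed.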
Programming with Futures in R
This blog post is a deep dive into the future package in R. Futures are really useful when you want to kick off multiple jobs in parallel, or have long-running tasks run in the background. Another great use for futures is to make Shiny apps more responsive (like with the promises package).... Read more
magrittr and wrapr Pipes in R, an Examination
Let’s consider piping in R both using the magrittr package and using the wrapr package. magrittr pipelines: The magrittr pipe glyph “%>%” is the most popular piping symbol in R. The magrittr documentation describes %>% as follows. Basic piping: x %>% f is equivalent to f(x); x %>% f(y) is equivalent to f(x, y); x %>% f %>% g %>% h is equivalent to h(g(f(x))). The argument placeholder: x %>%... Read more
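The three basic piping equivalences quoted in this teaser can be checked directly. The sketch below assumes the magrittr package is installed; the sample vector and the choice of `sqrt`, `round`, and `sum` are illustrative only and do not come from the original post.

```r
library(magrittr)

# Illustrative input: not from the original post.
x <- c(1, 4, 9, 16.4)

# x %>% f is equivalent to f(x)
a_piped  <- x %>% sqrt
a_direct <- sqrt(x)

# x %>% f(y) is equivalent to f(x, y)
b_piped  <- x %>% round(0)
b_direct <- round(x, 0)

# x %>% f %>% g %>% h is equivalent to h(g(f(x)))
c_piped  <- x %>% sqrt %>% sum %>% round
c_direct <- round(sum(sqrt(x)))

# Each pair should be identical.
print(identical(a_piped, a_direct))
print(identical(b_piped, b_direct))
print(identical(c_piped, c_direct))
```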
R Tip: Use match_order() to Align Data
R tip. Use wrapr::match_order() to align data. Suppose we have data in two data frames, and both of these data frames have common row-identifying columns called “idx”. library("wrapr") d1 <- build_frame( "idx", "x" | 3 , "a" | 1 , "b" | 2 , "c" ) d2 <- build_frame( "idx", "y" |... Read more
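The excerpt cuts off before showing the alignment step, so the sketch below does not reproduce the wrapr call. It illustrates the same row-alignment idea with base R's `match()` as a stand-in. The contents of `d2` are hypothetical, since the original values are truncated in the excerpt; `d1` follows the values shown above.

```r
# d1 mirrors the values shown in the excerpt.
d1 <- data.frame(idx = c(3, 1, 2), x = c("a", "b", "c"))

# d2 is hypothetical: its real contents are truncated in the excerpt.
d2 <- data.frame(idx = c(2, 3, 1), y = c("p", "q", "r"))

# For each key in d1, find the row of d2 holding the same key.
# This is a base R stand-in, not the wrapr::match_order() implementation.
p <- match(d1$idx, d2$idx)

# Reorder d2 so its rows line up with d1 row for row.
d2_aligned <- d2[p, , drop = FALSE]

print(cbind(d1, y = d2_aligned$y))
```

This assumes each `idx` value appears exactly once in both data frames; with missing or duplicated keys, `match()` returns `NA` or only the first match, so the alignment should be checked before it is used.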