R
Exploratory Data Analysis and Data Preparation with ‘funModeling’
funModeling quick-start This package contains a set of functions related to exploratory data analysis, data preparation, and model performance. It is used by people coming from business, research, and teaching (professors and students). funModeling is intimately related to the Data Science Live Book -Open Source- (2017) in the sense that most of... Read more
svylme

R, Tools & Languages, posted by Thomas Lumley, May 16, 2018

I’m working on an R package for mixed models under complex sampling.  It’s here. At the moment, it only tries to fit two-level linear mixed models to two-stage samples – for example, if you sample schools then students within schools, and want a model with school-level random effects. Also, it’s... Read more
R Tip: Use Slices
R tip: use slices. R has a very powerful array slicing ability that allows for some very slick data processing. Suppose we have a data.frame “d”, and for every row where d$n_observations < 5 we wish to “NA-out” some other columns (mark them as not yet reliably available). Using slicing techniques this can be done quite quickly... Read more
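A minimal sketch of the slicing idea described in this excerpt, using base R only. The data values and the column names `mean_value` and `sd_value` are hypothetical, chosen for illustration; only `d` and `n_observations` come from the excerpt.

```r
# Hypothetical example data: only `d` and `n_observations` appear in the excerpt.
d <- data.frame(
  n_observations = c(2, 10, 3, 7),
  mean_value     = c(1.5, 2.5, 3.5, 4.5),
  sd_value       = c(0.1, 0.2, 0.3, 0.4)
)

# Logical row index: TRUE where there are too few observations.
unreliable <- d$n_observations < 5

# Slice by rows and columns at once, and assign NA to the selected cells.
d[unreliable, c("mean_value", "sd_value")] <- NA

print(d)
```

The row selector and the column selector are combined in a single indexed assignment, so no explicit loop over rows or columns is needed.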
Programming with Futures in R
This blog post is a deep dive into the future package in R. Futures are really useful when you want to kick off multiple jobs in parallel, or have long-running tasks run in the background. Another great use for futures is to make Shiny apps more responsive (like with the promises package).... Read more
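As a rough illustration of kicking off jobs in parallel with the future package, the sketch below starts two futures and then collects their values. The function `slow_square` and the choice of two workers are assumptions made for this example and are not taken from the post.

```r
library(future)

# Run futures in background R sessions (two workers is an arbitrary choice).
plan(multisession, workers = 2)

# Hypothetical long-running task, used only for illustration.
slow_square <- function(x) {
  Sys.sleep(0.5)
  x^2
}

# Creating a future starts the work without blocking the main session.
f1 <- future(slow_square(3))
f2 <- future(slow_square(4))

# value() blocks until the corresponding result is available.
results <- c(value(f1), value(f2))
print(results)

# Return to ordinary sequential evaluation and shut down the workers.
plan(sequential)
```

Both tasks are launched before either value is requested, so the two sleeps can overlap instead of running back to back.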
magrittr and wrapr Pipes in R, an Examination
Let’s consider piping in R both using the magrittr package and using the wrapr package. magrittr pipelines: The magrittr pipe glyph “%>%” is the most popular piping symbol in R. magrittr documentation describes %>% as follows. Basic piping: x %>% f is equivalent to f(x); x %>% f(y) is equivalent to f(x, y); x %>% f %>% g %>% h is equivalent to h(g(f(x))). The argument placeholder x %>%... Read more
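The three basic piping equivalences quoted in this excerpt can be checked directly. This is a small sketch with made-up input values; it covers only the basic cases and not the argument placeholder, which the excerpt cuts off.

```r
library(magrittr)

# Hypothetical input values, chosen only for illustration.
x <- c(1.234, 4.567, 9.891)

# x %>% f is equivalent to f(x)
piped_sqrt  <- x %>% sqrt
direct_sqrt <- sqrt(x)

# x %>% f(y) is equivalent to f(x, y)
piped_round  <- x %>% round(1)
direct_round <- round(x, 1)

# x %>% f %>% g %>% h is equivalent to h(g(f(x)))
piped_chain  <- x %>% sqrt %>% sum %>% round
direct_chain <- round(sum(sqrt(x)))

# Each piped form gives exactly the same result as the nested call.
stopifnot(
  identical(piped_sqrt, direct_sqrt),
  identical(piped_round, direct_round),
  identical(piped_chain, direct_chain)
)
```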
R Tip: Use match_order() to Align Data
R tip. Use wrapr::match_order() to align data. Suppose we have data in two data frames, and both of these data frames have common row-identifying columns called “idx”. library("wrapr") d1 <- build_frame( "idx", "x" | 3 , "a" | 1 , "b" | 2 , "c" ) d2 <- build_frame( "idx", "y" |... Read more
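The general idea of aligning two data frames on a shared id column can be sketched in base R with `match()`. Note that this swaps in base R rather than showing `wrapr::match_order()` itself, and the data frames `left` and `right` and their values are hypothetical, not the ones from the truncated example above.

```r
# Hypothetical data: two frames that share the same ids in different orders.
left  <- data.frame(idx = c(3, 1, 2), x = c("a", "b", "c"))
right <- data.frame(idx = c(2, 1, 3), y = c("p", "q", "r"))

# For each id in `left`, find the row position of that id in `right`.
p <- match(left$idx, right$idx)

# Reorder `right` so its rows line up with the rows of `left`.
right_aligned <- right[p, , drop = FALSE]

# With the rows aligned, the columns can be combined side by side.
stopifnot(all(left$idx == right_aligned$idx))
combined <- data.frame(idx = left$idx, x = left$x, y = right_aligned$y)
print(combined)
```

This assumes every id appears exactly once in each frame; with missing or duplicated ids, `match()` would return `NA` or only the first match.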
Laminar flow with ggplot2 and gganimate
Preface: I’ve realized that all my previous posts were quite substantial in length and took quite a long time to create. From this point forward I’ll be generating posts of shorter length (partially for my sanity and more for my impulsivity with ideas). A few of these posts won’t be... Read more
R Tip: Use let() to Re-Map Names
Another R tip. Need to replace a name in some R code or make R code re-usable? Use wrapr::let(). Here is an example involving dplyr. Let’s look at some example data: library("dplyr") library("wrapr") starwars %>% select(., name, homeworld, species) %>% head(.) # # A tibble: 6 x 3 # name homeworld species #... Read more
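A small sketch of name re-mapping with `wrapr::let()`, kept separate from the dplyr example that the excerpt truncates. The data frame `d`, the placeholder `COL`, and the variable `column_to_sum` are hypothetical names introduced for illustration.

```r
library(wrapr)

# Hypothetical data, used only for illustration.
d <- data.frame(score = c(1, 2, 3), weight = c(10, 20, 30))

# The column to use is held in a variable, as it might be inside a reusable function.
column_to_sum <- "weight"

# let() replaces the placeholder name COL with the name stored in
# column_to_sum before evaluating the expression, so d$COL runs as d$weight.
total <- let(
  c(COL = column_to_sum),
  sum(d$COL)
)

print(total)
```

Changing `column_to_sum` to `"score"` would make the same code sum the other column, which is the sense in which the code becomes re-usable.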
Using Excel for Data Entry
This article shows you how to enter data so that you can easily open it in statistics packages such as R, SAS, SPSS, or jamovi (code or GUI steps below). Excel has some statistical analysis capabilities, but they often provide incorrect answers. For a comprehensive list of these limitations, see http://www.forecastingprinciples.com/paperpdf/McCullough.pdf and http://www.burns-stat.com/documents/tutorials/spreadsheet-addiction. Simple Data... Read more