Factors in R

R, Tools & Languages. Posted by Steve Miller, August 9, 2019

The factor is a foundational data type in R. Factors are generally used to represent categorical variables, which may be intrinsically unordered (nominal) or ordered (ordinal). While the underlying data is often character, factors can be built on numerics as well. Factor variables are stored as integers pointing to unique values of underlying... Read more
Deep Learning in R with Keras
The primary professional hat I wear is as a data science consultant working with machine learning in a variety of problem domains. Due to my academic past in computer science and applied statistics, my development environment of choice today is typically R. Lately, however, Python is taking the lead... Read more
Jupyter Notebook: Python or R—Or Both?
I was analytically betwixt and between a few weeks ago. Most of my Jupyter Notebook work is done in either Python or R. Indeed, I like to self-demonstrate the power of each platform by recoding R work in Python and vice-versa. I must have a dozen active notebooks, some... Read more
Validating Type I and II Errors in A/B Tests in R
In the work below, we will intentionally leave out statistics theory and attempt to develop an intuitive sense of what type I (false-positive) and type II (false-negative) errors represent when comparing metrics in A/B tests. One of the problems plaguing the analysis of A/B tests today is known as the “peeking... Read more
Introduction to R Shiny
Alyssa is a speaker for ODSC East 2019 this April 30 to May 3! Attend her talk “Data Visualization with R Shiny.” What is R Shiny? Shiny is an R package that enables you to build interactive web apps using both the statistical power of R and the interactivity... Read more
Activities and Insights for R: Monthly Summary of Selected Trends – December 2018
In December, activity across the R ecosystem fell from the levels observed in November. This was notable on Stack Overflow, in meetup events, and in the downloads of R packages. The December holidays likely caused this general reduction in activity. However, the first two weeks in December saw great activity in meetup... Read more
Monthly Summary of Selected Trends, Activities, and Insights for R – November 2018
In November, activities continued to increase beyond the numbers recorded since July across the R ecosystem. This was most notable in events and in the downloads of R packages. Total package downloads from a single CRAN mirror in a single year hit half a billion this November for the first... Read more
Monthly Summary of Selected R Trends, Activities and Insights – October 2018
In October, the spike in activities observed in September across the R ecosystem was maintained. In the following article, a summary of selected R trends, activities, and insights in October 2018 is presented as the R language keeps trending. Data for the trends and activities summarized here were obtained... Read more
How Tidyverse Guides R Programmers Through Data Science Workflows
Whenever someone asks me how to get into data science using R, I invariably recommend checking out the tidyverse package. Tidyverse is a great launch pad for a language like R because it offers order and consistency. I studied programming language design as a CS undergrad. At the time,... Read more
Build a Multi-Class Support Vector Machine in R
Support Vector Machines (SVMs) are quite popular in the data science community. Data scientists often use SVMs for classification tasks, and they tend to perform well in a variety of problem domains. An SVM performs classification tasks by constructing hyperplanes in a multidimensional space that separate cases of different... Read more