Choroplethr v3.6.0 is now on CRAN
Choroplethr version 3.6.0 is now on CRAN. This version adds functionality for getting and mapping demographics of US Census Tracts. You can install it from the R console as follows: install.packages("choroplethr") packageVersion("choroplethr") ‘3.6.0’ To use this functionality you will need an API key from the... Read more
Exploring the Relationship between Religion and Demographics in R
Today’s guest post is by Julia Silge. Take a look at her work in “Mapping US Religion Adherence by County in R”, where she demonstrated how to work with US religion adherence data in R. In this post she explores the relationship between that dataset and US demographic data. I... Read more
Fixing an infelicity in ‘leaps’
The ‘leaps’ package for R is ancient – this is its twentieth year on CRAN. It uses old Fortran code by the Australian computational statistician Alan Miller. The Fortran 90 versions are on the web, but Fortran 90 compilation with R wasn’t portable back then, so I used the older... Read more
shinyHeatmaply – a shiny app for creating interactive cluster heatmaps
My friend Jonathan Sidi and I (Tal Galili) are pleased to announce the release of shinyHeatmaply (0.1.0): a new Shiny application (and Shiny gadget) for creating interactive cluster heatmaps. shinyHeatmaply is based on the heatmaply R package, which strives to make it as easy as possible to create interactive cluster... Read more
Useful Functions in R
I have listed some useful functions below: with() The with() function applies an expression to a dataset. It is similar to DATA= in SAS. # with(data, expression) # example applying a t-test to a data frame mydata with(mydata, t.test(y ~ group)) Please look at other examples here and... Read more
ftfy (fixes text for you) 4.4 and 5.0
ftfy is Luminoso’s open-source Unicode-fixing library for Python. Luminoso’s biggest open-source project is ConceptNet, but we also use this blog to provide updates on our other open-source projects. And among these projects, ftfy is certainly the most widely used. It solves a problem a lot of people have with... Read more
xda: R package for exploratory data analysis
This package contains several tools to perform initial exploratory analysis on any input dataset. It includes custom functions for plotting the data as well as performing different kinds of analyses, such as univariate, bivariate and multivariate investigation, which is the first step of any predictive modeling pipeline. This package... Read more
Within soccer’s nascent analytics movement, one metric dominates most discussions. It’s called Expected Goals or xG. Models for calculating xG differ, but the underlying concept is the same. In a nutshell, xG takes a shot’s characteristics – distance from goal, angle from goal, root cause, etc. – and assigns... Read more
Over time, Python and R have established themselves as the leading languages for Data Science. The rise of both has not been frictionless, though, as the two communities have ‘clashed’ over philosophical differences as each side recruits Data Science newcomers. R users will recommend that R is the better... Read more
Data science is an interdisciplinary endeavor, and it serves the purpose of extracting insight from varying sources of information. Various communities come together at Data Science Conferences to share their knowledge and promote innovation. It is not surprising, then, that the tools showcased by data scientists at ODSC East... Read more