New Version of ggplot2
I just received a note from Hadley Wickham that a new version of ggplot2 is scheduled to be submitted to CRAN on June 25. Here’s what choroplethr users need to know about this new version of ggplot2. Choroplethr Update Required The new version of ggplot2 introduces bugs into choroplethr.... Read more
Emojis, Java and Strings
Emojis are funny characters that are becoming increasingly popular. However, they are probably not as simple as you might think when you are a programmer. For a basis of comparison, let me try to use them in Python 3. I define a string that includes emojis, and then I... Read more
Category Encoders V1.2.8 Release
Been a while since a release, but category encoders has continued to advance with the help of lots of great contributors. I’ve just released v1.2.8, with primarily bugfixes, as well as some new features allowing a user to optionally add the category names in the output column names of... Read more
rqdatatable: rquery Powered by data.table
rquery is an R package for specifying data transforms using piped Codd-style operators. It has already shown great performance on PostgreSQL and Apache Spark. rqdatatable is a new package that supplies a screaming fast implementation of the rquery system in-memory using the data.table package. rquery is already one of the fastest and most teachable (due to deliberate conformity to Codd’s influential work) tools to wrangle data on databases and... Read more
Beyond Numpy Arrays in Python: Preparing the ecosystem for GPU, distributed, and sparse arrays
Executive Summary In recent years, Python’s array computing ecosystem has grown organically to support GPUs, sparse, and distributed arrays. This is wonderful and a great example of the growth that can occur in decentralized open source development. However, to solidify this growth and apply it across the ecosystem we... Read more
Intelligently Assisted Form Fields with Henosis
Filling Out Forms Isn’t Fun Online forms are the worst. Often long and sometimes spanning multiple pages, they can be time-consuming and laborious to fill out. Almost any other task is more enjoyable, even with the occasional prize drawing or other form of incentive. While large forms can and often do... Read more
WVPlots now at version 1.0.0 on CRAN!
Nina Zumel and I have been working on packaging our favorite graphing techniques in a more reusable way that emphasizes the analysis task at hand over the steps needed to produce a good visualization. We are excited to announce that WVPlots is now at version 1.0.0 on CRAN! The idea is: we sacrifice some of... Read more
SQL on Hadoop, BigQuery, or Exadata. Please don’t call them MPP.
I often hear people referring to SQL engines running against HDFS or object storage as MPP. Strictly speaking, this is incorrect. Let me first explain what an MPP database is and then explain why engines such as Presto should not be called MPP engines. MPP In an... Read more
wrapr 1.4.1 now up on CRAN
wrapr 1.4.1 is now available on CRAN. wrapr is a really neat R package for organizing, meta-programming, and debugging R code. This update generalizes the dot-pipe feature’s dot S3 features. Please give it a try! wrapr is an R package that supplies powerful tools for writing and debugging R code. Introduction Primary wrapr services include: let() (let block), %.>% (dot arrow pipe), build_frame()/draw_frame()... Read more
A Review of Qualtrics, QuestionPro, REDCap, SurveyGizmo, & SurveyMonkey
Introduction Web-based surveys offer a quick and effective way to collect data. Several companies sell software-as-a-service which makes the construction of surveys quite easy using only a web browser. At the University of Tennessee, we currently have a system-wide site license for Qualtrics. Initial discussions suggested an intent from... Read more
Open Data Science - Your News Source for AI, Machine Learning & more