What Statistical Test Should I Do?
Modeling | R | Tools & Languages | posted by Antoine Soetewey, May 4, 2022
Being a teaching assistant in statistics for students with diverse backgrounds, I have the chance to see what is globally not well understood by students. I have realized that it is usually not a problem for students to do a specific statistical test when they are told which... Read more
How to Install R and RStudio
R | Tools & Languages | posted by Antoine Soetewey, March 9, 2022
R is nothing more than a programming language. At the time of writing, it is one of the leading languages in statistics, although not the only programming language used by statisticians. In order to use R,... Read more
Tips and Tricks in RStudio and R Markdown
R | Tools & Languages | posted by Antoine Soetewey, January 3, 2022
If you have the chance to work with an experienced programmer, you may be amazed by how fast she can write code. In this article, I share some tips and shortcuts you can use in RStudio and R Markdown to speed up the writing of your... Read more
How to Import an Excel File in RStudio
R | Tools & Languages | Excel | posted by Antoine Soetewey, October 6, 2021
As we have seen in this article on how to install R and RStudio, R is useful for many kinds of computational tasks and statistical analyses. However, it would not be so powerful and useful without the possibility to import datasets into R. As you will most... Read more
Retrieving Webpages Through Python Programming
Modeling | R | Tools & Languages | Web Scraping | posted by ODSC Community, July 22, 2020
This article discusses retrieving web pages through Python programming. The internet, together with the World Wide Web (WWW), is probably the most prominent source of information today. Most of that information is retrievable through HTTP. HTTP was originally invented to share pages of hypertext (hence the name Hypertext Transfer Protocol),... Read more
A Data Pattern with an R data.table Solution.
Modeling | R | Tools & Languages | posted by Steve Miller, April 17, 2020
Summary: This blog examines a loading pattern seen often with government-generated, web-accessible data. The data comprise millions of records across multiple text or csv files, generally demarcated by time. The files may present different, but overlapping, attributes, while much of the data has a character representation,... Read more
Data Manipulation in R
R | Tools & Languages | data manipulation | posted by Antoine Soetewey, April 17, 2020
Not all datasets are as clean and tidy as you would expect. Therefore, after importing your dataset into RStudio, most of the time you will need to prepare it before performing any statistical analyses. Data manipulation can even sometimes take longer than the actual analyses when the... Read more
Guide to R and Python in a Single Jupyter Notebook
Python | R | Tools & Languages | jupyter | posted by Matthew Stewart, March 6, 2020
Why pick one when you can use both at the same time? R is primarily used for statistical analysis, while Python provides a more general approach to data science. Both R and Python are programming languages oriented towards data science. Learning both is an ideal solution.... Read more
PI and Simulation Art in R
R | Tools & Languages | Uncategorized | posted by Steve Miller, February 25, 2020
I spent the better part of an afternoon last week perusing a set of old flash drives I’d made years ago for my monthly notebook backups. One that especially caught my attention had a folder of R scripts, probably at least 15 years old — harking... Read more
Beginner’s Guide to K-Nearest Neighbors in R: from Zero to Hero
Machine Learning | Modeling | R | Tools & Languages | k-nearest neighbors | posted by Leihua Ye, February 10, 2020
In the world of Machine Learning, I find that the K-Nearest Neighbors (KNN) classifier makes the most intuitive sense and is easily accessible to beginners, even without introducing any mathematical notation. To decide the label of an observation, we look at its neighbors and assign the neighbors’ label... Read more