5 Steps to Implementing a Data Literacy-Driven DataOps Framework
DataOps is a new framework that has been gathering greater attention in the past year since it first appeared on the Gartner Hype Cycle. DataOps is defined as a new way of thinking about data that encompasses people, processes, and technology, resulting in improved collaboration and streamlined decision-making... Read more
Text Classification in Python
This article is the first of a series in which I will cover the whole process of developing a machine learning project. This one focuses on training a supervised learning text classification model in Python. The motivation behind writing these articles is the following: as a learning data scientist who has... Read more
Using Keras and TensorFlow in R
Keras and TensorFlow are two very powerful packages that are normally accessed via Python. Since the packages were developed for Python, they may seem out of reach for R users. However, this is not the case, as the Keras and TensorFlow packages may be set... Read more
What is “Tidy Data”?
I would like to write a bit on the meaning and history of the phrase “tidy data.” Hadley Wickham has been promoting the term “tidy data.” For example, in an eponymous paper, he wrote: In tidy data: Each variable... Read more
Discovering 135 Nights of Sleep with Data, Anomaly Detection, and Time Series
In this article, I look at data from 135 nights of sleep and use anomaly detection and time series data to understand the results. Three things are certain in life: death, taxes, and sleeping. Here, we’ll talk about the last. Every night*, we humans, after a long day of... Read more
Using an Embedding Matrix on Tabular Data in R
How would you tackle the prospect of representing a categorical feature with hundreds of levels in a model? A first approach may be to create a one-hot encoded matrix representing each level of the feature. The result would be a large and sparse matrix where the majority of the... Read more
ODSC West 2019 Talks and Workshops to Expand and Apply R Skills
At this point, most of us know the basics of using and deploying R—maybe you took a class on it, maybe you participated in a hackathon. That’s all important (and we have tracks for getting started with Python if you’re not there yet), but once you have those baseline... Read more
R-Related Talks Coming to ODSC West 2019
R is one of the most commonly used languages within data science, and its applications are always expanding. From the traditional use of data or predictive analysis, all the way to machine or deep learning, the uses of R will continue to grow and we’ll have to do everything we... Read more
Swift Versus Python: Common Features, Strengths, and Weaknesses
There are many popular languages, but not all of them remain popular year after year. Nevertheless, some languages don’t lose their popularity and become even more widespread. We discuss the debate around Swift versus Python. For example, PYPL statistics for 2018... Read more
How Can You Combine DevOps and Automation for Robust Security?
In this article, we will look at how organizations can leverage the potential of DevOps and automation to evolve their business. As engineering teams try to innovate at a quicker and... Read more