Data Ops: Running ML Models in Production the Right Way
Editor’s note: Check out Ido’s talk at ODSC East 2019 this May, “From Zero to Airflow: Bootstrapping Into a Best-in-Class Risk Analytics Platform.” The tipping point: Many organizations reach a point at which new goals for SLA, scale, or efficiency simply exceed the capabilities of their existing data / ML... Read more
7 Tips for Visual Search at Scale
Visual search is a rapidly emerging trend that is ideal for retail segments, such as fashion and home design, because they are largely driven by visual content, and style is often difficult to describe using text search alone. It is a topic of interest to data scientists, as evidenced... Read more
Monthly Summary of Selected Trends, Activities and Insights for R – December 2018
In December, activity across the R ecosystem declined from the levels observed in November. The decline was notable on Stack Overflow, in meetup events, and in downloads of R packages. The December holidays likely caused this general reduction. However, the first two weeks of December saw great activity in meetup... Read more
The 5 Biggest Debates in Data Science Today
The meteoric rise of data science in recent years is not without controversy. There are a number of ongoing debates in the discipline that have gone unresolved for quite some time. The short list below contains the most common debates I routinely see discussed online, at conferences, and even... Read more
The Beginner’s Guide for Video Processing with OpenCV
Computer vision is a huge part of the data science/AI domain. Sometimes, computer vision engineers have to deal with videos. Here, we aim to shed light on video processing – using Python, of course. This might be obvious for some, but nevertheless, video streaming is not a continuous process,... Read more
Which Conference is Best? — College Hoops, Net Rankings and Python
For college basketball junkies like me, the season is now shifting into high gear as teams begin serious conference play. At the end of the regular season and conference tournaments, 66 D1 teams — 32 league champions and 34 at large — will receive invitations to March’s national championship... Read more
6 Reasons Why Data Science Projects Fail
In this article, I will dig deep into my years of experience as a tech journalist and practicing data scientist and reflect on numerous conversations I’ve had with companies about their data science projects in order to identify what I’ve seen as the top reasons why many projects fail.... Read more
Monthly Summary of Selected Trends, Activities, and Insights for R – November 2018
In November, activity across the R ecosystem continued to increase beyond the numbers recorded since July. This was most notable in events and in downloads of R packages. Total package downloads from a single CRAN mirror in a single year hit half a billion this November for the first... Read more
The Data Scientist’s Holy Grail – Labeled Data Sets
The Holy Grail for data scientists is the ability to obtain labeled data sets for the purpose of training a supervised machine learning algorithm. An algorithm’s ability to “learn” is based on training it using a labeled training set – having known response variable values that correspond to a... Read more
A Practical Approach to Data Ethics
There is a Golden Rule in life. It’s a maxim that appears in various forms around the world: One should never do that to another which one regards as injurious to one’s own self. As a data scientist, I find this principle of reciprocity very appealing! Treat others’ data... Read more