Python or R—Or Both?
I was analytically betwixt and between a few weeks ago. Most of my Jupyter Notebook work is done in either Python or R. Indeed, I like to self-demonstrate the power of each platform by recoding R work in Python and vice versa. I must have a dozen active notebooks, some... Read more
Strategies for Addressing Class Imbalance
Class imbalance is common in real-world datasets. For example, a dataset with examples of credit card fraud will often have far more records of non-fraudulent activity than of fraudulent cases. In many applications, training your model on imbalanced classes can inhibit model functionality if predictive accuracy for minority... Read more
Validating Type I and II Errors in A/B Tests in R
In the work below, we will intentionally leave out statistical theory and attempt to develop an intuitive sense of what type I (false-positive) and type II (false-negative) errors represent when comparing metrics in A/B tests. One of the problems plaguing the analysis of A/B tests today is known as the “peeking... Read more
Data Science + Design Thinking: a Perfect Blend to Achieve the Best User Experience
It’s one thing to rely on artificial intelligence, machine learning, and big data to make your product smarter. It’s quite another to build a product that’s so intuitive and easy to use that your customer falls in love with it. That’s the beauty of data science + design thinking. It’s... Read more
The Benefits of Cloud Native ML And AI
As big data gets more complex, companies are struggling to accommodate the storage and computing needs of average organizations, much less massive enterprises. This is where cloud-native ML and AI come into play. What Does Cloud Native Mean? Your computing power is limited. No matter what kind of hardware... Read more
Logistic Regression with Python
Logistic regression was once the most popular machine learning algorithm, but the advent of more accurate algorithms for classification such as support vector machines, random forest, and neural networks has induced some machine learning engineers to view logistic regression as obsolete. Though it may have been overshadowed by more... Read more
Creating Multiple Visualizations in a Single Python Notebook
For a data scientist without an eye for design, creating visualizations from scratch might be a difficult task. But as is the case with most problems, a solution awaits thanks to Python. Those drawn to using Python for data analysis have been spoiled, as more advanced libraries have made... Read more
What is TensorFlow?
It would be a challenge nowadays to find a machine learning engineer who has heard nothing about TensorFlow. Initially created by the Google Brain team for internal purposes, such as spam filtering on Gmail, it was open-sourced in 2015 and became the most popular deep learning framework in the... Read more
5 Mistakes You’re Making With DataOps
Data is the driver for just about every modern business, and as companies consume more data more intelligently, there’s a need for a better community and higher buy-in. DataOps stands to do to data what DevOps did to development. ... Read more
10 Best Data Science Platforms
A data science platform can change the way you work. It’s more than just a tool; it’s a way to wrangle data and turn every member of your team into a high-performing unit, capable of pivoting and scaling without missing a beat. The right one is transformative to... Read more