Keras Metrics: Everything You Need To Know
Keras metrics are functions used to evaluate the performance of your deep learning model. Choosing a good metric for your problem is usually a difficult task. You need to understand which metrics are already available in Keras and tf.keras and how to use them; in many situations you need... Read more
Causal Inference Using Synthetic Control: The Ultimate Guide
In other posts, I’ve explained what causation is and how to do causal inference using quasi-experimental designs (DID, ITS, RDD). Almost all research methods have to meet two preconditions in order to generate meaningful insights: 1. the treated group looks like the control group (similarity for comparability); 2. a... Read more
Natural Language Processing Guide: 30 Free ODSC Resources to Learn NLP
NLP, as with many other topics within data science, involves many skills, tools, languages, frameworks, and more. The ODSC Guide to Natural Language Processing is our compendium of 20 free resources for you to get started with NLP, including videos from past ODSC NLP presentations, tutorials, articles, and more.... Read more
Has Progress in America Peaked?
Measuring Physical, Personal, and National Development over the Past Century. I recently came across a compelling work of data visualization published by the New York Times. The author sought to prove a point that performance had peaked in men’s speed skating and more broadly in other Olympic sports too. [Related... Read more
What is Federated Learning?
The field of machine learning is constantly evolving, sometimes slowly, and at other times we experience the tech equivalent of the Cambrian Explosion, with rapid advances that make a good many data scientists experience a serious case of imposter syndrome. Take the case of a new iteration of machine... Read more
Call for Collaboration: Data Science and COVID-19 – Modeling and Future Assumptions
Update 3/20/2020: An initial conclusion has been made and is added at the bottom of the original article. Update 3/17/2020: This analysis & repository is ongoing. If you’d like to contribute to this open-source project, please email Alex (alex.l@odsc.com) and Ben (ben.vigoda@gamalon.com) to request access to the GitHub repository... Read more
Who Cares About Data Privacy?
In this article, which is a 7-minute read, I will: explain why I am in data privacy, share some practical tips on how I recommend managing privacy in a complex environment, and give an outlook on the legal topics which we will cover at our upcoming ODSC talk,... Read more
Simple Guide to Hyperparameter Tuning in Neural Networks
A step-by-step Jupyter notebook walkthrough on hyperparameter optimization. This is the fourth article in my series on fully connected (vanilla) neural networks. In this article, we will be optimizing a neural network and performing hyperparameter tuning in order to obtain a high-performing model on the Beale function—one of many test... Read more
Novelty in Machine Learning, or “What Gets Me Excited Every Day About Data Science”
Note: Kirk will present two training sessions at ODSC East 2020. One will focus on “Solving the Data Scientist’s Dilemma: the Cold-Start Problem with 10+ Machine Learning Examples” and the other will look at “Adapting Machine Learning Algorithms to Novel Use Cases.” I have always appreciated the unusual, unexpected,... Read more
Variational Auto-Encoders for Customer Insight
GitHub repository: VAEs-in-Economics. Neural networks are sometimes perceived as super complicated. They’re not. The most attractive application, in my opinion, of neural networks for small and medium-sized businesses, is in customer segmentation, and in my upcoming workshop at ODSC East 2020, “Variational Auto-Encoders for Customer Insight,” I will show... Read more