How Generative AI Has Become a Must-Have Skill
Many fail to understand that generative AI isn’t just a subfield within data science. Thanks to an expanding list of tools, it has become a must-have skill. You may be asking yourself why that’s the case. Or you may not even believe you don’t need generative... Read more
8 Trending and New Large Language Models to Keep an Eye On
We’re hearing a lot about large language models, or LLMs, in the news recently. If you don’t know, LLMs are a type of artificial intelligence trained on massive amounts of text data. This allows them to generate text that is often indistinguishable from human-written... Read more
Where Generative AI Stands in Privacy and Security Today
Generative AI is an innovative technology that excels at creating something new from a set of inputs and has taken a bold step into the world of data. It’s a tool capable of generating realistic text, producing creative artwork, or simulating real-world scenarios. Today, its role... Read more
What Can AI Teach Us About Data Centers? Part 1: Overview and Technical Considerations
Conversational artificial intelligence has been around for almost 60 years now. Its first application was developed at the Massachusetts Institute of Technology in 1966, well before the dawn of personal computers. The typical application familiar to readers is much more recent, when AI operates as chatbots,... Read more
Decision Trees From Scratch With Python
We already know a single decision tree can work surprisingly well. The idea of constructing a forest from individual trees seems like the natural next step. Today you’ll learn how the Random Forest classifier works and implement it from scratch in Python. This is the sixth of many... Read more
Area Under the Curve and Beyond with Integrated Discrimination Improvement and Net Reclassification
TL;DR: AUC is a good starting metric when comparing the performance of two models, but it does not always tell the whole story. NRI looks at the new model’s ability to correctly reclassify cancers and benigns and should be used alongside AUC. IDI quantifies improvement of the slopes of... Read more
Bayesian Customer Lifetime Values Modeling using PyMC3
Customer lifetime value (CLV) is the total worth of a customer to a company over the length of their relationship. The collective CLV of a company’s customer base reflects its economic value and is often measured to evaluate its future prospects. While many ways to estimate... Read more
A Primer on Combinations and Permutations
Combinations and permutations are common throughout mathematics and statistics, and hence are useful concepts that we data scientists should know. In this post, I want to discuss the difference between the two and how one would calculate them for some given data.... Read more
How to Compute Sentence Similarity Using BERT and Word2Vec
We often need to encode text data, including words, sentences, or documents, into high-dimensional vectors. Sentence embedding is an important step in various NLP tasks such as sentiment analysis and extractive summarization. A flexible sentence embedding library is needed to prototype fast and to tune for... Read more
7 Pitfalls to Avoid While Using Model-Agnostic Interpretation Techniques
Interpretable machine learning techniques are becoming more popular in the data science community as more and more complex machine learning algorithms that are not easily interpretable are adopted. Model-agnostic interpretation techniques do not care about the underlying models, but they have the capability to interpret the... Read more