Organizational Processes for Machine Learning Risk Management
In our ongoing series on Machine Learning Risk Management, we’ve embarked on a journey to unravel the critical elements that ensure the trustworthiness of Machine Learning (ML) systems. In our first installment, we delved into “Cultural Competencies for Machine Learning Risk Management,” exploring the human dimensions... Read more
RAG vs Finetuning — Which is the Best Tool to Boost Your LLM Application?
As the wave of interest in Large Language Models (LLMs) surges, many developers and organisations are busy building applications that harness their power. However, when pre-trained LLMs don’t perform as expected or hoped out of the box, the question of how to improve the performance of... Read more
Conditional Probability and Bayes’ Theorem Simply Explained
Conditional probability and Bayes’ theorem are fundamental ideas in statistics that even laymen have heard of. Bayes’ theorem also gives rise to a separate branch of statistics, namely Bayesian inference. In data science we mainly work in a frequentist world, and so we are, in my opinion,... Read more
Evaluating Clustering in Machine Learning
Clustering has always been one of those topics that garnered my attention. Especially when I was first getting into the whole sphere of machine learning, unsupervised clustering always carried an allure with it for me. To put it simply, clustering is rather like the unsung knight... Read more
Python Timestamp: Converting and Formatting Essentials for Beginners
Do you know the number one thing junior and senior developers have in common? Neither knows how to work with dates without referencing the manual. It’s just one thing that’s too difficult to remember for some reason. Well, not anymore! Python Timestamp plays... Read more
Data Engineering vs Machine Learning Pipelines
Data engineering and machine learning pipelines are very different, yet oddly they can feel very similar. Many ML engineers I have talked to rely on tools like Airflow to deploy their batch models. So I wanted to discuss the difference between data engineering... Read more
How to Build a 5-Layer Data Stack
Like bean dip and ogres, layers are the building blocks of the modern data stack. Its powerful selection of tooling components combines to create a single synchronized and extensible data platform, with each layer serving a unique function in the data pipeline. Unlike ogres, however, the cloud... Read more
How to Add Domain-Specific Knowledge to an LLM Based on Your Data
In recent months, Large Language Models (LLMs) have profoundly changed the way we work and interact with technology, and have proven to be helpful tools in various domains, serving as writing assistants, code generators, and even creative collaborators. Their ability to understand context, generate human-like text,... Read more
Top 10 Errors in R and How to Fix Them
If you are just starting with R, you will often encounter errors in your code that prevent it from running. I remember when I was just starting to use R, errors in my code were so frequent that I almost gave up learning this programming language.... Read more
An Overview of Meta’s Llama 2 Model: What’s New?
Over the last few months, Meta’s Llama 2 has been making the rounds in the data science community and has so far shown why it has become a big deal. Not only has it pushed the envelope when it comes to LLMs, but... Read more