Choosing a Data Lake Format: What to Actually Look For
Recently we’ve seen lots of posts about a variety of different file formats for data lakes. There’s Delta Lake, Hudi, Iceberg, and QBeast, to name a few. It can be tough to keep track of all these data lake formats — let alone figure out why... Read more
Emily Webber of AWS on Pretraining Large Language Models
As newer fields emerge within data science and the research is still hard to grasp, sometimes it’s best to talk to the experts and pioneers of the field. Recently, we spoke with Emily Webber, Principal Machine Learning Specialist Solutions Architect at AWS. She’s the author of... Read more
Demystifying Machine Learning: Popular ML Libraries and Tools
As a senior data scientist, I often encounter aspiring data scientists eager to learn about machine learning (ML). It’s a fascinating field that can seem daunting at first, but I assure you, with the right mindset and resources, anyone can master it. In this comprehensive guide,... Read more
Why Owning Your Own LLM Model is Critical, and Within Reach
Large Language Models, or LLMs for short, have revolutionized various industries with their remarkable ability to answer questions, generate essays, and even compose lyrics. These powerful tools, such as OpenAI’s ChatGPT and Google’s Bard, have tremendous implications for sectors like financial services, retail, supply chains, and... Read more
Machine Learning Operations (MLOps) with Azure Machine Learning
Machine Learning Operations (MLOps) can significantly accelerate how data scientists and ML engineers meet organizational needs. A well-implemented MLOps process not only expedites the transition from testing to production but also offers ownership, lineage, and historical data about ML artifacts used within the team. The data... Read more
Announcing Microsoft Azure’s New Tutorial on Deep Learning and NLP
Here at ODSC, we couldn’t be more excited to announce Microsoft Azure’s tutorial series on Deep Learning and NLP, now available for free on Ai+. This course series was created by a team of experts from the Microsoft community, who have brought their knowledge and experience... Read more
Strengthening Cybersecurity with Zero Trust Network Access (ZTNA) Integration
In today’s rapidly evolving cybersecurity landscape, organizations face increasingly sophisticated threats that challenge traditional security models. To counter these risks, a paradigm shift is underway toward a more robust and effective approach called Zero Trust Network Access (ZTNA). ZTNA has gained significant importance as a proactive... Read more
Building Dependable LLM Systems with Outlines
Modern large language models (LLMs) have impressive capabilities, but they can be challenging to integrate into complex workflows and systems, leading to unreliable results and unnecessary code duplication. Outlines, created by Rémi Louf at Normal Computing, offers a solution to these problems. Outlines enables the construction... Read more
5 Ethical Considerations for Generative AI
Even before the current AI boom, ethical concerns were on the minds of data scientists. Now, with the growth of generative AI in the public imagination, these concerns have only multiplied as new issues crop up. While AI... Read more
Debug Object Detection Models with the Responsible AI Dashboard
At Microsoft Build 2023, we announced support for text and image data in the Azure Machine Learning responsible AI dashboard in preview. This blog will focus on the dashboard’s new vision insights capabilities, supporting debugging capabilities for object detection models. We’ll dive into a text-based scenario... Read more