ODSC’s AI Weekly Recap: Week of March 1st

Open Data Science Blog Recap

Paris-based Mistral AI is emerging as a formidable challenger to industry giants like OpenAI and Anthropic. (Source)

Texas A&M has joined the Artificial Intelligence Safety Institute Consortium (AISIC), focusing on AI safety and reliability. (Source)

The Swiss National Science Foundation (SNSF) has set out its position on the use of artificial intelligence technologies by researchers seeking its funding. (Source)

Google CEO Sundar Pichai told employees in a note that Gemini was producing “biased” and “completely unacceptable” responses, and the company has pledged to fix the model. (Source)

OpenAI sought dismissal of a lawsuit filed by the New York Times, arguing that its generative AI system, ChatGPT, was manipulated to produce the results that allegedly infringe copyrighted material. (Source)

AI News Highlights

The SEC is probing whether OpenAI investors were misled and, as part of the investigation, is scrutinizing internal communications by OpenAI CEO Sam Altman.

VCs Are Betting Robots Will Help Build Your Next Home. Over the past few years, investors have poured hundreds of millions into startups at the intersection of construction and robotics. The startup funding comes amid a period of widespread construction labor shortages alongside rising building costs.

X-energy, a startup developing small modular reactors (SMRs), sees an attractive market for energy as the artificial intelligence boom drives demand for new sources of power.

AI has already transformed business in a way not seen since the industrial revolution, and some worry about its impact.

Though AI has brought tremendous value to business, it has also created a series of new paradoxes, according to Forbes.

The banking sector is expected to see a great deal of change to its day-to-day work, according to a new study.

In a shift toward AI, Apple shutters its electric car program.

After the launch of OpenAI’s Sora, can Hollywood adapt to the artificial intelligence ‘matrix’?

AI is now challenging how architects design data centers to handle more power and cooling demands.

In-Person and Virtual Conference

April 23rd to 25th, 2024

Join us for a deep dive into the latest data science and AI trends, tools, and techniques, from LLMs to data analytics and from machine learning to responsible AI.


Trending AI Open Source Projects

  • minbpe is an open-source project from famed researcher Andrej Karpathy that provides minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. The repository includes two tokenizers, both of which can train the tokenizer vocabulary and merges on a given text.
  • V-JEPA is a collection of vision models trained solely using a feature prediction objective, without the use of pretrained image encoders, text, negative examples, reconstruction, or other sources of supervision.
  • Neural Flow is a Python script for plotting the intermediate layer outputs of Mistral 7B.
  • Auto Prompt is a prompt optimization framework designed to refine prompts for real-world use cases. It automatically generates high-quality, detailed prompts tailored to user intentions through a refinement (calibration) process, in which it iteratively builds a dataset of challenging edge cases and optimizes the prompt accordingly.
  • GRIT trains large language models to handle both generative and embedding tasks by distinguishing between them through instructions.
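
For readers curious what the BPE training step mentioned in the minbpe item involves, here is a minimal sketch. It is not code from the minbpe repository: the function names `train_bpe`, `merge_pair`, and `decode` are hypothetical, and the sketch assumes the simplest byte-level variant, in which the most frequent adjacent pair of token ids is repeatedly replaced by a new id.

```python
from collections import Counter


def merge_pair(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges from `text` (illustrative, byte-level)."""
    ids = list(text.encode("utf-8"))  # start from raw bytes, ids 0..255
    merges = {}  # (left_id, right_id) -> new token id, in learned order
    for step in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break  # nothing left to merge
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        new_id = 256 + step
        merges[best] = new_id
        ids = merge_pair(ids, best, new_id)
    return ids, merges


def decode(ids, merges):
    """Map token ids back to text by expanding merges in the order learned."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8")


tokens, merges = train_bpe("aaabdaaabac", num_merges=3)
print(len(tokens), decode(tokens, merges))
```

Each merge shortens the token sequence while keeping it fully reversible, which is why the learned merge table doubles as the tokenizer's vocabulary.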

Research From Around The Globe

  • Gemma – a set of open models from Google inspired by the research used to develop Gemini, Google’s most powerful closed-source AI model to date. It comes in two sizes: a 2B parameter model trained on 2 trillion tokens of text data, and a 7B parameter model trained on 6 trillion tokens of text data. Each size comes in two versions: a base model, pre-trained on a general dataset but not fine-tuned for specific tasks, and an instruction-tuned model, fine-tuned to perform specific tasks based on additional instructions. – Paper
  • Large Language Models for Data Annotation: A Survey: This paper explores the application of Large Language Models (LLMs) in data annotation and includes: Taxonomy, LLM-based annotation (the process of employing LLMs to directly generate annotations), assessment, and also explores how models can learn effectively from data annotated with the help of LLMs. – Paper
  • BootsTAP: Bootstrapped Training for Tracking-Any-Point – Paper
  • A framework for evaluating clinical artificial intelligence systems without ground-truth annotations – Paper
  • Neural Network Diffusion. Diffusion models have achieved remarkable success in image and video generation. This paper illustrates how diffusion models can generate high-performing neural network parameters using an autoencoder and a standard latent diffusion model, consistently producing models of comparable or improved performance over trained networks. – Paper
  • AnyGPT: A unified multimodal language model that can process multiple modalities including text, speech, images, and music, using discrete representations and can be trained stably without any alterations to the current large language model (LLM) architecture – Paper
  • Watermark Large Transformers: The rapid growth of transformer-based models increases the concerns about their integrity and ownership insurance. This paper explores watermarks and addresses this issue by embedding a unique identifier into the model, with virtually no computational cost – Paper

In-Person Data Engineering Conference

April 23rd to 24th, 2024 – Boston, MA

At our second annual Data Engineering Summit, Ai+ and ODSC are partnering to bring together the leading experts in data engineering and thousands of practitioners to explore different strategies for making data actionable.


Start-Up Funding News

Silicon Valley-based Figure AI raised $675 million from investors including Jeff Bezos, Nvidia, Microsoft, and OpenAI. Figure AI aims to transform the labor market with robots.

California-based Glean, an AI-enhanced work assistant and enterprise search startup, raised $200M at a $2.2B valuation.

Pennsylvania-based Abridge, a medical conversation AI startup that structures and summarizes medical conversations for doctors, patients, and cancer, raised over $200M in a Series C round.

Toronto-based Ideogram, an artificial intelligence startup that provides generative text-to-image technology, raised over $95M in a Series A round.

Israel-based Exodigo, an AI-based platform that offers non-intrusive subsurface image mapping solutions, raised over $110M in Series A.

New AI Skills Await!

Upskill and save 50% – 48 Hour Flash Sale!

Upskill in 2024

Top AI Videos

Sam Altman and Pat Gelsinger Talk Artificial Intelligence

Listen Now!

Blogs or News to Share?

If you have any blogs, industry news, product launches, job postings, fundraising news, or open-source launches to add to our newsletter then please send them to: blogs@odsc.com.



ODSC gathers the attendees, presenters, and companies that are shaping the present and future of data science and AI. ODSC hosts one of the largest gatherings of professional data scientists, with major conferences in the USA, Europe, and Asia.