BUD-E: An Open-Source Voice Assistant by LAION that Runs on a Laptop

LAION has introduced BUD-E, an open-source voice assistant that can run on a gaming laptop and does not require an active internet connection. If the approach proves scalable, voice assistant technology could see a major boost in the near future.

In LAION’s blog, the company explains that BUD-E came into being in collaboration with the ELLIS Institute Tübingen, Collabora, and the Tübingen AI Center. The team hopes to help AI assistant technology close its gap in the nuanced understanding and emotional intelligence inherent to human dialogue.

And that was how BUD-E (Buddy for Understanding and Digital Empathy) was born. The goal of the program is to help create voice assistants that not only respond in real time but do so with a depth of empathy and understanding previously unseen.

According to the blog, by harnessing advanced Speech-to-Text, Large Language, and Text-to-Speech models, BUD-E aims to minimize the latency and mechanical quality of responses in favor of a seamless conversational flow.

As of January 2024, the project has achieved latencies of 300 to 500 milliseconds using the Phi-2 model, with aspirations to reduce this further with larger models such as LLaMA 2 30B. The team is aware that the road toward an empathic and naturally conversational AI is filled with challenges requiring new solutions.
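For readers who want a concrete picture of where that latency comes from, here is a minimal sketch in Python of a single conversational turn flowing through the three stages. This is not BUD-E's actual code: transcribe, generate_reply, and synthesize are dummy placeholders standing in for the streaming speech-to-text, language, and text-to-speech models, and the timing simply illustrates how end-to-end latency would be measured.

import time

def transcribe(audio_chunk):
    # Placeholder STT stage; a real system would call a streaming speech-to-text model here.
    return "hello, how are you?"

def generate_reply(transcript, history):
    # Placeholder LLM stage; a real system would call a local language model (e.g. Phi-2) here.
    return f"I heard you say: {transcript}"

def synthesize(text):
    # Placeholder TTS stage; a real system would return synthesized audio samples here.
    return b"\x00" * 16000

def handle_turn(audio_chunk, history):
    # One conversational turn: STT -> LLM -> TTS, timing each stage.
    t0 = time.perf_counter()
    transcript = transcribe(audio_chunk)
    t1 = time.perf_counter()
    reply = generate_reply(transcript, history)
    t2 = time.perf_counter()
    audio_out = synthesize(reply)
    t3 = time.perf_counter()
    print(f"STT {1000*(t1-t0):.0f} ms | "
          f"LLM {1000*(t2-t1):.0f} ms | "
          f"TTS {1000*(t3-t2):.0f} ms")
    history.append((transcript, reply))
    return audio_out

history = []
handle_turn(b"\x00" * 16000, history)

In practice each stage would stream its output into the next rather than run sequentially, which is one of the ways the team aims to push latency down.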

Key among these is the reduction of system latency and requirements through sophisticated quantization techniques and the fine-tuning of streaming text-to-speech (TTS) and speech-to-text (STT) models. Enhancing the naturalness of speech and responses involves building a dataset of natural human dialogues and developing a reliable speaker-diarization system.
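As an illustration of the kind of quantization the post refers to, the snippet below loads the Phi-2 model in 4-bit precision using Hugging Face transformers and bitsandbytes, which shrinks memory requirements enough for consumer hardware. The configuration values are illustrative assumptions, not BUD-E's published settings.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization config (illustrative values; the project's exact settings are not specified)
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "microsoft/phi-2"  # the Phi-2 model mentioned above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place the quantized weights on the available GPU
)

prompt = "User: How are you feeling today?\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))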

Beyond individual exchanges, BUD-E aims to maintain continuity over conversations spanning days, months, or even years, leveraging Retrieval-Augmented Generation (RAG) for improved performance. And with the project’s open-source foundation, researchers and developers hope to take on the challenges associated with understanding emotional context.
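To make the RAG idea concrete, here is a minimal sketch of retrieving relevant long-term memories before prompting the language model. The sentence-transformers embedder and the example memories are assumptions made for illustration; BUD-E's actual retrieval setup is not described in the announcement.

import numpy as np
from sentence_transformers import SentenceTransformer

# Embedding model chosen for illustration only.
embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Long-term memory: conversation snippets collected over days or months.
memory = [
    "User mentioned their dog is named Luna.",
    "User said they are preparing for a marathon in May.",
    "User prefers short, informal answers.",
]
memory_vecs = embedder.encode(memory, normalize_embeddings=True)

def retrieve(query, k=2):
    # Return the k memory snippets most similar to the query (cosine similarity).
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = memory_vecs @ q
    top = np.argsort(scores)[::-1][:k]
    return [memory[i] for i in top]

query = "How is my training going?"
context = retrieve(query)
prompt = "Relevant memories:\n" + "\n".join(context) + f"\n\nUser: {query}\nAssistant:"
print(prompt)  # this augmented prompt would then be passed to the language model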

But what does this mean in the medium to long term? First, there is growing interest in helping the models that power voice assistants understand contextual language, which of course includes emotional language.

This is because, as AI becomes more prevalent in people’s daily lives, its ability to interact in a more natural, human-like way will become increasingly important. If you’re interested in the BUD-E project, you can check out LAION’s GitHub page.

ODSC Team

ODSC gathers the attendees, presenters, and companies that are shaping the present and future of data science and AI. ODSC hosts one of the largest gatherings of professional data scientists, with major conferences in the USA, Europe, and Asia.
