BUD-E: An Open-Source Voice Assistant by LAION that Runs on a Laptop
LAION has introduced BUD-E, an open-source voice assistant that can run on a gaming laptop and does not require an active internet connection. If it proves scalable, voice assistant technology could see a major boost in the near future.

In its blog post, LAION explains that BUD-E was developed in collaboration with the ELLIS Institute Tübingen, Collabora, and the Tübingen AI Center. The team hopes to help AI assistant technology close the gap in the nuanced understanding and emotional intelligence that are inherent to human dialogue.

That is how BUD-E (Buddy for Understanding and Digital Empathy) was born. The goal of the project is to help create voice assistants that not only respond in real time but do so with a depth of empathy and understanding previously unseen.

According to the blog, by harnessing advanced Speech-to-Text, Large Language, and Text-to-Speech models, BUD-E aims to minimize both the latency and the mechanical feel of responses in pursuit of a seamless conversational flow.
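One way to picture that chain is as three stages run back to back, with latency measured from the moment audio comes in to the moment audio goes out. The sketch below is only an illustration of that idea: the three stage functions are hypothetical stubs, not BUD-E's code, and the sleep durations are placeholders rather than measurements.

```python
import time

# Hypothetical stand-ins for the three model stages; none of this is BUD-E code.
def speech_to_text(audio: bytes) -> str:
    time.sleep(0.01)  # placeholder delay, not a measured value
    return "hello there"

def language_model(prompt: str) -> str:
    time.sleep(0.01)  # placeholder delay, not a measured value
    return f"You said: {prompt}"

def text_to_speech(text: str) -> bytes:
    time.sleep(0.01)  # placeholder delay, not a measured value
    return text.encode("utf-8")

def respond(audio: bytes) -> tuple[bytes, float]:
    """Run the three stages in sequence and return (audio_out, latency_ms)."""
    start = time.perf_counter()
    text = speech_to_text(audio)
    reply = language_model(text)
    audio_out = text_to_speech(reply)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return audio_out, latency_ms

if __name__ == "__main__":
    audio_out, latency_ms = respond(b"\x00\x01")
    print(f"reply bytes: {len(audio_out)}, latency: {latency_ms:.0f} ms")
```

Because the stages run strictly one after another here, the total latency is the sum of the three delays, which is why shaving time off any single stage shortens the whole response.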

As of January 2024, the project has achieved latencies of 300 to 500 milliseconds using the Phi 2 model, and the team aspires to reduce this further, even with larger models like Llama 2 30B. The team is aware that the road toward an empathic and naturally conversational AI is filled with challenges requiring new solutions.

Key among these is reducing latency and system requirements through sophisticated quantization techniques and the fine-tuning of streaming text-to-speech (TTS) and speech-to-text (STT) models. Making speech and responses sound more natural involves building a dataset of natural human dialogues and developing a reliable speaker-diarization system.
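To make the idea of quantization concrete: it stores model weights at lower precision so they take up less memory. The toy below shows symmetric 8-bit quantization of a short list of floats in plain Python. It is a generic textbook scheme chosen for illustration; the source does not say which quantization method BUD-E uses, and the function names are hypothetical.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]

if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    restored = dequantize_int8(q, scale)
    print("quantized:", q)
    print("restored: ", restored)
```

Each restored value differs from the original by at most about half a quantization step, which is the precision traded away for the smaller footprint.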

Beyond that, BUD-E aims to maintain continuity over conversations spanning days, months, or even years, leveraging Retrieval Augmented Generation (RAG) for improved performance. And because the project is open source, researchers and developers hope to take on the challenges associated with understanding emotional context.
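In general terms, RAG means looking up relevant stored text and placing it in the prompt before the model answers. The sketch below shows that retrieval step with a deliberately simple bag-of-words similarity. It is an assumption-laden toy, not BUD-E's implementation: real systems typically use learned embeddings, and every function name and memory entry here is hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts (real systems use learned vectors)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, memory: list[str], k: int = 1) -> list[str]:
    """Return the k stored snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(memory, key=lambda m: cosine(q, embed(m)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, memory: list[str]) -> str:
    """Prepend the retrieved snippets to the user's new message."""
    context = "\n".join(retrieve(query, memory))
    return f"Earlier conversation:\n{context}\n\nUser: {query}"

if __name__ == "__main__":
    # Hypothetical snippets remembered from past conversations.
    memory = [
        "user said their dog is named Rex",
        "user prefers tea over coffee",
        "user is planning a trip to Lisbon",
    ]
    print(build_prompt("what is my dog called", memory))
```

Only the retrieved snippet is passed along with the new question, so the amount of text the model has to process stays small even as the stored history grows.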

But what exactly does this mean in the medium to long term? For one, there is growing interest in helping the models that power voice assistants understand contextual language, which of course includes emotional language.

This is because, as AI becomes more prevalent in people's daily lives, its ability to interact in a recognizably human way will become increasingly important. If you're interested in the BUD-E project, you can check out LAION’s GitHub page.

ODSC Team

ODSC gathers the attendees, presenters, and companies that are shaping the present and future of data science and AI. ODSC hosts one of the largest gatherings of professional data scientists, with major conferences in the USA, Europe, and Asia.
