In a new blog post, Stability AI and its CarperAI lab unveiled Stable Beluga 1 and its successor, Stable Beluga 2 (previously known as FreeWilly). According to the post, the goal of these two large language models is to expand open-access AI research and set a new standard for it.
Stable Beluga 1 builds upon the foundation of the LLaMA 65B model. It is fine-tuned on a new synthetically generated dataset using Supervised Fine-Tuning (SFT) in the standard Alpaca format. Similarly, Stable Beluga 2 builds on the LLaMA 2 70B foundation model. According to the post, this gives it industry-leading performance.
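To make the "Alpaca format" concrete, here is a minimal sketch of how an instruction and its response could be turned into an SFT training record. The two templates are the ones popularized by the Stanford Alpaca project; the post does not say which exact template Stability AI used, so treat the templates, the `build_sft_example` helper, and the `prompt`/`completion` field names as illustrative assumptions.

```python
from typing import Dict, Optional

# Templates popularized by the Stanford Alpaca project (assumed here;
# the post does not spell out the exact template used for Stable Beluga).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_sft_example(
    instruction: str, response: str, context: Optional[str] = None
) -> Dict[str, str]:
    """Hypothetical helper: format one (instruction, response) pair for SFT."""
    if context:
        prompt = PROMPT_WITH_INPUT.format(instruction=instruction, input=context)
    else:
        prompt = PROMPT_NO_INPUT.format(instruction=instruction)
    # During SFT the model is trained to produce `completion` given `prompt`.
    return {"prompt": prompt, "completion": response}


example = build_sft_example("Summarize the text.", "A short summary.", context="Some text.")
print(example["prompt"] + example["completion"])
```

In a typical SFT setup, the loss is computed only on the completion tokens, so the model learns to answer the instruction rather than to reproduce the template.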
Both models are research experiments, released under a non-commercial license to push forward open research initiatives. The internal team worked to ensure both models are “polite and benign in nature”, but it also hopes the community will take part in further red-teaming.
Data generation and collection for the Stable Beluga models drew inspiration from Microsoft’s methodology outlined in the paper “Orca: Progressive Learning from Complex Explanation Traces of GPT-4.” In the post, the team said its process was similar but that it took a different route when it came to data sources.
They went on to mention that the synthetic dataset contained 600,000 data points. It was curated from high-quality instructions and was a variant of Enrico Shippole’s datasets:
- COT Submix Original
- NIV2 Submix Original
- FLAN 2021 Submix Original
- T0 Submix Original
Further in the post, they mentioned that these datasets were filtered to remove examples that appear in evaluation benchmarks. According to the team, this was to ensure a level playing field. Despite training on a fraction of the data used by the original Orca paper, the Stable Beluga models exhibited remarkable performance across diverse benchmarks. In Stability AI’s view, this validated their approach to synthetically generated datasets.
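The post does not describe how the benchmark filtering was implemented. As a rough illustration of the idea, the sketch below drops any training example whose instruction matches a benchmark prompt after simple normalization. The `normalize` and `decontaminate` helpers, the `instruction` field name, and the exact-match strategy are all assumptions made for this example; real pipelines often use n-gram overlap or fuzzy matching instead.

```python
import re
from typing import Dict, Iterable, List


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def decontaminate(
    train_examples: Iterable[Dict[str, str]],
    benchmark_prompts: Iterable[str],
    key: str = "instruction",  # hypothetical field name
) -> List[Dict[str, str]]:
    """Keep only training examples that do not match any benchmark prompt."""
    blocked = {normalize(p) for p in benchmark_prompts}
    return [ex for ex in train_examples if normalize(ex[key]) not in blocked]


train = [
    {"instruction": "What is the capital of France?"},
    {"instruction": "Name a prime number."},
]
benchmark = ["what is the capital of france"]
print(decontaminate(train, benchmark))
```

Exact matching after normalization is cheap but misses paraphrased benchmark items, which is why it should be read as a lower bound on what a production filter would catch.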
Finally, the post mentioned that Hugging Face was able to validate the metrics of both Beluga models. The results were then published on its Open LLM Leaderboard. At the time of writing, Stable Beluga 2 holds the second spot, while Stable Beluga 1 ranks seventh.