Generative AI is taking the world by storm. From text-to-image apps like DALL-E entering the mainstream to people with no prior interest in AI using Q&A systems like ChatGPT, AI is now part of the modern zeitgeist. Over the past year, a number of generative AI projects have made a name for themselves, whether through social media or through frequent citations in academic papers. Here are our picks for a few generative AI projects worth checking out for yourself, most of which you can experiment with for free.
Figure: The current state of generative AI projects
Text Generative AI
Easily the biggest thing in AI right now is ChatGPT, a project from OpenAI that gives human-like responses to prompts and was trained using Reinforcement Learning from Human Feedback (RLHF). From recipes to blog posts, it performs remarkably well. It is more than a simple Q&A chatbot: it can admit when it is wrong or lacks enough information, give clear and comprehensive answers, write code, and respond at length, which makes it a handy research aid.
The AI itself isn’t controversial, but some of its uses are: students are using it to write their essays, and one Princeton student built GPTZero, a tool meant to detect whether a text was written by ChatGPT. None of this has deterred Microsoft, which is reportedly investing $10B in OpenAI.
ChatGPT gets most of the attention lately, but GPT-3 isn’t going anywhere anytime soon, as it is finding plenty of practical applications in businesses. Also developed by OpenAI, GPT-3 is a large language model designed for a wide range of natural language tasks such as translation, text generation, and summarization, while ChatGPT is a sibling model, fine-tuned from the GPT-3.5 series, built specifically for dialogue.
Because it is a general-purpose model, GPT-3 may be less user-friendly than ChatGPT for those who aren’t AI-savvy, though it has much broader potential.
While not a full-fledged application, CodeGPT is a VS Code extension that lets you use GPT-3 inside the editor through the official OpenAI API. It makes it easy to generate, explain, and refactor code, among other tasks.
Hugging Face BLOOM
Announced last year, BLOOM (BigScience Large Open-science Open-access Multilingual Language Model) can generate text in 46 natural languages and 13 programming languages. For many of those languages, it is the first language model with over 100B parameters ever created.
BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources. BLOOM can also be instructed to perform text tasks it hasn’t been explicitly trained for, by casting them as text generation tasks.
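To make "autoregressive" and "casting tasks as text generation" a little more concrete, here is a minimal Python sketch. It is a toy and not BLOOM's actual implementation: the bigram counter stands in for a neural network, and the names `train_bigram`, `generate`, and `as_generation_task`, as well as the prompt layout, are hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: str) -> dict:
    """Count which token follows which in a whitespace-tokenized corpus.

    A toy stand-in for a trained language model (assumption for illustration).
    """
    counts = defaultdict(Counter)
    tokens = corpus.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: dict, prompt: str, max_new_tokens: int = 5) -> str:
    """Autoregressive loop: predict one token, append it, and repeat."""
    tokens = prompt.split()
    if not tokens:
        return prompt  # nothing to continue from
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # this token was never followed by anything in training
        # Greedy decoding: always pick the most frequent follower.
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)


def as_generation_task(instruction: str, text: str) -> str:
    """Cast an arbitrary text task as a prompt for the model to continue.

    The "Input:/Output:" layout is a hypothetical format, not a BLOOM convention.
    """
    return f"{instruction}\nInput: {text}\nOutput:"


if __name__ == "__main__":
    model = train_bigram("a b c d")
    print(generate(model, "a", max_new_tokens=2))
    print(as_generation_task("Translate to French", "hello"))
```

A real LLM replaces the bigram lookup with a network that scores every possible next token given the whole preceding text, but the outer loop has the same shape: the model only ever continues text, so a task like translation is phrased as a prompt whose natural continuation is the answer.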
Another tool from OpenAI, Whisper is a general-purpose speech recognition model trained on a large, diverse set of audio data. It can perform multilingual speech recognition, speech translation, and language identification. Its high accuracy makes OpenAI hopeful that people will use it for practical purposes in translation and speech recognition. Whisper can transcribe speech in many languages and translate that speech into English.
Visual Generative AI
OpenAI’s DALL-E 2 took social media by storm over the summer thanks to its ability to make images from text prompts. Given a natural language prompt like “the last selfie that a human would ever take on Earth,” DALL-E 2 generates art informed by the images it was trained on, sometimes producing something beautiful enough to win art competitions and sometimes producing horrific outputs that feel like sins against humanity.
Formerly known as DALL-E Mini, Craiyon is essentially DALL-E lite: it produces more images per text prompt, though simpler ones. It was especially popular over the summer, with people filling their Instagram feeds with the results of silly prompts. It’s pretty fun to use if you’re curious about making quick mashups to kill some time. Even John Oliver devoted an entire segment to it, which ended with him marrying a cabbage.
Similar to DALL-E, Stable Diffusion is a latent text-to-image diffusion model. One thing that separates it from other text-to-image generators is that its outputs tend to resemble real life more closely, as opposed to the stylized, artistic outputs of related AI apps. Stable Diffusion was trained on the 2B English-language-label subset of LAION-5B, a dataset built from a general crawl of the internet by the German charity LAION.
Google’s answer to both DALL-E and Meta’s Make-A-Video, Imagen turns text into images, and its sibling Imagen Video extends the approach to video. Developed by the Google Brain team, Imagen “builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation.” Its outputs can be both stylized and photorealistic. Google has also presented Imagen Editor, which edits existing images from text prompts rather than creating new ones, along with EditBench, a benchmark for evaluating that kind of editing.
Going a step further than text-to-image generators, Meta’s Make-A-Video is exactly what it sounds like: text-to-video AI. It can turn text prompts into short videos, more akin to GIFs than to YouTube-length content, but enough to show an idea in motion. Make-A-Video can produce outputs in a variety of styles, or even add motion to a static image.
These ten generative AI projects are just a few among many making waves right now. Considering how young the field is and how quickly generative AI has exploded in prominence, we fully expect to see more startups and tech powerhouses develop similar tools in 2023.
What are a few generative AI projects that have caught your attention? What’s new that you think should make a follow-up list? Let us know!
If you’re looking to learn more about the generative AI field and generative AI projects, then check out ODSC East 2023. We’re currently working on an entire track devoted to generative AI, so subscribe to our newsletter and be the first to hear the details. In the meantime, register for ODSC East now while tickets are still 60% off.