In a new blog post, OpenAI has unveiled its latest initiative, “Superalignment,” which aims to create robust and scalable methods for aligning future AI systems with human values in anticipation of superintelligence. For those who might not know, superintelligence refers to AI that surpasses human cognitive abilities across virtually every domain. The sheer power of such systems, however, also carries inherent risks.
OpenAI believes that superintelligence may become a reality within this decade. This is why it has announced Superalignment as an effort to build effective governance and alignment mechanisms before that happens.
Today, conventional AI alignment techniques, such as reinforcement learning from human feedback (RLHF), rely on human supervision. But when dealing with AI systems that surpass human intelligence, humans’ ability to supervise becomes limited.
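To make the human-supervision dependence concrete, here is a toy sketch of the core step in RLHF: fitting a reward model from human pairwise preferences. The linear model, features, and preference data below are all invented for illustration; real systems train neural reward models over text.

```python
import math

def score(weights, features):
    """Reward model: a linear score over (invented) response features."""
    return sum(w * f for w, f in zip(weights, features))

# Each human preference: (features of preferred response, features of rejected one).
# This is the bottleneck: a human had to judge every pair.
preferences = [
    ([1.0, 0.2], [0.1, 0.9]),
    ([0.8, 0.1], [0.2, 0.7]),
    ([0.9, 0.3], [0.3, 0.8]),
]

weights = [0.0, 0.0]
lr = 0.5
for _ in range(200):
    for good, bad in preferences:
        # Bradley-Terry model: P(good preferred) = sigmoid(score_good - score_bad)
        margin = score(weights, good) - score(weights, bad)
        p = 1.0 / (1.0 + math.exp(-margin))
        # Gradient ascent on the log-likelihood of the human preference.
        for i in range(len(weights)):
            weights[i] += lr * (1.0 - p) * (good[i] - bad[i])

# The trained reward model now reproduces the human judgments.
for good, bad in preferences:
    assert score(weights, good) > score(weights, bad)
```

The scaling problem the post describes is visible here: every training pair requires a human judgment, which stops working once the responses are too sophisticated for humans to rank.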
Because of this, OpenAI believes that existing alignment techniques will not scale once superintelligence is achieved. To address this looming problem, it wants to develop an automated alignment researcher that approaches human-level capabilities.
This alignment researcher will serve as a foundation for scaling alignment efforts using substantial computational resources. To make this a reality, the Superalignment team aims to focus on three key aspects: scalable training methods, validation of alignment models, and stress testing the alignment pipeline.
But what about tasks that are too challenging for humans to evaluate properly? To solve this issue, the team proposes leveraging AI systems to assist in the evaluation process. This approach, known as scalable oversight, can provide a training signal on difficult tasks. The team also aims to understand and control how models generalize that oversight to tasks humans cannot supervise, addressing the crucial issue of generalization.
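A minimal sketch of the scalable-oversight idea: rather than a human judging a hard answer directly, an automated checker verifies the answer's shown work step by step, and its verdict becomes the training signal. The task, the checker, and the “model answers” below are all invented toy examples, not OpenAI's method.

```python
def checker(claimed_steps, claimed_total):
    """Automated evaluator: verify each cheap intermediate step instead of
    asking a human to judge the whole (hard-to-check) final answer."""
    running = 0
    for a, b, claimed in claimed_steps:
        if a + b != claimed:   # each individual step is easy to verify
            return 0.0         # training signal: reject
        running = claimed
    return 1.0 if running == claimed_total else 0.0

# Two "model answers" to an addition task, each with shown work.
good_answer = ([(2, 3, 5), (5, 12, 17)], 17)
bad_answer  = ([(2, 3, 6), (6, 12, 18)], 18)  # slipped on the first step

assert checker(*good_answer) == 1.0
assert checker(*bad_answer) == 0.0
```

The design choice this illustrates: decomposing a task the overseer cannot judge whole into pieces it can judge, so the evaluation itself scales with task difficulty.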
From there, OpenAI plans to automate the search for problematic behavior and problematic internals, improving robustness and interpretability. By identifying potential issues early, the team hopes to build alignment methods that can reliably keep superintelligent AI systems aligned with human values.
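The automated search for problematic behavior can be sketched as a tiny fuzzing loop: instead of a human probing the model by hand, a program systematically searches the input space for cases that violate a stated safety property. The “model,” its hidden edge case, and the property below are invented for illustration only.

```python
import itertools

def model(x, y):
    """Stand-in model: behaves well except on one rare edge case."""
    if x == 7 and y == 3:
        return -1          # hidden misbehavior
    return x + y

def violates_property(x, y):
    # Safety property we expect to hold: non-negative inputs
    # never produce a negative output.
    return model(x, y) < 0

# Automated search: sweep the input space and collect violations.
failures = [(x, y)
            for x, y in itertools.product(range(10), repeat=2)
            if violates_property(x, y)]

assert failures == [(7, 3)]   # the search surfaces the edge case
```

Real systems cannot enumerate their input space, of course; the point is only that the probing itself is automated rather than done by human red-teamers.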
Even though OpenAI clearly wants the next phase of AI to bring minimal harm, the challenges and risks associated with superintelligence will stay on the minds of many researchers. They are the primary reason there have been two public calls this spring for AI regulation and a pause on AI research.
With all that said, OpenAI believes there is reason for optimism. In its view, a focused and concerted effort can lead to the successful alignment of superintelligent AI systems, which would, in theory, avert the dangers inherent to such systems.