OpenAI Creates a Team to Examine Catastrophic Risks of AI

OpenAI recently announced it is developing formal AI risk guidelines and assembling a team dedicated to monitoring and assessing threats posed by imminent “superintelligence” AI, also called frontier models. Topics under review include the parameters required for a robust monitoring and prediction framework and how malicious actors might seek to leverage stolen AI model weights. The announcement came shortly before the Biden administration issued an executive order requiring the major players in artificial intelligence to submit reports to the federal government assessing potential risks associated with their models.

The new OpenAI team, “called Preparedness, will be led by Aleksander Madry, the director of MIT’s Center for Deployable Machine Learning,” writes TechCrunch.

Chief among Preparedness’ responsibilities is tracking, forecasting and protecting against the misuse of future AI systems by bad actors, for purposes ranging from simple trickery such as misinformation and phishing attacks to the generation of malicious code.

“Some of the risk categories Preparedness is charged with studying seem more far-fetched than others,” TechCrunch suggests, citing an OpenAI blog post that lists “chemical, biological, radiological and nuclear” threats as areas of top concern.

Madry’s team “will tightly connect capability assessment, evaluations, and internal red teaming for frontier models, from the models we develop in the near future to those with AGI-level capabilities,” the blog post says, referring to artificial general intelligence.

The team will help track, evaluate, forecast and protect against catastrophic risks across categories including:

  • Individualized persuasion
  • Cybersecurity
  • Chemical, biological, radiological, and nuclear (CBRN) threats
  • Autonomous replication and adaptation (ARA)

The last of these refers to an AI system’s ability to duplicate itself and adapt to changing circumstances without human assistance, through means such as generating its own code.

Hjalmar Wijk, a researcher at the San Francisco-based Alignment Research Center, a non-profit devoted to ethical AI, writes in “Autonomous Replication and Adaptation: An Attempt at a Concrete Danger Threshold” of AI that can make its own money “through cybercrime” or even gig work, using the money “to acquire more computing power.”

To identify more subtle areas of concern (as well as to talent-hunt), OpenAI is launching the AI Preparedness Challenge for catastrophic misuse prevention, offering $25,000 in API credits to up to 10 top submissions.
