White House calls for “red-teaming” to test security of AI models

10 May 2023

861

The White House announced a surprising collaboration between top AI developers to participate in a public evaluation, i.e., red-teaming, of their generative AI systems at DEF CON 31, a hacker convention taking place in Las Vegas in August.

“Red-teaming” is a process by which security experts attempt to find vulnerabilities or flaws in an organization’s systems to improve overall security and resilience.

The announcement, made on May 4th, involved a surprising collaboration between OpenAI, Google, Antrhopic, Hugging Face, Microsoft, Nvidia and Stability AI, and called on the hacker community to push these new generative AI models, such as ChatGPT, to their limits.

Since last year, large language models (LLMs) such as ChatGPT have become a popular way to accelerate writing and communications tasks, but officials recognize that they also come with inherent risks. Issues such as confabulations, jailbreaks, and biases pose challenges for security professionals and the public.

A statement from the White House said: “This independent exercise will provide critical information to researchers and the public about the impacts of these models and will enable AI companies and developers to take steps to fix issues found in those models.”

The event aligns with the Biden administration’s AI Bill of Rights and the National Institute of Standards and Technology’s AI Risk Management Framework.

In a parallel announcement written by AI Village, organizers Sven Cattell, Rumman Chowdhury, and Austin Carson call the upcoming event “the largest red teaming exercise ever for any group of AI models.” Thousands of people will take part in the public AI model assessment, which will utilize an evaluation platform developed by Scale AI.

According to Cattell, the founder of AI Village, “The diverse issues with these models will not be resolved until more people know how to red team and assess them.” By conducting the largest red-teaming exercise for any group of AI models, AI Village and DEF CON aim to grow the community of researchers equipped to handle vulnerabilities in AI systems.

LLMs have proven surprisingly difficult to lock down in part due to a technique called “prompt injection”. AI researcher Simon Willison has written in detail about the dangers of prompt injection, a technique that can derail a language model into performing actions not intended by its creator.

During the DEF CON event, participants will have timed access to multiple LLMs through laptops provided by the organizers. A capture-the-flag-style point system will encourage testing a wide range of potential harms. At the end, the person with the most points will win a high-end Nvidia GPU.

“We’ll publish what we learn from this event to help others who want to try the same thing,” writes AI Village. “The more people who know how to best work with these models, and their limitations, the better.”

DEF CON 31 will take place on August 10–13, 2023, at Caesar’s Forum in Las Vegas.

(Source: Arstechnica)

White House calls for “red-teaming” to test security of AI models

EXCLUSIVE: Nutrition and lifestyle for cognitive fitness

Sexual safety national consensus formally launched at Ambulance Leadership Forum

London Mayor calls on Government to ban machete-style knives

Must Read

EXCLUSIVE: Nutrition and lifestyle for cognitive fitness

Sexual safety national consensus formally launched at Ambulance Leadership Forum

London Mayor calls on Government to ban machete-style knives

UPDATE: 16-year-old boy stabbed to death in Edmonton named by police – Witnesses allege teenager was ambushed by two balaclava-clad men

LATEST NEWS

EXCLUSIVE: Nutrition and lifestyle for cognitive fitness

Sexual safety national consensus formally launched at Ambulance Leadership Forum

London Mayor calls on Government to ban machete-style knives

SUNDAY EXCLUSIVES

EXCLUSIVE: Nutrition and lifestyle for cognitive fitness

Unsung Security Heroes: Touching story of Ronnie Alexander, frozen to death on duty

“I can’t breathe!”: Positional asphyxia – A silent danger in the use of physical restraint

EDITOR'S PICK

Physiotherapy for security professionals following a heart attack

Online safety: Girls “much more likely to experience something nasty online” than boys, report found

Security guard jailed for killing shoplifter with a “forceful blow”

FOLLOW US

White House calls for “red-teaming” to test security of AI models

Stay Connected

Must Read

LATEST NEWS

SUNDAY EXCLUSIVES

EDITOR'S PICK

FOLLOW US