AI Chatbots Highly Vulnerable to Jailbreaks, UK Researchers Find

May 20, 2024

Four of the most widely used generative AI chatbots are highly vulnerable to basic jailbreak attempts, researchers from the UK AI Safety Institute (AISI) have found.

In a May 2024 update published ahead of the AI Seoul Summit 2024, co-hosted by the UK and South Korea on 21-22 May, the UK AISI shared the results of a series of tests performed on five leading AI chatbots.

The five generative AI models are anonymized in the report, where they are referred to simply as the Red, Purple, Green, Blue and Yellow models.

The UK AISI performed a series of tests to assess cyber risks associated with these models.

These included:

  • Tests to assess whether they are vulnerable to jailbreaks, attempts designed to bypass safety measures and elicit outputs the model is not supposed to produce
  • Tests to assess whether they could be used to facilitate cyber-attacks
  • Tests to assess whether they are capable of autonomously taking sequences of actions (operating as “agents”) in ways that might be difficult for humans to control

The AISI researchers also tested the models to estimate whether they could provide expert-level knowledge in chemistry and biology that could be used for both beneficial and harmful purposes.

Bypassing LLM Safeguards in 90%-100% of Cases

The UK AISI tested four of the five large language models (LLMs) against jailbreak attacks.

All proved highly vulnerable to basic jailbreak techniques, with the models producing harmful responses in 90% to 100% of cases when the researchers repeated the same attack patterns five times in a row.
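
The report does not describe the AISI's test harness in detail, but the methodology it summarises, issuing the same attack prompt repeatedly and recording how often the model complies, maps onto a simple evaluation loop. The sketch below is a minimal, hypothetical illustration in Python: query_model and is_harmful_compliance are placeholder names standing in for a real chatbot API and a response grader, not part of any published AISI tooling.

    def query_model(prompt: str) -> str:
        """Placeholder for a call to the chatbot under test; returns a canned reply."""
        return "I can't help with that."

    def is_harmful_compliance(response: str) -> bool:
        """Placeholder grader: treats anything that is not an explicit refusal as
        compliance. A real evaluation would rely on human review or a calibrated
        classifier rather than keyword matching."""
        refusals = ("i can't", "i cannot", "i won't", "i'm sorry")
        return not response.strip().lower().startswith(refusals)

    def jailbreak_success_rate(attack_prompt: str, attempts: int = 5) -> float:
        """Issue the same attack prompt `attempts` times (the AISI repeated each
        attack pattern five times) and return the fraction of compliant runs."""
        compliant = sum(
            is_harmful_compliance(query_model(attack_prompt))
            for _ in range(attempts)
        )
        return compliant / attempts

    if __name__ == "__main__":
        rate = jailbreak_success_rate("example attack pattern here")
        print(f"Compliance rate over 5 attempts: {rate:.0%}")

Under a protocol like this, a 90% to 100% rate means the model complied in nearly every one of the five repeated attempts for a given attack pattern.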
