Zero-Click GenAI Worm Spreads Malware, Poisoning Models

We Keep you Connected

Zero-Click GenAI Worm Spreads Malware, Poisoning Models

35 years after the Morris worm, we’re still dealing with a version of the same issue: data overlapping with control.
March 4, 2024
A worm that uses clever prompt engineering and injection is able to trick generative AI (GenAI) apps like ChatGPT into propagating malware and more.
In a laboratory setting, three Israeli researchers demonstrated how an attacker could design "adversarial self-replicating prompts" that convince a generative model into replicating input as output – if a malicious prompt comes in, the model will turn around and push it back out, allowing it to spread to further AI agents. The prompts can be used for stealing information, spreading spam, poisoning models, and more.
They've named it "Morris II," after the infamous 99-line self-propagating malware which took out a tenth of the entire Internet back in 1988.
To demonstrate how self-replicating AI malware could work, the researchers created an email system capable of receiving and sending emails using generative AI.
Next, as a red team, they wrote a prompt-laced email which takes advantage of retrieval-augmented generation (RAG) — a method AI models use to retrieve trusted external data — to contaminate the receiving email assistant's database. When the email is retrieved by the RAG and sent on to the gen AI model, it jailbreaks it, forcing it to exfiltrate sensitive data and replicate its input as output, thereby passing on the same instructions to further hosts down the line.
The researchers also demonstrated how an adversarial prompt can be encoded in an image to similar effect, coercing the email assistant into forwarding the poisoned image to new hosts. By either of these methods, an attacker could automatically propagate spam, propaganda, malware payloads, and further malicious instructions through a continuous chain of AI-integrated systems.
Most of today's most advanced threats to AI models are just new versions of the oldest security problems in computing.
"While it's tempting to see these as existential threats, these are no different in threat than the use of SQL injection and similar injection attacks, where malicious users abuse text-input spaces to insert additional commands or queries into a supposedly sanitized input," says Andrew Bolster, senior R&D manager for data science at Synopsys. "As the research notes, this is a 35-year-old idea that still has legs (older in fact; father-of-modern-computing-theory John Von Neumann theorized on this in the 50s and 60s)."
Part of what made the Morris worm novel in its time three decades ago was the fact that it figured out how to jump the data space into the part of the computer that exerts controls, enabling a Cornell grad student to escape the confines of a regular user and influence what a targeted computer does.
"A core of computer architecture, for as long as there have been computers, has been this conceptual overlap between the data space and the control space — the control space being the program instructions that you are following, and then having data that's ideally in a controlled area," Bolster explains.
Clever hackers today use GenAI prompts largely to the same effect. And so, just like software developers before them, for defense, AI developers will need some way to ensure their programs don't confuse user input for machine output. Developers can offload some of this responsibility to API rules, but a deeper solution might involve breaking up the gen AI models themselves into constituent parts. This way, data and control aren't living side-by-side in the same big house.
"We're really starting to work on: How do we go from this everything-in-one-box approach, to going for more of a distributed multiple agent approach," Bolster says. "If you want to really squint at it, this is kind of analogous to the shift in microservices architecture from one big monolith. With everything in a services architecture, you're able to put runtime content gateways between and around different services. So you as a system operator can ask 'Why is my email agent expressing things like images?' and put constraints on."
Nate Nelson, Contributing Writer

Nate Nelson is a freelance writer based in New York City. Formerly a reporter at Threatpost, he contributes to a number of cybersecurity blogs and podcasts. He writes "Malicious Life" — an award-winning Top 20 tech podcast on Apple and Spotify — and hosts every other episode, featuring interviews with leading voices in security. He also co-hosts "The Industrial Security Podcast," the most popular show in its field.
You May Also Like
Assessing Your Critical Applications’ Cyber Defenses
Unleash the Power of Gen AI for Application Development, Securely
The Anatomy of a Ransomware Attack, Revealed
How To Optimize and Accelerate Cybersecurity Initiatives for Your Business
Building a Modern Endpoint Strategy for 2024 and Beyond
Cybersecurity’s Hottest New Technologies – Dark Reading March 21 Event
Black Hat Asia – April 16-19 – Learn More
Black Hat Spring Trainings – March 12-15 – Learn More
Industrial Networks in the Age of Digitalization
Zero-Trust Adoption Driven by Data Protection
How Enterprises Assess Their Cyber-Risk
Enterprise Cybersecurity Plans in a Post-Pandemic World
SANS 2021 Cloud Security Survey
The State of Incident Response
A Solution Guide to Operational Technology Cybersecurity
Endpoint Best Practices to Block Ransomware
2023 Snyk AI-Generated Code Security Report
Understanding AI Models to Future-Proof Your AppSec Program
Cybersecurity’s Hottest New Technologies – Dark Reading March 21 Event
Black Hat Asia – April 16-19 – Learn More
Black Hat Spring Trainings – March 12-15 – Learn More
Copyright © 2024 Informa PLC Informa UK Limited is a company registered in England and Wales with company number 1072954 whose registered office is 5 Howick Place, London, SW1P 1WG.