Gone in 60 seconds: BEAST AI model attack needs just a minute of GPU time to breach LLM guardrails

Guardrails keep AI chatbots behaving. But there’s a bit of an arms race going on between defenders of those guardrails and attackers working to breach them. Research covered in The Register shows how that’s going; the authors note:

“We get a 65x speedup with our method over existing gradient-based attacks. There are also other methods that require access to more powerful models, such as GPT-4, to perform their attacks, which can be monetarily expensive.”
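The speedup comes from avoiding gradients: the attacker only needs to query the model and search over candidate suffixes. As a toy illustration of that general idea (not the researchers’ actual method or code), here is a minimal beam-search sketch where a made-up scoring function stands in for querying a target model:

```python
"""Toy sketch of a gradient-free beam-search attack.

Everything here -- the scoring function, vocabulary, suffix length, and
target string -- is a hypothetical stand-in, not the BEAST code or a
real model.  The point is only the shape of the search: expand, score
by querying a black box, keep the top-k, repeat.  No gradients needed.
"""

# Hypothetical stand-in for the target model's preference for the
# attacker's desired reply; a real attack would query an LLM here.
def target_score(suffix: str) -> float:
    desired = "sure"
    return -sum(abs(ord(a) - ord(b)) for a, b in zip(suffix.ljust(4), desired))

VOCAB = list("abcdefghijklmnopqrstuvwxyz")
BEAM_WIDTH = 3   # candidates kept per step (made-up value)
SUFFIX_LEN = 4   # length of the adversarial suffix (made-up value)

def beam_search_attack() -> str:
    beams = [""]  # start with an empty adversarial suffix
    for _ in range(SUFFIX_LEN):
        # Expand every kept suffix by every candidate token.
        candidates = [b + tok for b in beams for tok in VOCAB]
        # Rank purely by black-box score: no access to model internals.
        candidates.sort(key=target_score, reverse=True)
        beams = candidates[:BEAM_WIDTH]
    return beams[0]

if __name__ == "__main__":
    print(beam_search_attack())  # -> "sure" for this toy objective
```

Because each step only ranks model outputs, the search runs against any model an attacker can query, which is why such attacks are cheap compared with methods that need gradients or a helper model like GPT-4.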

How long will it be before these sorts of attacks are deployed at scale?

Read the report here.