Loading...

Lakera releases open-source benchmark to test LLM security in enterprise AI agents

Lakera releases open-source benchmark to test LLM security in enterprise AI agents

Lakera, together with Check Point Software Technologies and researchers from the UK AI Security Institute, has announced the release of the Backbone Breaker Benchmark (b3). The open-source benchmark is designed to evaluate the security of large language models used as backends in AI agents. The goal is to provide a practical method for developers and organizations to identify how LLMs respond to adversarial inputs at the points where agents are most vulnerable.

The b3 benchmark introduces the concept of “threat snapshots.” Instead of recreating full AI agent workflows, the benchmark targets specific steps where decision-making or tool-use requests occur. These points are where attackers are most likely to manipulate system behavior. By focusing on these moments, testing can be performed more efficiently while still reflecting real-world risks in enterprise environments.

The benchmark includes ten representative threat snapshots supported by a dataset of 19,433 adversarial prompts. The dataset was collected through Gandalf: Agent Breaker, a public hacking simulator in which users attempt to exploit AI agents. The attacks include attempts to extract system prompts, insert phishing links, generate harmful code, interrupt operations, or trigger unauthorized external tool usage.

Testing results from 31 commonly used LLMs show several patterns. Models with stronger reasoning capabilities handle adversarial input more effectively. Model size does not reliably predict security performance. Closed-source models generally perform better than open-weight models, although some open-weight systems are improving and narrowing the difference.

Lakera’s involvement in this project aligns with its focus on securing agentic AI systems. The company develops tools to detect risks in LLM-powered applications. Lakera became part of Check Point Software Technologies in 2024, bringing AI-native security research into Check Point’s enterprise security portfolio. Gandalf: Agent Breaker, which began as an internal experiment, now functions as a continuous source of real-world adversarial data that informs the development of tools like the b3 benchmark.

“We built the b3 benchmark because today’s AI agents are only as secure as the LLMs that power them,” said Mateo Rojas-Carulla, Co-Founder and Chief Scientist at Lakera, a Check Point company. “Threat Snapshots allow us to systematically surface vulnerabilities that have until now remained hidden in complex agent workflows.”

Loading...

Sign up for Newsletter

Select your Newsletter frequency