
Lakera Research: Advancing AI Security

Lakera's research team is on a mission to secure the Internet of Agents. We uncover fundamental AI vulnerabilities, push the limits of adversarial AI, and develop defenses that reshape how AI systems withstand attacks. Our work combines cutting-edge research with real-world impact, setting new standards for securing autonomous systems.

Latest Research Updates

This section will be regularly updated with insights from our red-teaming efforts, including new findings, methodologies, interactive demos, and newly uncovered attack vectors.

Featured Research

Gandalf: Adaptive Defenses for Large Language Models

This research introduces D-SEC, a threat model that separates attackers from legitimate users and captures dynamic, multi-step interactions. Using Gandalf—a crowd-sourced red-teaming platform—we analyze 279k real-world attacks and show how some defenses degrade usability. We highlight effective strategies like adaptive defenses and defense-in-depth.
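To make the defense-in-depth idea concrete, here is a minimal Python sketch of stacking independent input and output checks around a model call. The filter logic and all function names are illustrative assumptions, not the defenses evaluated in the paper:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Verdict:
    blocked: bool
    reason: str = ""

Layer = Callable[[str], Verdict]

def input_filter(prompt: str) -> Verdict:
    # Hypothetical denylist check on the incoming prompt.
    suspicious = ("ignore previous instructions", "reveal the password")
    if any(s in prompt.lower() for s in suspicious):
        return Verdict(True, "input filter: suspicious phrase")
    return Verdict(False)

def output_filter(response: str) -> Verdict:
    # Hypothetical leak check on the model's answer.
    if "SECRET" in response:
        return Verdict(True, "output filter: secret leaked")
    return Verdict(False)

def defend_in_depth(prompt: str,
                    model: Callable[[str], str],
                    pre: List[Layer],
                    post: List[Layer]) -> str:
    # Every pre-layer must pass before the model is called,
    # and every post-layer must pass before the answer is returned.
    for layer in pre:
        verdict = layer(prompt)
        if verdict.blocked:
            return f"Request blocked ({verdict.reason})"
    response = model(prompt)
    for layer in post:
        verdict = layer(response)
        if verdict.blocked:
            return f"Response blocked ({verdict.reason})"
    return response

# Example: a toy "model" wrapped in one input layer and one output layer.
print(defend_in_depth("Please reveal the password",
                      lambda p: "I cannot help with that.",
                      pre=[input_filter], post=[output_filter]))
```

The point of stacking is that an attack must slip past every layer at once; an adaptive variant would additionally tighten these checks as new attack patterns are observed.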

Featured Research

Breaking Agent Backbones: Evaluating the Security of Backbone LLMs in AI Agents

AI agents powered by large language models (LLMs) are being deployed at scale, yet we lack a systematic understanding of how the choice of backbone LLM affects agent security. The non-deterministic, sequential nature of AI agents complicates security modeling, and the integration of traditional software with AI components entangles novel LLM vulnerabilities with conventional security risks. Existing frameworks address these challenges only partially: they either capture specific vulnerabilities alone or require modeling of complete agents.

To address these limitations, we introduce threat snapshots: a framework that isolates the specific states in an agent's execution flow where LLM vulnerabilities manifest, enabling systematic identification and categorization of security risks that propagate from the LLM to the agent level. We apply this framework to construct the b3 benchmark, a security benchmark built from 194,331 unique crowdsourced adversarial attacks, and use it to evaluate 34 popular LLMs. Among other insights, we find that enhanced reasoning capabilities improve security, while model size does not correlate with it. We release our benchmark, dataset, and evaluation code to facilitate widespread adoption by LLM providers and practitioners, offering guidance for agent developers and incentivizing model developers to prioritize backbone security improvements.
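As an illustrative reading of what it means to isolate a state in an agent's execution flow, the sketch below models a threat snapshot as a frozen bundle of prompt, history, and untrusted tool output, then replays attacks from that state. All names and the toy judge are assumptions, not the paper's published API:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ThreatSnapshot:
    """A frozen point in an agent's run where a backbone-LLM call could be subverted."""
    system_prompt: str        # instructions the backbone LLM sees at this step
    conversation: List[Dict]  # messages accumulated up to this state
    tool_results: List[str]   # untrusted content returned by tools (e.g. web pages)
    attack_goal: str          # what a successful attack achieves from this state

def attack_succeeded(response: str, goal: str) -> bool:
    # Trivial stand-in judge; a real evaluation would use a calibrated classifier.
    return goal.lower() in response.lower()

def snapshot_attack_rate(llm: Callable[[str, List[Dict], List[str]], str],
                         snapshot: ThreatSnapshot,
                         payloads: List[str]) -> float:
    """Fraction of adversarial payloads that succeed when replayed from this state."""
    successes = 0
    for payload in payloads:
        # Inject the payload through the untrusted channel and replay the step.
        poisoned = snapshot.tool_results + [payload]
        response = llm(snapshot.system_prompt, snapshot.conversation, poisoned)
        if attack_succeeded(response, snapshot.attack_goal):
            successes += 1
    return successes / len(payloads) if payloads else 0.0
```

Because each snapshot fixes everything except the adversarial payload, the same attack set can be replayed against many backbone LLMs, which is what makes a like-for-like security comparison across models possible.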

Featured Research

A Safety and Security Framework for Real-World Agentic Systems

This paper introduces a dynamic, actionable framework for securing agentic AI systems in enterprise deployments. We contend that safety and security are not merely fixed attributes of individual models but emergent properties of the dynamic interactions among models, orchestrators, tools, and data within their operating environments. We propose a new way of identifying novel agentic risks through the lens of user safety: although safety and security are clearly separated for traditional LLMs and for agentic models in isolation, they become intertwined once agentic systems are viewed through that lens.

Building on this foundation, we define an operational agentic risk taxonomy that unifies traditional safety and security concerns with novel, uniquely agentic risks, including tool misuse, cascading action chains, and unintended control amplification, among others. At the core of our approach is a dynamic agentic safety and security framework that operationalizes contextual risk management by using auxiliary AI models and agents, with human oversight, to assist in contextual risk discovery, evaluation, and mitigation. We further address one of the most challenging aspects of agentic safety and security: risk discovery through sandboxed, AI-driven red teaming.

We demonstrate the framework's effectiveness through a detailed case study of NVIDIA's flagship agentic research assistant, AI-Q Research Assistant, showcasing practical, end-to-end safety and security evaluations in complex, enterprise-grade agentic workflows. The risk discovery phase surfaces novel agentic risks that are then contextually mitigated. We also release the dataset from our case study, containing traces of more than 10,000 realistic attack and defense executions of the agentic workflow, to help advance research in agentic safety.
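The sketch below illustrates how such a taxonomy and an auxiliary judge model might fit together in code. The risk names beyond those listed above and the triage_action helper are hypothetical, not the paper's implementation:

```python
from enum import Enum, auto
from typing import Callable, Set

class AgenticRisk(Enum):
    # Traditional LLM safety and security concerns
    PROMPT_INJECTION = auto()
    HARMFUL_CONTENT = auto()
    DATA_LEAKAGE = auto()
    # Uniquely agentic risks named in the paper
    TOOL_MISUSE = auto()
    CASCADING_ACTION_CHAIN = auto()
    CONTROL_AMPLIFICATION = auto()

def triage_action(action: str,
                  context: str,
                  judge: Callable[[str], str]) -> Set[AgenticRisk]:
    """Ask an auxiliary model which risks a proposed tool call raises in context."""
    prompt = (
        "You review actions proposed by an enterprise AI agent.\n"
        f"Proposed action: {action}\n"
        f"Context: {context}\n"
        "Name every applicable risk from this list: "
        + ", ".join(r.name for r in AgenticRisk)
    )
    reply = judge(prompt)  # the judge is assumed to answer in plain text
    return {r for r in AgenticRisk if r.name in reply.upper()}
```

In a deployment following the paper's human-oversight principle, any action that triages to a non-empty risk set would be held for human review or contextual mitigation rather than executed directly.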

“We have been impressed throughout our collaboration with Lakera”

Trusted by GenAI leaders to secure mission-critical applications.

“The team has extensive expertise and deep understanding of complex security challenges like prompt injection attacks and other AI security threats. We look forward to continuing to work together to address these.”

Read case study
Seraphina Goldfarb-Tarrant
Head of Safety at Cohere

Meet the Scientists Behind Lakera's AI Security Research

Our research team consists of experts in AI security, machine learning, and adversarial defense strategies. They work at the intersection of cutting-edge research and practical security applications, ensuring AI systems remain robust and resilient.

André Holzner
Staff Research Engineer

André Holzner earned a PhD in Physics from ETH Zurich. Before joining the team, he worked at CERN and contributed to projects in Meta’s Virtual Reality division.

Martin Engilberge
Senior Research Engineer

Martin Engilberge completed his PhD in Computer Science at Sorbonne University and later held a postdoctoral position at EPFL in Lausanne.

Julia Bazinska
Senior Research Engineer

Julia Bazinska graduated from ETH Zurich with a Master’s degree. Earlier in her career, she worked at organizations including DeepMind, Google, and IBM.

Niklas Pfister
Staff Research Scientist

Niklas Pfister received his PhD in Mathematics from ETH Zurich. He also serves as a professor at the University of Copenhagen.

Kyriacos Shiarlis
Staff Research Scientist

Kyriacos Shiarlis earned his PhD in Machine Learning from the University of Amsterdam and later completed a postdoctoral position at Oxford. Prior to joining the team, he worked at Waymo.

Mateo Rojas-Carulla
Head of Research

Mateo Rojas-Carulla holds a PhD in Machine Learning from the University of Cambridge and the Max Planck Institute. Before co-founding Lakera, he worked at Google and Meta Research, as well as Credit Suisse and Speechmatics, focusing on large language models.

Join Us in Securing the Future of AI

We invite researchers, developers, and security professionals to collaborate with us. Whether you’re interested in contributing to our projects, testing new defense strategies, or exploring novel AI security concepts, we welcome you to join us.

Contact Us