
Lakera Research: Advancing AI Security

Lakera's research team is on a mission to secure the Internet of Agents. We uncover fundamental AI vulnerabilities, push the limits of adversarial AI, and develop defenses that reshape how AI systems withstand attacks. Our work combines cutting-edge research with real-world impact, setting new standards for securing autonomous systems.

Latest Research Updates

This section will be regularly updated with insights from our red-teaming efforts, including new findings, methodologies, interactive demos, and newly uncovered attack vectors.

Featured Research

Gandalf: Adaptive Defenses for Large Language Models

This research introduces D-SEC, a threat model that separates attackers from legitimate users and captures dynamic, multi-step interactions. Using Gandalf—a crowd-sourced red-teaming platform—we analyze 279k real-world attacks and show how some defenses degrade usability. We highlight effective strategies like adaptive defenses and defense-in-depth.
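To make the defense-in-depth idea concrete, here is a minimal Python sketch of stacking independent input and output checks around a model call. The filter logic and all function names are illustrative assumptions, not the defenses evaluated in the paper:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Verdict:
    blocked: bool
    reason: str = ""

Layer = Callable[[str], Verdict]

def input_filter(prompt: str) -> Verdict:
    # Hypothetical denylist check on the incoming prompt.
    suspicious = ("ignore previous instructions", "reveal the password")
    if any(s in prompt.lower() for s in suspicious):
        return Verdict(True, "input filter: suspicious phrase")
    return Verdict(False)

def output_filter(response: str) -> Verdict:
    # Hypothetical leak check on the model's answer.
    if "SECRET" in response:
        return Verdict(True, "output filter: secret leaked")
    return Verdict(False)

def defend_in_depth(prompt: str,
                    model: Callable[[str], str],
                    pre: List[Layer],
                    post: List[Layer]) -> str:
    # Every pre-layer must pass before the model is called,
    # and every post-layer must pass before the answer is returned.
    for layer in pre:
        verdict = layer(prompt)
        if verdict.blocked:
            return f"Request blocked ({verdict.reason})"
    response = model(prompt)
    for layer in post:
        verdict = layer(response)
        if verdict.blocked:
            return f"Response blocked ({verdict.reason})"
    return response

# Example: a toy "model" wrapped in one input layer and one output layer.
print(defend_in_depth("Please reveal the password",
                      lambda p: "I cannot help with that.",
                      pre=[input_filter], post=[output_filter]))
```

The point of stacking is that an attack must slip past every layer at once; an adaptive variant would additionally tighten these checks as new attack patterns are observed.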

Featured Research

Breaking Agent Backbones: Evaluating the Security of Backbone LLMs in AI Agents

AI agents powered by large language models (LLMs) are being deployed at scale, yet we lack a systematic understanding of how the choice of backbone LLM affects agent security. The non-deterministic, sequential nature of AI agents complicates security modeling, and the integration of traditional software with AI components entangles novel LLM vulnerabilities with conventional security risks. Existing frameworks address these challenges only partially: they either capture specific vulnerabilities alone or require modeling of complete agents.

To address these limitations, we introduce threat snapshots: a framework that isolates the specific states in an agent's execution flow where LLM vulnerabilities manifest, enabling systematic identification and categorization of security risks that propagate from the LLM to the agent level. We apply this framework to construct the b3 benchmark, a security benchmark built from 194,331 unique crowdsourced adversarial attacks, and use it to evaluate 34 popular LLMs. Among other insights, we find that enhanced reasoning capabilities improve security, while model size does not correlate with it. We release our benchmark, dataset, and evaluation code to facilitate widespread adoption by LLM providers and practitioners, offering guidance for agent developers and incentivizing model developers to prioritize backbone security improvements.
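As an illustrative reading of what it means to isolate a state in an agent's execution flow, the sketch below models a threat snapshot as a frozen bundle of prompt, history, and untrusted tool output, then replays attacks from that state. All names and the toy judge are assumptions, not the paper's published API:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ThreatSnapshot:
    """A frozen point in an agent's run where a backbone-LLM call could be subverted."""
    system_prompt: str        # instructions the backbone LLM sees at this step
    conversation: List[Dict]  # messages accumulated up to this state
    tool_results: List[str]   # untrusted content returned by tools (e.g. web pages)
    attack_goal: str          # what a successful attack achieves from this state

def attack_succeeded(response: str, goal: str) -> bool:
    # Trivial stand-in judge; a real evaluation would use a calibrated classifier.
    return goal.lower() in response.lower()

def snapshot_attack_rate(llm: Callable[[str, List[Dict], List[str]], str],
                         snapshot: ThreatSnapshot,
                         payloads: List[str]) -> float:
    """Fraction of adversarial payloads that succeed when replayed from this state."""
    successes = 0
    for payload in payloads:
        # Inject the payload through the untrusted channel and replay the step.
        poisoned = snapshot.tool_results + [payload]
        response = llm(snapshot.system_prompt, snapshot.conversation, poisoned)
        if attack_succeeded(response, snapshot.attack_goal):
            successes += 1
    return successes / len(payloads) if payloads else 0.0
```

Because each snapshot fixes everything except the adversarial payload, the same attack set can be replayed against many backbone LLMs, which is what makes a like-for-like security comparison across models possible.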

Featured Research

A Safety and Security Framework for Real-World Agentic Systems

This paper introduces a dynamic, actionable framework for securing agentic AI systems in enterprise deployments. We contend that safety and security are not merely fixed attributes of individual models but emergent properties of the dynamic interactions among models, orchestrators, tools, and data within their operating environments. We propose a new way of identifying novel agentic risks through the lens of user safety: although safety and security are clearly separated for traditional LLMs and for agentic models in isolation, they become intertwined once agentic systems are viewed through that lens.

Building on this foundation, we define an operational agentic risk taxonomy that unifies traditional safety and security concerns with novel, uniquely agentic risks, including tool misuse, cascading action chains, and unintended control amplification, among others. At the core of our approach is a dynamic agentic safety and security framework that operationalizes contextual risk management by using auxiliary AI models and agents, with human oversight, to assist in contextual risk discovery, evaluation, and mitigation. We further address one of the most challenging aspects of agentic safety and security: risk discovery through sandboxed, AI-driven red teaming.

We demonstrate the framework's effectiveness through a detailed case study of NVIDIA's flagship agentic research assistant, AI-Q Research Assistant, showcasing practical, end-to-end safety and security evaluations in complex, enterprise-grade agentic workflows. The risk discovery phase surfaces novel agentic risks that are then contextually mitigated. We also release the dataset from our case study, containing traces of more than 10,000 realistic attack and defense executions of the agentic workflow, to help advance research in agentic safety.
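The sketch below illustrates how such a taxonomy and an auxiliary judge model might fit together in code. The risk names beyond those listed above and the triage_action helper are hypothetical, not the paper's implementation:

```python
from enum import Enum, auto
from typing import Callable, Set

class AgenticRisk(Enum):
    # Traditional LLM safety and security concerns
    PROMPT_INJECTION = auto()
    HARMFUL_CONTENT = auto()
    DATA_LEAKAGE = auto()
    # Uniquely agentic risks named in the paper
    TOOL_MISUSE = auto()
    CASCADING_ACTION_CHAIN = auto()
    CONTROL_AMPLIFICATION = auto()

def triage_action(action: str,
                  context: str,
                  judge: Callable[[str], str]) -> Set[AgenticRisk]:
    """Ask an auxiliary model which risks a proposed tool call raises in context."""
    prompt = (
        "You review actions proposed by an enterprise AI agent.\n"
        f"Proposed action: {action}\n"
        f"Context: {context}\n"
        "Name every applicable risk from this list: "
        + ", ".join(r.name for r in AgenticRisk)
    )
    reply = judge(prompt)  # the judge is assumed to answer in plain text
    return {r for r in AgenticRisk if r.name in reply.upper()}
```

In a deployment following the paper's human-oversight principle, any action that triages to a non-empty risk set would be held for human review or contextual mitigation rather than executed directly.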

“We have been impressed throughout our collaboration with Lakera”

Trusted by GenAI leaders to secure mission-critical applications.

“The team has extensive expertise and deep understanding of complex security challenges like prompt injection attacks and other AI security threats. We look forward to continuing to work together to address these.”

Read case study
Seraphina Goldfarb-Tarrant
Head of Safety at Cohere

Meet the Scientists Behind Lakera's AI Security Research

Our research team consists of experts in AI security, machine learning, and adversarial defense strategies. They work at the intersection of cutting-edge research and practical security applications, ensuring AI systems remain robust and resilient.

André Holzner
Staff Research Engineer

André Holzner earned a PhD in Physics from ETH Zurich. Before joining the team, he worked at CERN and contributed to projects in Meta’s Virtual Reality division.

Martin Engilberge
Senior Research Engineer

Martin Engilberge completed his PhD in Computer Science at Sorbonne University and later held a postdoctoral position at EPFL in Lausanne.

Julia Bazinska
Senior Research Engineer

Julia Bazinska graduated from ETH Zurich with a Master’s degree. Earlier in her career, she worked at organizations including DeepMind, Google, and IBM.

Niklas Pfister
Staff Research Scientist

Niklas Pfister received his PhD in Mathematics from ETH Zurich. He also serves as a professor at the University of Copenhagen.

Kyriacos Shiarlis
Staff Research Scientist

Kyriacos Shiarlis earned his PhD in Machine Learning from the University of Amsterdam and later completed a postdoctoral position at Oxford. Prior to joining the team, he worked at Waymo.

Mateo Rojas-Carulla
Head of Research

Mateo Rojas-Carulla holds a PhD in Machine Learning from the University of Cambridge and the Max Planck Institute. Before co-founding Lakera, he worked at Google and Meta Research, as well as Credit Suisse and Speechmatics, focusing on large language models.

Join Us in Securing the Future of AI

We invite researchers, developers, and security professionals to collaborate with us. Whether you’re interested in contributing to our projects, testing new defense strategies, or exploring novel AI security concepts, we welcome you to join us.

Contact Us