10 February 2026

ELLIS Institute PIs supported by UK government to research AI safety


We are proud to announce that Principal Investigators Sahar Abdelnabi, Maksym Andriushchenko, and Jonas Geiping have been awarded funding by the UK’s Department for Science, Innovation and Technology to research AI safety and security over a period of two years. The grant will fully support five ELLIS researchers to work full time on the Alignment Project, an initiative established by the UK’s AI Security Institute.

The ELLIS Institute team will research AI models’ test awareness, meaning their ability to detect when they are being tested, and thereby directly contribute to the Alignment Project’s main priorities: How can AI systems be prevented from endangering collective security, even if they attempt to do so? And how can AI systems be designed so that they never act in such a way in the first place?

One focus of the research group will be AI models that behave differently during evaluation than during deployment. Such cases undermine the ability to reliably assess whether safety properties will hold in real-world situations. The project will provide the community with conceptual frameworks, measurement tools, datasets, and concrete intervention techniques to mitigate the current and future risks posed by such models.

The results will be open source, offering benchmarks, probing methods, model-organism training recipes, and steering codebases that will help the research community better understand and reduce test-awareness concerns. In turn, the project will strengthen the community’s ability to develop AI systems that remain aligned and controllable across a variety of situations after deployment.

The research will advance multiple research areas prioritized by The Alignment Project:

  • Stress-Testing and Preventing Gaming Behaviors
  • Model Organisms for Safety Research
  • Understanding Training Dynamics
  • Accessing Internal Mechanisms
  • Making Alignment Challenges Measurable
  • Critical Domain Application: AI Research & Development Safety

These research areas can help ensure that AI systems do not game evaluations, strategically underperform during testing, or exploit reward signals. The techniques developed by the group will offer interventions to counteract an AI model’s test awareness and provide a reliable way to stress-test models.

The Alignment Project is a collaboration of government, industry, and philanthropic funders that strengthens the community and provides funding for AI research. It recognizes that AI will likely play a large role in aligning future AI systems, making it a priority to understand where the associated risks could undermine alignment research.

Sahar joined the ELLIS Institute Tübingen in October 2025 and is leading the Cooperative Machine Intelligence for People-Aligned Safe Systems (COMPASS) research group. Her group is working on developing safe, aligned, and steerable AI agents with emphasis on security, human aspects, and cooperative multi-agent systems. 

Jonas joined the ELLIS Institute in October 2023. He leads a large team working on safety- and efficiency-aligned learning. His group investigates the feasibility of technical solutions to safety and security in machine learning.

Maksym joined the Institute in September 2025 and established the AI Safety and Alignment research group. His research focuses on the safety and alignment of autonomous LLM agents, which are becoming increasingly capable and pose a variety of emerging risks. His group is investigating rigorous AI evaluations, which are key for informing the public about the risks and capabilities of frontier AI models.

Sahar, Jonas, and Maksym are all co-affiliated with the Max Planck Institute for Intelligent Systems and the Tübingen AI Center as Independent Research Group Leaders.

The UK’s Department for Science, Innovation and Technology drives innovation with the goal of improving lives and sustaining economic growth. One of its endeavors is the AI Security Institute, which brought The Alignment Project to life. Its programs support talent as well as the physical and digital infrastructure that underpins the economy, security, and public services.

Find out more about Sahar’s, Jonas’s, and Maksym’s research.

Find out more about the AI Security Institute.

Find out more about the UK’s Department for Science, Innovation and Technology.