| Time | Activity |
| --- | --- |
| 8:15 - 8:40 | Welcome Coffee |
| 8:40 - 8:45 | Welcome remarks by Bernhard Schölkopf |
| 8:45 - 9:10 | Yao Feng (Stanford University): On Building Intelligent Digital Humans<br>Bio: Yao Feng is a postdoctoral researcher at Stanford University, working with Karen Liu, Jennifer Hicks, and Scott L. Delp. She received her Ph.D. from ETH Zürich and the Max Planck Institute for Intelligent Systems, advised by Michael J. Black and Marc Pollefeys. Her research interests lie at the intersection of computer vision, graphics, robotics, and biomechanics, with a focus on building intelligent digital humans. Beyond academia, Yao spent a year as a Research Scientist at Meshcapade and continues to serve on its Advisory Board. She is a recipient of the Eurographics PhD Award (Honorable Mention) and has been recognized as both an EECS Rising Star and a WiGRAPH Rising Star in Computer Graphics.<br>Abstract: Humans integrate perception, reasoning, and action in the physical world, yet today’s AI remains fragmented: digital humans look realistic but lack behavioral competence, while powerful AI models are intelligent yet largely disembodied. In this talk, I present a unified framework for building intelligent digital humans across three pillars: Body, Mind, and Action. I first describe scalable methods to build the Body, supporting physical and expressive realism. I then introduce the Mind, featuring vision-language systems grounded in 3D behavioral context to infer intent and produce actionable goals. Finally, I show how these ingredients enable Action in interactive settings such as immersive XR, safe humanoid–human interaction, and healthcare applications. I conclude with a roadmap for a closed-loop ecosystem where interaction data continuously improves both intelligence and physicality, moving us toward a future where intelligent digital humans, whether in virtual worlds or embodied in robots, are a safe and natural part of our reality. |
| 9:20 - 9:45 | Kashyap Chitta (NVIDIA): Scaling Physical AI via World Modeling<br>Bio: Kashyap Chitta is a Postdoctoral Researcher in the NVIDIA Autonomous Vehicle Research Group, based in Tübingen, Germany. His research focuses on simulation-based training and evaluation of Physical AI systems. Kashyap completed his bachelor's degree in electronics at the RV College of Engineering, India. He then moved to the US in 2017 to obtain his Master's degree in computer vision from Carnegie Mellon University, where he was advised by Prof. Martial Hebert. During this time, he was also an intern in the NVIDIA Autonomous Vehicles Applied Research Group, working with Dr. Jose M. Alvarez. From 2019, he was a PhD student in the Autonomous Vision Group at the University of Tübingen, Germany, supervised by Prof. Andreas Geiger. He was selected for the doctoral consortium at ICCV 2023, named a 2023 RSS Pioneer, and recognized as an outstanding reviewer for CVPR, ICCV, ECCV, and NeurIPS. Since 2020, he has led teams that won awards in nine autonomous driving challenges.<br>Abstract: Physical AI refers to artificial intelligence that is integrated into a physical body, enabling it to perceive, reason about, and interact directly with the real world. While holding immense potential, current Physical AI systems remain limited by poor generalization in unstructured environments. This talk explores visual generative modeling as a solution, specifically focusing on three emerging "world model" paradigms: Video-Action models, Action-Video models, and Action-Video-Action models. We will review concrete examples of each methodology and discuss a roadmap for overcoming their remaining barriers, moving us towards generalizable intelligence and scalable real-world deployment. |
| 9:55 - 10:20 | Fabrizio Frasca (Technion – Israel Institute of Technology): Structure-Aware Learning: From Graph Neural Networks to Reliable Foundation Models<br>Bio: Fabrizio is an Aly Kaufman Postdoctoral Fellow at the Technion, hosted by Prof. Haggai Maron. His research develops symmetry-aware, expressive, and scalable learning architectures for structured domains, with recent applications to monitoring foundation models and detecting misbehaviours such as hallucinations. He received his PhD in Computing from Imperial College London under the supervision of Prof. Michael Bronstein. His doctoral work developed principled frameworks addressing fundamental representational limits of Graph Neural Networks, investigating the role of substructures and symmetries in the design of expressive architectures. This work appeared at venues including ICML, ICLR (spotlight), and NeurIPS (oral). Previously, he was a Machine Learning Researcher at Twitter Cortex, which he joined following the acquisition of the fake-news detection startup Fabula AI. His earlier work includes applications of machine learning in computational biology and biomedicine.<br>Abstract: As AI systems find increasingly broad application across scientific and societal domains, they encounter structured data featuring inherent symmetries and relational organisation – from molecules and social networks to the internal computations of foundation models themselves. Architectures not designed for such settings risk ignoring important invariances or relying on heuristic design that limits expressiveness. In this talk, I present a line of work developing principled Geometric Deep Learning methods for structured data, centred on three foundational principles: symmetry-awareness, expressiveness, and scalability. I begin with learning on graphs, highlighting fundamental representational limitations of message-passing neural networks and introducing principled, equivariant designs that provably increase expressiveness while remaining computationally efficient on real-world graphs. I then show how these principles extend to emerging data modalities: the computational traces of Large Language Models, including attention maps and layer activations. These traces capture the model's internal computations and are themselves structured objects. I propose structure-aware architectures that learn directly from them to detect failure modes such as hallucinations. This perspective yields expressive and scalable detectors that move beyond handcrafted heuristics. I conclude by outlining a broader research programme towards effective AI systems that operate reliably across diverse structured domains and data modalities. This encompasses monitoring foundation models via computational traces, systems bridging language modelling with relational learning, and structure-aware architectures for genomics and nucleic acids. |
| 10:30 - 11:15 | Coffee Break |
| 11:15 - 11:40 | Tiago Pimentel (ETH Zürich): Promises and Limitations of Causality for Machine Learning Interpretability<br>Bio: Tiago Pimentel is a Postdoctoral Researcher at ETH Zürich, working in machine learning interpretability and psycholinguistics. His long-term goal is to understand how humans and machines process language. To this end, his research adopts an interdisciplinary approach, leveraging information theory and causality to study the mechanisms behind model behaviour and human cognition.<br>Abstract: How can we move from observing what a model does to understanding why it does it? In this talk, I argue that causality is the key to uncovering the mechanisms underlying model predictions. First, I examine a “macro” view of model analysis, showing how econometric tools (such as regression discontinuity or difference-in-differences) can isolate the causal impact of specific design choices, such as tokeniser and training data selection, on models' outputs. Second, I turn to a “micro” view of mechanistic interpretability, focusing on causal abstraction as a method to verify whether a model implements a high-level algorithm. I demonstrate that this approach faces a critical limitation: without strict assumptions about how models encode information, the framework becomes vacuous, implying that any model implements any algorithm. This reveals that the ability to predictably intervene on a model is not, on its own, sufficient to guarantee we understand it. I conclude with a short discussion about how causality can be used to develop more principled interpretability methods. |
| 11:50 - 12:15 | Xinyue Shen (CISPA Helmholtz Center for Information Security): Securing AI Systems Against Real-World Misuse<br>Bio: Xinyue Shen is a PhD candidate at CISPA Helmholtz Center for Information Security. Her research interests lie in Trustworthy AI, with a focus on the security, safety, and responsibility of generative AI systems. She publishes at top venues such as IEEE S&P, USENIX Security, ACM CCS, ACL, EMNLP, and ICWSM. She has been named a KAUST Rising Star in AI 2025 and a Machine Learning and Systems Rising Star 2025, and is a recipient of the Best Machine Learning and Security Paper in Cybersecurity Award.<br>Abstract: AI systems like ChatGPT have advanced rapidly, yet their misuse has escalated in parallel. We still lack a systematic understanding of how AI systems are misused in the real world and why existing defenses repeatedly fail. This gap results in incomplete or misaligned safeguards, leaving individuals and society vulnerable. In this talk, I will share insights into the misuse of real-world AI systems: understanding user-driven misuse, proactively detecting and mitigating AI system misuse, and identifying emerging security risks in the broader AI ecosystem. |
| 12:25 - 12:50 | Nouha Dziri (Allen Institute for AI): TBD<br>Bio: TBD<br>Abstract: TBD |
| 13:00 - 14:30 | Lunch |