Publications

You can also find my articles on my Google Scholar profile.

Journal Articles

Regret Bounds for Satisficing in Multi-Armed Bandit Problems

Published in Transactions on Machine Learning Research, 2023

This paper explores the concept of satisficing in multi-armed bandit problems, where we aim to find solutions that exceed a satisfaction threshold rather than seeking optimal outcomes.

Recommended citation: Michel, T., Hajiabolhassan, H., & Ortner, R. (2023). "Regret Bounds for Satisficing in Multi-Armed Bandit Problems." Transactions on Machine Learning Research.

Conference Papers

DP-SPRT: Differentially Private Sequential Probability Ratio Tests

Published in International Conference on Artificial Intelligence and Statistics (AISTATS), 2026

We revisit Wald’s celebrated Sequential Probability Ratio Test for sequential tests of two simple hypotheses, under privacy constraints. We propose DP-SPRT, a wrapper that can be calibrated to achieve desired error probabilities and privacy constraints, addressing a gap in previous work. DP-SPRT relies on a private mechanism that processes a sequence of queries and stops after privately determining when the query results fall outside a predefined interval. This OutsideInterval mechanism improves upon naive composition of existing techniques like AboveThreshold, achieving a factor-of-2 privacy improvement and thus potentially benefiting other continual monitoring procedures. We prove generic upper bounds on the error and sample complexity of DP-SPRT that can accommodate various noise distributions based on the practitioner’s privacy needs. We exemplify them in two settings: Laplace noise (pure differential privacy) and Gaussian noise (Rényi differential privacy). In the former setting, by providing a lower bound on the sample complexity of any $\varepsilon$-DP test with prescribed type I and type II errors, we show that DP-SPRT is near optimal when both errors are small and the two hypotheses are close. Moreover, we conduct an experimental study revealing its good practical performance.

Recommended citation: Michel, T., Basu, D., & Kaufmann, E. (2026). "DP-SPRT: Differentially Private Sequential Probability Ratio Tests." The 29th International Conference on Artificial Intelligence and Statistics (AISTATS). Spotlight presentation.

Exploring Flow-Lenia Universes with a Curiosity-driven AI Scientist: Discovering Diverse Ecosystem Dynamics

Published in Artificial Life Conference Proceedings 37 (ALIFE), 2025

We present a method for the automated discovery of system-level dynamics in Flow-Lenia—a continuous cellular automaton (CA) with mass conservation and parameter localization—using a curiosity-driven AI scientist. This method aims to uncover processes leading to self-organization of evolutionary and ecosystemic dynamics in CAs. We build on previous work which uses diversity search algorithms in Lenia to find self-organized individual patterns, and extend it to large environments that support distinct interacting patterns. We adapt Intrinsically Motivated Goal Exploration Processes (IMGEPs) to drive exploration of diverse Flow-Lenia environments using simulation-wide metrics, such as evolutionary activity, compression-based complexity, and multi-scale entropy. We test our method in two experiments, showcasing its ability to illuminate significantly more diverse dynamics compared to random search. We show qualitative results illustrating how ecosystemic simulations enable self-organization of complex collective behaviors not captured by previous individual pattern search and analysis. We complement automated discovery with an interactive exploration tool, creating an effective human-AI collaborative workflow for scientific investigation. Though demonstrated specifically with Flow-Lenia, this methodology provides a framework potentially applicable to other parameterizable complex systems where understanding emergent collective properties is of interest.

Recommended citation: Michel, T., Cvjetko, M., Hamon, G., Oudeyer, P.-Y., & Moulin-Frier, C. (2025). "Exploring Flow-Lenia Universes with a Curiosity-driven AI Scientist: Discovering Diverse Ecosystem Dynamics." Artificial Life Conference Proceedings 37, 2025(1), 68. MIT Press.

Preprints

Sequential Membership Inference Attacks

arXiv preprint, 2026

Modern AI models are not static. They go through multiple updates in their lifecycles. We propose to design Sequential Membership Inference (SeMI) attacks leading to tighter privacy audits by exploiting the sequence of models and injecting a target canary at a controlled insertion time. First, for empirical mean computation, we develop SeMI*, an optimal SeMI attack to identify the presence of a target inserted at a specific insertion step. We derive the power of SeMI* to show that accessing the model sequence yields more powerful MI attacks than scrutinising only the final model. SeMI* exhibits an isolation property: its power depends on the statistics obtained right before and after insertion of the target. Leveraging this insight, we develop practical white-box (accessing model gradients) and black-box (accessing loss) SeMI attacks against models trained with (DP-)SGD. Across datasets and models trained with (DP-)SGD, our experiments show that SeMI attacks achieve higher powers than snapshot-independent baselines, and yield tighter privacy audits thanks to (a) control over the insertion time and (b) observations across the model sequence.

Recommended citation: Michel, T., Basu, D., & Kaufmann, E. (2026). "Sequential Membership Inference Attacks." arXiv preprint arXiv:2602.16596.

Octax: Accelerated CHIP-8 Arcade Environments for Reinforcement Learning in JAX

arXiv preprint, 2025

Reinforcement learning (RL) research requires diverse, challenging environments that are both tractable and scalable. While modern video games may offer rich dynamics, they are computationally expensive and poorly suited for large-scale experimentation due to their CPU-bound execution. We introduce Octax, a high-performance suite of classic arcade game environments implemented in JAX, based on CHIP-8 emulation, a predecessor to Atari, which is widely adopted as a benchmark in RL research. Octax provides the JAX community with a long-awaited end-to-end GPU alternative to the Atari benchmark, offering image-based environments, spanning puzzle, action, and strategy genres, all executable at massive scale on modern GPUs. Our JAX-based implementation achieves orders-of-magnitude speedups over traditional CPU emulators while maintaining perfect fidelity to the original game mechanics. We demonstrate Octax’s capabilities by training RL agents across multiple games, showing significant improvements in training speed and scalability compared to existing solutions. The environment’s modular design enables researchers to easily extend the suite with new games or generate novel environments using large language models, making it an ideal platform for large-scale RL experimentation.

Recommended citation: Radji, W., Michel, T., & Piteau, H. (2025). "Octax: Accelerated CHIP-8 Arcade Environments for Reinforcement Learning in JAX." arXiv preprint arXiv:2510.01764.

Thomas Michel