Research topics

I am interested in developing tools and methods for debugging and optimizing HPC applications.

I have developed PARCOACH, a combined static/dynamic tool that detects errors in the use of collective communications in parallel applications. The static analysis identifies the reduced set of collective calls that may lead to a deadlock and issues warnings. Based on this analysis, the code is selectively instrumented: at run time, if the schedule leads to a deadlock, an error is displayed and all processes are interrupted synchronously.

PARCOACH is implemented as an LLVM pass and is still under development.
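
To illustrate the kind of error targeted by the dynamic check described above, here is a minimal sketch in Python. It is a toy model and not PARCOACH's implementation: the function name `first_collective_mismatch` and the per-rank traces are hypothetical, and the sketch assumes that each process's sequence of collective calls is already known, whereas a real tool has to perform the comparison while the program runs.

```python
from itertools import zip_longest


def first_collective_mismatch(traces):
    """Return (step, calls) for the first step where ranks disagree, else None.

    traces: one list of collective names per rank, in call order.
    A rank that has no call left at a given step is represented by None.
    """
    for step, calls in enumerate(zip_longest(*traces, fillvalue=None)):
        # All ranks must reach the same collective at the same step.
        if len(set(calls)) > 1:
            return step, calls
    return None


# Hypothetical program: rank 0 takes a branch that skips the barrier.
traces = [
    ["MPI_Bcast", "MPI_Reduce"],                 # rank 0
    ["MPI_Bcast", "MPI_Barrier", "MPI_Reduce"],  # rank 1
]

mismatch = first_collective_mismatch(traces)
if mismatch is not None:
    step, calls = mismatch
    print(f"collective mismatch at step {step}: {calls}")
```

In this toy trace the two ranks agree on the first collective and diverge on the second, which is the situation where a real MPI program could hang.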

Grants

  • MICROCARD European project, H2020 - Participant in WP2

  • Plan de relance with Eviden

    • Verification of MPI-RMA applications (2022-2023) - PI, 72k€

    • Methods for failure analysis based on statistical learning techniques (2022-2024) - co-PI, 47k€

  • Inria's Exploratory Action LLM4DiCE, Large Language Models for Detection and Correction of Errors (2024-2027) - scientific leader

Former projects

  • COHPC: COrrectness and performance of HPC applications, Inria associate team (2019-2022) - French PI, 10k€/year

  • Exacard, ANR 2018 - Participant

  • HAC SPECIS, Inria project lab on High-performance Application and Computers: Studying PErformance and Correctness In Simulation (2016-2020) - Participant

Research Software

  • PARallel COntrol flow Anomaly CHecker (PARCOACH) helps developers debug parallel and distributed applications. Read more about PARCOACH

  • The MPI Bugs Initiative (MBI) is a framework that generates an MPI correctness benchmark suite and evaluates MPI verification tools. GitLab repository

Open Internship Positions (contact me if you are interested)

- None