Matthieu Dorier

Software Development Specialist

TechTrans International, for Argonne National Laboratory

mdorier@anl.gov

Research interests

My research interests revolve around data movement in high-performance computing (HPC), more specifically I/O, storage, in situ analysis and visualization, and communication algorithms. In the past, I have also worked on cloud storage, in particular in the context of MapReduce applications.

Input/Output for HPC simulations

In the past few years, input/output (I/O) has become a major bottleneck in large-scale HPC applications. The increasing gap between computation performance, soon reaching exascale, and the performance of storage systems puts more and more pressure on parallel file systems. This pressure leads to increased run time, variability, and energy consumption. In this research direction, I propose new approaches to I/O, in particular based on the use of dedicated cores or dedicated nodes, as sketched below. I developed Damaris, a middleware for efficient I/O in HPC simulations, which was at the core of my PhD thesis. It is now supported by an Inria ADT in the KerData team at Inria Rennes.
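To make the dedicated-core idea concrete, here is a minimal sketch using mpi4py: one rank per node is set aside to absorb the output of its neighbors, so computation and I/O overlap. The node layout (four ranks per node) and the message protocol are illustrative assumptions, not Damaris's actual design or API.

    # Minimal sketch of the dedicated-core I/O pattern (illustrative,
    # not Damaris's actual API). Assumes the world size is a multiple
    # of RANKS_PER_NODE; run e.g. with `mpirun -n 8 python this.py`.
    from mpi4py import MPI
    import numpy as np

    RANKS_PER_NODE = 4                      # assumed node layout
    world = MPI.COMM_WORLD
    is_io_rank = world.rank % RANKS_PER_NODE == 0

    # Subcommunicator grouping I/O ranks together and compute ranks
    # together, so each role can later run its own collectives.
    sub = world.Split(color=1 if is_io_rank else 0, key=world.rank)

    if is_io_rank:
        # Dedicated core: drain the output of the node's compute ranks
        # and write it out, overlapping with their next iteration.
        for step in range(10):
            for offset in range(1, RANKS_PER_NODE):
                data = world.recv(source=world.rank + offset, tag=step)
                # ... write `data` to the parallel file system here ...
    else:
        io_rank = (world.rank // RANKS_PER_NODE) * RANKS_PER_NODE
        for step in range(10):
            data = np.random.rand(1024)     # stand-in for one iteration's output
            world.send(data, dest=io_rank, tag=step)
            # ... compute the next iteration while the I/O core writes ...

The point of the split is that the compute ranks return to work immediately after a cheap local send, while the dedicated core hides the cost and variability of the actual write behind their computation.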

In situ analysis and visualization

In situ analysis and visualization is a new paradigm that couples running simulations with visualization and analysis tools. This gives direct access to the simulation's data while it is being produced, without the need to transit through a parallel file system. My work in this context led me, in particular, to enable in situ visualization in the Damaris middleware through a direct connection with the VisIt parallel visualization software. Since 2015, Damaris has been available within the VisIt package as an interface to simulations.
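The coupling pattern itself is simple and can be illustrated with a toy sketch: the analysis routine is invoked on in-memory data at every iteration, so nothing is written to storage. The function names here are hypothetical stand-ins for a real coupling such as the one Damaris establishes with VisIt.

    # Toy sketch of in situ analysis: the analysis routine runs on live,
    # in-memory simulation data each iteration, so nothing transits
    # through the file system. Function names here are hypothetical; a
    # real coupling would hand the data to a tool such as VisIt instead.
    import numpy as np

    def analyze(step, field):
        # In situ analysis step: operates on the array in place, no I/O.
        print(f"step {step}: mean = {field.mean():.3f}, max = {field.max():.3f}")

    def simulate(steps=5, n=1024):
        field = np.zeros(n)
        for step in range(steps):
            field += np.random.rand(n)      # stand-in for one solver iteration
            analyze(step, field)            # direct access to the live data

    simulate()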

Modeling and prediction of I/O activity

Parallel storage systems consist of many components (disks, servers, networks, etc.) that have to make decisions when receiving requests from HPC applications. These decisions can be optimized by building a model of the application's behavior and predicting its future requests. To this end, I proposed the Omnisc'IO approach, which uses formal grammars to model and predict the I/O behavior of HPC applications as they run.
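As a much-simplified illustration of the idea, the sketch below predicts the next I/O operation from a plain order-2 context table. Omnisc'IO itself builds a formal grammar over the observed stream, which captures nested repetitions that a fixed-order table cannot; the symbols and trace here are made up for the example.

    # Much-simplified sketch of I/O behavior prediction: a plain order-2
    # context table stands in for Omnisc'IO's grammar, just to convey
    # the predict-as-you-observe idea.
    from collections import defaultdict, Counter

    class IOPredictor:
        def __init__(self, order=2):
            self.order = order
            self.table = defaultdict(Counter)  # context -> next-symbol counts
            self.history = []

        def observe(self, symbol):
            # Feed one observed I/O operation (e.g., 'write').
            ctx = tuple(self.history[-self.order:])
            if len(ctx) == self.order:
                self.table[ctx][symbol] += 1
            self.history.append(symbol)

        def predict(self):
            # Guess the next operation from the current context.
            ctx = tuple(self.history[-self.order:])
            counts = self.table.get(ctx)
            return counts.most_common(1)[0][0] if counts else None

    # A periodic checkpointing pattern is learned after one period:
    trace = ["compute", "open", "write", "write", "close"] * 3
    p = IOPredictor()
    for op in trace:
        p.observe(op)
    print("next predicted op:", p.predict())  # -> 'compute' (next period starts)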

Collective algorithms in novel network topologies

Post-petascale machines feature new network topologies such as Dragonfly. In this context, it becomes necessary to redesign collective algorithms such as broadcast, allgather, or reduce, in order to make the most efficient use of the topology. My work led me to propose new collective algorithms specifically designed for the Dragonfly network topology and to evaluate them using event-driven simulations.
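The flavor of such topology-aware designs can be sketched as follows: in a Dragonfly-like machine, groups of routers are linked all-to-all by a small number of expensive global links, so a broadcast should cross each inter-group link only once and then fan out locally. This two-phase schedule is an illustrative assumption, not one of the published algorithms.

    # Hedged sketch of a topology-aware broadcast on a Dragonfly-like
    # layout: phase 1 crosses each inter-group link once, phase 2 fans
    # out inside every group with a binomial tree.

    def binomial_tree_sends(members, root):
        # Send schedule of a binomial-tree broadcast within one group.
        order = [root] + [m for m in members if m != root]
        sends, have = [], 1
        while have < len(order):
            for i in range(min(have, len(order) - have)):
                sends.append((order[i], order[have + i]))
            have *= 2
        return sends

    def dragonfly_bcast_schedule(groups, root):
        # groups: list of lists of ranks; returns ordered (src, dst) sends.
        root_group = next(g for g in groups if root in g)
        remote = [g for g in groups if g is not root_group]
        schedule = []
        # Phase 1: root sends to one leader per remote group (global links).
        leaders = [g[0] for g in remote]
        schedule += [(root, leader) for leader in leaders]
        # Phase 2: local binomial trees inside every group (local links).
        schedule += binomial_tree_sends(root_group, root)
        for g, leader in zip(remote, leaders):
            schedule += binomial_tree_sends(g, leader)
        return schedule

    groups = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
    for src, dst in dragonfly_bcast_schedule(groups, root=0):
        print(f"{src} -> {dst}")

Evaluating such schedules on real machines is expensive, which is why event-driven simulation is used to compare candidate algorithms under different topologies and message sizes.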

Collaborations

  • Joint Laboratory for Extreme-Scale Computing (JLESC)
    (2009 - Present)
    Inria, ANL, UIUC, JSC, BSC, RIKEN-AICS


The Joint Laboratory for Extreme-Scale Computing (JLESC) is an international, virtual organization whose goal is to enhance the ability of member organizations and investigators to bridge the gap between petascale and extreme-scale computing.

  • Data@Exascale associated team
    (2013 - Present)
    KerData (Inria Rennes, IRISA), ANL, UIUC


The team addresses large-scale data management for post-petascale supercomputers and for clouds. We aim to investigate several open issues related to storage and I/O in HPC, as well as in situ data visualization and analysis of large-scale simulations.

  • A Software Defined Storage Approach to Exascale Storage Services
    (2015 - Present)
    ANL, LANL, CMU, HDF Group


A DOE-funded project designing efficient building blocks for data services in HPC systems. Our work revolves around the design of tools such as the Mercury RPC library, DeltaFS, Argobots, Margo, etc.

  • Compute on Data Path
    (2015 - 2017)
    Texas Tech, Northwestern University, University of Houston, Oakland University, NSF


This project combats the increasingly critical data-movement challenge in high-performance computing. It studies the feasibility of a new Compute on Data Path methodology that is expected to improve the performance and energy efficiency of HPC systems.