Report on the “DIRAC & Rucio Workshop”: powering distributed computing and data management for large-scale science

In recent particle and nuclear physics experiments and cosmic-ray observations that aim to discover new physics beyond the Standard Model, detectors have become larger, more granular, and higher-rate than ever before. As a result, the amount of data that must be handled continues to increase significantly. For an experiment to succeed today, it is essential to establish and efficiently operate a computing environment that processes data without interruption and continues to produce analysis results. Distributed computing environments are therefore used in a wide range of scientific fields, including the Belle II experiment at KEK, the ATLAS and LHCb experiments at the European Organization for Nuclear Research (CERN), and even gamma-ray astronomy at the Cherenkov Telescope Array Observatory (CTAO).

Here, a distributed computing environment refers to a system in which computing resources operated at sites scattered around the world are connected via high-speed networks and used as if they were a single gigantic computer on a global scale. “Grid computing” is one example of such an environment. However, the systems operated at each site are diverse, ranging from small-scale clusters to large-scale batch systems, cloud computing, and high-performance computing (HPC) systems. To make these different computing resources interoperable, the LHCb experiment developed an interware called DIRAC, which makes it possible to manage both jobs and data. In addition, a distributed data management tool called Rucio, which centrally manages and provides access to data distributed across multiple sites, was developed around the ATLAS experiment. The Belle II experiment employs both of these as basic tools.

From October 16th to 20th, the joint DIRAC and Rucio workshop was held at KEK, bringing together about 40 experts from the United States, Switzerland, France, the United Kingdom, Italy, Russia, Mexico, China, and Japan. This was the first time that the two workshops, which had previously been held independently, were held together. The aim was to realize a better distributed computing environment by sharing the technologies, experiences, and problems that each community has accumulated. The demand for distributed computing environments and distributed data management in various scientific fields is expected to grow even further in the future. This workshop served as a good opportunity for DIRAC and Rucio to go beyond the scope of particle and nuclear physics experiments and be applied in a wider range of scientific fields.