Workshop: The 17th Workshop on Workflows in Support of Large-Scale Science (WORKS22)
Authors: Nicholas Tyler (National Energy Research Scientific Computing Center (NERSC)); Robert Knop, Deborah Bard, and Peter Nugent (Lawrence Berkeley National Laboratory (LBNL))
Abstract: Experimental and observational science pipelines are increasingly turning to supercomputing resources to handle their large-scale data analysis. Many of these pipelines serve experiments that run 24/7, and during outages they must either shut down or find alternatives for their real-time data analysis. Workflows from experimental and observational facilities are usually architected with a specific network and computing facility in mind, and are very difficult to move between compute resources. What's more, the assumptions built into the architecture of most high-performance computing (HPC) centers make moving workflows to new locations more complicated. By carefully targeting well-understood cosmology and genomics pipelines, we have researched the capabilities needed to run these workflows at multiple computing sites. In this process, we have identified several of the pain points and key future research topics for automated workflow migration, and have made substantial progress towards a future where fully automated workflows can run across the DOE complex.