Authors: Jose Moreira (IBM TJ Watson Research Center), Antonino Tumeo (Pacific Northwest National Laboratory (PNNL)), John Feo (Pacific Northwest National Laboratory (PNNL)), Timothy Mattson (Intel Corporation)
Abstract: Government agencies, industry, and academia are demanding a new generation of tools to efficiently solve large-scale analytics problems in a variety of business, scientific, and national security applications. This BoF gathers the community developing high-performance frameworks and workflows for large-scale graph analytics to survey current approaches, identify new challenges and opportunities, and discuss interoperability of emerging infrastructures. A central goal is developing requirements and recommendations for future tools. As in previous editions, this BoF will compare and contrast conventional implementations with algebraic approaches, inviting the GraphBLAS community to discuss its state and evolution.
Long Description: Activity in graph analytics is growing rapidly in government, industry, and academia. Large-scale graph problems require ever-growing compute power and impose significant requirements on modern supercomputing architectures. The development of graph toolkits and libraries, their interoperability, and their composability with other analytic platforms are critical to many scientific, data, and security domains. This BOF, held previously at SC17, ’18, ’19, and ’21, has consistently attracted over 100 attendees (attendance was lower in ’21 as we supported a hybrid format) and dozens of requests to speak from acclaimed researchers and practitioners. The panel sessions have been lively and intense.
This BOF gathers the community developing high-performance frameworks and workflows for large-scale graph analytics to survey current approaches, identify new challenges and opportunities, and discuss interoperability of emerging infrastructures. A central goal is developing requirements and recommendations for future tools. In particular, we want to address the new and upcoming challenges in large-scale graph analytics applications: support for streaming graphs, the ability to deal with attributed graphs (which couple graphs with dense tables of attributes), the need to integrate graph methods within broader machine learning frameworks, and the need to better support irregular data structures and graph methods in scientific simulation frameworks. Current and future graph toolkits will have to evolve to handle these new requirements and domains.
The BOF will include the latest work of the GraphBLAS community, reporting on the performance and capabilities of extant implementations, the pros and cons of algebraic approaches, and the status of the new GraphBLAS C++ API. The GraphBLAS user community will present on key design patterns and requested features.
Our lineup of speakers will touch on key themes such as applications, use cases, programming models, application programming interfaces and libraries, data structures and algorithms, and integration of tools, including common data structures, data stores, and data frames. The discussions and panels will delve into the dynamic runtime technologies needed to make graph toolkits and/or sparse linear algebra approaches execute efficiently. While remaining vendor agnostic, we also expect to touch on architectural requirements and architectural support for such runtime technologies and workloads.
For the 2022 BOF, we expect to continue the discussion, initiated in previous editions, on representative benchmarks for irregular workloads and their interplay with scientific discovery and data science workflows. Defining requirements and recommendations for these representative benchmarks will naturally lead the community to identify how expertise in codesigning solutions for combinatorial and randomized graph algorithms could be translated into methodologies for supporting the emerging class of graph representation learning algorithms.
URL: https://hpc.pnl.gov/BOF/