Lossy Compression for Scientific Data
Description
Large-scale numerical simulations, observations, experiments, and AI computations generate or consume very large datasets that are difficult to analyze, store, and transfer. Data compression is an attractive and efficient technique to significantly reduce the size of scientific datasets. This tutorial reviews the state of the art in lossy compression of scientific datasets; covers the main compression techniques (e.g., decomposition, transforms, prediction, sampling, precision reduction); and discusses in detail lossy compressors (SZ, ZFP, TThresh, LibPressio), compression error assessment metrics, and the Z-checker tool for analyzing compression error. The tutorial addresses the following questions: Why lossless and lossy compression? How does compression work? How can compression error be measured and controlled? What are the current use cases in simulations, experiments, and AI computations? The tutorial uses real-world scientific datasets to illustrate the different compression techniques and their performance. From a participant's perspective, the tutorial details how to use compression software both as standalone executables and as modules integrated into parallel I/O libraries (ADIOS, HDF5). This half-day tutorial, given by two of the leading teams in this domain and targeting primarily beginners interested in learning about lossy compression for scientific data, builds on the highly rated tutorials given at ISC17-21 and SC17-21.
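To give a flavor of the prediction and precision-reduction techniques mentioned above, here is a minimal, hypothetical Python sketch of an error-bounded lossy compressor (illustrative only; this is not SZ's or ZFP's actual algorithm, and the `compress`/`decompress` names are invented for this example). It predicts each value from the previously reconstructed one and uniformly quantizes the residual so that a user-chosen pointwise absolute error bound is guaranteed:

```python
import numpy as np

def compress(data, abs_err):
    """Quantize prediction residuals so reconstruction error <= abs_err."""
    codes = np.empty(len(data), dtype=np.int64)
    prev = 0.0  # predictor state: the last *reconstructed* value
    for i, x in enumerate(data):
        residual = x - prev                       # previous-value prediction
        q = int(round(residual / (2 * abs_err)))  # uniform quantization bin
        codes[i] = q
        prev = prev + q * 2 * abs_err             # decoder-visible reconstruction
    return codes  # small integers; a real compressor would entropy-code these

def decompress(codes, abs_err):
    recon = np.empty(len(codes), dtype=np.float64)
    prev = 0.0
    for i, q in enumerate(codes):
        prev = prev + q * 2 * abs_err
        recon[i] = prev
    return recon

# Smooth synthetic signal standing in for scientific data
data = np.sin(np.linspace(0, 4 * np.pi, 1000))
codes = compress(data, abs_err=1e-3)
recon = decompress(codes, abs_err=1e-3)
assert np.max(np.abs(data - recon)) <= 1e-3  # pointwise error bound holds
```

Because the predictor tracks the reconstructed (not original) values, quantization error cannot accumulate, which is why the absolute error bound holds at every point; on smooth data most quantization codes are small integers that entropy coding can shrink dramatically.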
Event Type: Tutorial
Time: Monday, 14 November 2022, 8:30am - 12pm CST
Registration Categories
Big Data
Computational Science
Data Analytics
Data Management
File Systems and I/O