Compression for Scientific Data
Event Type: Big Data Analytics
Time: Sunday, June 16th, 9am - 1pm
Location: Matterhorn 1
Description: Large-scale numerical simulations, observations, and experiments generate very large datasets that are difficult to analyze, store, and transfer. Data compression is an attractive and efficient technique for significantly reducing the size of scientific datasets. This tutorial reviews the state of the art in lossy compression of scientific datasets, discusses two lossy compressors (SZ and ZFP) in detail, and introduces compression error assessment metrics and the Z-checker tool for analyzing the difference between the initial and decompressed datasets. The tutorial offers hands-on exercises using SZ, ZFP, and Z-checker.

The tutorial addresses the following questions: Why lossless and lossy compression? How does compression work? How can compression error be measured and controlled? It uses examples of real-world compressors and scientific datasets to illustrate the different compression techniques and their performance. Participants will also have the opportunity to learn how to use SZ, ZFP, and Z-checker on their own datasets; the sketches below give a taste of what that looks like. The tutorial is given by two of the leading teams in this domain and targets primarily beginners interested in learning about lossy compression for scientific data. This half-day tutorial has been improved based on evaluations of the highly rated tutorials given on this topic at ISC17, SC17, and SC18.
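To make the hands-on part concrete, here is a minimal sketch of error-bounded compression with SZ, modeled on the C API of SZ 2.x (SZ_Init, SZ_compress_args, SZ_decompress, SZ_Finalize). The grid size, data contents, and error bound are illustrative, and exact signatures may vary between SZ releases.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include "sz.h"

int main(void)
{
    size_t nx = 512, ny = 512;                     /* illustrative grid size */
    float* data = malloc(nx * ny * sizeof(float));
    for (size_t i = 0; i < nx * ny; i++)           /* stand-in for real data */
        data[i] = sinf((float)i * 0.001f);

    SZ_Init(NULL);                                 /* NULL: default config */

    /* Compress with an absolute error bound of 1e-3: each decompressed
       value differs from the original by at most 1e-3. */
    size_t outSize;
    unsigned char* bytes = SZ_compress_args(SZ_FLOAT, data, &outSize,
                                            ABS, 1e-3, 0, 0,
                                            0, 0, 0, ny, nx);
    printf("compressed %zu bytes to %zu bytes\n",
           nx * ny * sizeof(float), outSize);

    /* Decompress back to a float array of the same shape. */
    float* dec = (float*)SZ_decompress(SZ_FLOAT, bytes, outSize,
                                       0, 0, 0, ny, nx);

    SZ_Finalize();
    free(dec);
    free(bytes);
    free(data);
    return 0;
}
```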
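ZFP exposes the same idea through a stream-based C API. The sketch below closely follows the simple example that ships with the zfp library and compresses a 3D double array in fixed-accuracy mode; the helper name compress_3d and its parameters are our own illustration.

```c
#include <stdlib.h>
#include <zfp.h>

/* Compress a 3D double array in ZFP's fixed-accuracy mode.
   Returns the compressed size in bytes (0 on failure) and hands the
   compressed buffer back through out_buffer. */
size_t compress_3d(double* array, size_t nx, size_t ny, size_t nz,
                   double tolerance, void** out_buffer)
{
    zfp_field* field = zfp_field_3d(array, zfp_type_double, nx, ny, nz);
    zfp_stream* zfp = zfp_stream_open(NULL);
    zfp_stream_set_accuracy(zfp, tolerance);    /* absolute error bound */

    /* Allocate a buffer large enough for the worst case. */
    size_t bufsize = zfp_stream_maximum_size(zfp, field);
    void* buffer = malloc(bufsize);
    bitstream* stream = stream_open(buffer, bufsize);
    zfp_stream_set_bit_stream(zfp, stream);
    zfp_stream_rewind(zfp);

    size_t zfpsize = zfp_compress(zfp, field);  /* 0 indicates failure */

    zfp_field_free(field);
    zfp_stream_close(zfp);
    stream_close(stream);

    *out_buffer = buffer;
    return zfpsize;
}
```

Decompression mirrors this flow: rewind the stream over the compressed buffer and call zfp_decompress with the same field description.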
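On the error-assessment side, the sketch below computes two of the metrics that tools such as Z-checker report when comparing an initial and a decompressed dataset: the maximum pointwise absolute error and the value-range-based PSNR commonly used in the lossy-compression literature. The function name and interface are illustrative.

```c
#include <math.h>
#include <stddef.h>

/* Compare an original and a decompressed field: report the maximum
   pointwise absolute error and the value-range-based PSNR. */
void error_metrics(const double* orig, const double* dec, size_t n,
                   double* max_abs_err, double* psnr)
{
    double lo = orig[0], hi = orig[0], mse = 0.0, max_err = 0.0;
    for (size_t i = 0; i < n; i++) {
        double err = fabs(orig[i] - dec[i]);
        if (err > max_err) max_err = err;
        if (orig[i] < lo) lo = orig[i];
        if (orig[i] > hi) hi = orig[i];
        mse += err * err;
    }
    mse /= (double)n;
    *max_abs_err = max_err;
    /* PSNR relative to the data's value range; infinite if lossless. */
    *psnr = (mse > 0.0) ? 20.0 * log10(hi - lo) - 10.0 * log10(mse)
                        : INFINITY;
}
```

A user would check max_abs_err against the error bound requested from SZ or ZFP, and use PSNR to compare overall distortion across compressors and settings.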
Content Level: 70% beginner, 20% intermediate, 10% advanced.
Target Audience:
- Researchers, students, and users of high-performance computing and scientific instruments;
- Engineers working in industries generating high volumes of data from measurements or simulations (automotive, oil and gas, pharma, etc.);
- Researchers and students using or developing new data reduction techniques.
Prerequisites: Participants should bring their own laptop running Linux or macOS. No prior knowledge of compression or of any programming language is required.