Assessing the Memory Wall in Complex Codes
Description
Many of Los Alamos National Laboratory's HPC codes are memory-bandwidth bound. These codes exhibit high levels of sparse memory access, which differs significantly from the access patterns of standard benchmarks.
In this paper we present an analysis of the memory access patterns of some of our most important code bases. We then generate micro-benchmarks that preserve the memory access characteristics of our codes using two approaches: one based on statistical sampling of relative memory offsets in a sliding time window at the function level, and another at the loop level. The function-level approach is used to assess the impact of advanced memory technologies such as LPDDR5 and HBM3 using the gem5 simulator. Our simulation results show significant improvements for sparse memory access workloads using HBM3 relative to LPDDR5, and better scaling on a per-core basis. Assessment of two different architectures shows that higher peak memory bandwidth results in high bandwidth on sparse workloads.
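The offset-sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' tool: the function names `offset_histogram` and `synthesize`, the window semantics, and the wrap-around footprint are all assumptions made for illustration. It records the offsets between each access and the few accesses before it, then draws from that distribution to produce a synthetic address stream with a similar offset profile.

```python
import random
from collections import Counter


def offset_histogram(addresses, window=4):
    """Count offsets from each access to the `window` accesses before it.

    Hypothetical helper: the window here is measured in accesses, which is
    an assumed stand-in for the sliding time window named in the abstract.
    """
    counts = Counter()
    for i, addr in enumerate(addresses):
        for j in range(max(0, i - window), i):
            counts[addr - addresses[j]] += 1
    return counts


def synthesize(counts, length, footprint, seed=0):
    """Generate `length` addresses by sampling offsets from `counts`.

    Addresses wrap modulo `footprint` so the stream stays inside a
    fixed-size buffer (an assumption of this sketch).
    """
    rng = random.Random(seed)
    offsets = list(counts)
    weights = [counts[o] for o in offsets]
    addr = 0
    stream = []
    for delta in rng.choices(offsets, weights=weights, k=length):
        addr = (addr + delta) % footprint
        stream.append(addr)
    return stream


# Example: a short, made-up trace with mostly unit-stride accesses
# and two long jumps.
trace = [0, 8, 16, 1024, 1032, 24]
hist = offset_histogram(trace, window=1)
synthetic = synthesize(hist, length=100, footprint=4096)
```

A stream produced this way could drive a micro-benchmark loop or a simulator input, although a real tool would need to handle access sizes, reads versus writes, and timing, none of which this sketch models.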
Event Type
Workshop
Time
Sunday, 13 November 2022, 4:30pm - 5pm CST
Location
D222
Recorded