Authors: Santosh Pandey (Stevens Institute of Technology); Lingda Li, Thomas Flynn, and Adolfy Hoisie (Brookhaven National Laboratory); and Hang Liu (Stevens Institute of Technology)
Abstract: Cycle-accurate microarchitecture simulators are essential tools for architecting new processors, but they are often replaced by alternative methodologies, such as statistical or analytical modeling, to shorten turnaround time. There have also been attempts to use machine learning (ML) for architecture simulation, such as Ithemal and SimNet, but existing solutions may be even slower because of their intrinsic computational intensity and memory traffic.
This paper proposes the first GPU-based microarchitecture simulator that unleashes the GPU's potential to accelerate state-of-the-art ML-based simulators. First, we introduce an efficient GPU implementation that minimizes data movement and customizes state-of-the-art ML inference engines to achieve rapid single-instruction simulation for SimNet. Second, we propose a parallel simulation paradigm that partitions a trace into sub-traces and simulates them in parallel, with rigorous error analysis and effective error correction mechanisms. Combined, our GPU-based simulator significantly outperforms traditional CPU-based simulators, achieving up to a 1014x speedup over gem5 detailed simulation.
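To make the sub-trace idea concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a toy latency model (1 cycle if an address appeared among the last `history` addresses, 10 cycles otherwise) in place of an ML predictor, and it uses a warm-up prefix as one illustrative way to reduce the error introduced at sub-trace boundaries; the abstract does not specify the paper's actual correction mechanism. The names `simulate`, `simulate_partitioned`, `history`, and `warmup` are hypothetical.

```python
from collections import deque


def simulate(trace, history=4, warm=()):
    """Toy sequential simulator (assumed model, not SimNet).

    An instruction costs 1 cycle if its address is among the last
    `history` addresses seen, and 10 cycles otherwise. `warm` pre-loads
    the recent-address state without charging any cycles.
    """
    recent = deque(warm, maxlen=history)
    cycles = 0
    for addr in trace:
        cycles += 1 if addr in recent else 10
        recent.append(addr)
    return cycles


def simulate_partitioned(trace, num_parts, history=4, warmup=0):
    """Split `trace` into sub-traces and simulate each independently.

    Each sub-trace depends only on its own slice plus an optional warm-up
    prefix, so the loop iterations could run in parallel; they run one
    after another here to keep the sketch simple.
    """
    size = max(1, -(-len(trace) // num_parts))  # ceiling division
    total = 0
    for start in range(0, len(trace), size):
        warm = trace[max(0, start - warmup):start]
        total += simulate(trace[start:start + size], history, warm)
    return total


trace = [i % 3 for i in range(24)]  # synthetic address trace
sequential = simulate(trace)
cold = simulate_partitioned(trace, num_parts=4)
warmed = simulate_partitioned(trace, num_parts=4, warmup=4)
print(sequential, cold, warmed)
```

In this toy model, sub-traces that start with empty state overestimate the cycle count because every boundary looks like a cold start, while a warm-up prefix at least as long as `history` restores the state exactly. A real simulator carries far richer context, so the error there would need the kind of analysis and correction the paper describes.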