Low-Precision Multi-GPU Detection Approach for Massive MIMO Technology
Session: Research Posters Display
Description: Massive Multiple-Input Multiple-Output (MIMO) is a crucial technology for Next-Generation (Next-G) networks. It uses hundreds of antennas at the transceivers to exchange data. However, accurate signal detection requires solving an NP-hard optimization problem under real-time latency constraints.
In this poster, we propose a new GPU-based detection algorithm that demonstrates the positive impact of low-precision arithmetic on multiple GPUs in meeting the latency, scalability, and accuracy requirements of Next-G networks. Our approach iteratively extends a partial solution with several symbols representing the best combination among the aggregated levels. The computation at each iteration is formulated as a matrix multiplication so that it maps efficiently onto GPU architectures.
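As a minimal sketch of this matrix-multiplication formulation, the snippet below scores every candidate extension of the current partial solutions with a single cuBLAS GEMM call. The function name, the variable names (channel_dim, num_symbols, num_candidates), and the correlation-style score matrix are illustrative assumptions, not the exact formulation used in the poster.

// Hypothetical sketch: one detection iteration scored as a single GEMM.
// d_H: (channel_dim x num_symbols) effective channel columns, column-major
// d_X: (channel_dim x num_candidates) residuals of the current partial solutions
// d_S: (num_symbols x num_candidates) output score matrix, S = H^T * X
#include <cublas_v2.h>
#include <cuda_runtime.h>

void score_extensions(cublasHandle_t handle,
                      const float* d_H, const float* d_X, float* d_S,
                      int channel_dim, int num_symbols, int num_candidates)
{
    const float alpha = 1.0f, beta = 0.0f;
    // Every (symbol, candidate) pair is evaluated in one GEMM, so the
    // per-iteration work runs on the GPU's dense matrix units.
    cublasSgemm(handle, CUBLAS_OP_T, CUBLAS_OP_N,
                num_symbols, num_candidates, channel_dim,
                &alpha, d_H, channel_dim,
                        d_X, channel_dim,
                &beta,  d_S, num_symbols);
}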
Results obtained on an NVIDIA A100 GPU show a 1.7x speedup from exploiting half-precision arithmetic, with no loss in accuracy. Furthermore, our low-precision multi-GPU version running on four A100 GPUs is 4x faster than the single-precision single-GPU version and 40x faster than a similar parallel CPU implementation executed on a two-socket 28-core Ice Lake system with 56 threads.
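A half-precision variant of the same GEMM could look like the following sketch, which assumes FP16 operands with FP32 accumulation through cublasGemmEx; this is the common way to engage the A100 tensor cores without degrading accumulation accuracy, and the poster's exact mixed-precision scheme may differ.

// Hypothetical half-precision variant of the scoring GEMM (assumed scheme:
// FP16 inputs, FP32 accumulation and output on tensor cores).
#include <cublas_v2.h>
#include <cuda_fp16.h>

void score_extensions_fp16(cublasHandle_t handle,
                           const __half* d_H, const __half* d_X, float* d_S,
                           int channel_dim, int num_symbols, int num_candidates)
{
    const float alpha = 1.0f, beta = 0.0f;
    // FP16 operands, FP32 compute/output: tensor-core GEMM path on the A100.
    cublasGemmEx(handle, CUBLAS_OP_T, CUBLAS_OP_N,
                 num_symbols, num_candidates, channel_dim,
                 &alpha,
                 d_H, CUDA_R_16F, channel_dim,
                 d_X, CUDA_R_16F, channel_dim,
                 &beta,
                 d_S, CUDA_R_32F, num_symbols,
                 CUBLAS_COMPUTE_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP);
}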