· Contributors · Organizations · Search
Automatic Asynchronous Execution of Synchronously Offloaded OpenMP Target Regions
DescriptionUse of heterogeneous architectures has steadily increased during the past decade. However, non-homogeneous systems present a challenge to the programming model as the execution models between CPU and accelerator might differ considerably. OpenMP, since version 4.0, has been trying to bridge this gap by allowing to offload a code block to a target device. Among the additions to the OpenMP offloading API since, the most notably probably is asynchronous execution between device and host. By default, offloaded regions are executed synchronously, thus the host thread blocks until their completion. The nowait clause allows work to overlap between the host and target device. However, nowait must be manually added by the user, along with the tasks data dependencies and appropriate synchronization to avoid race conditions, increasing the program complexity and developer burden.