Workshop: 2022 International Workshop on Performance Portability and Productivity (P3HPC)
Authors: Kumudha Narasimhan, Ouadie El Farouki, Mehdi Goli, Muhammad Tanvir, Svetlozar Georgiev, and Isaac Ault (Codeplay Software Ltd, UK)
Abstract: The wide adoption of Deep Neural Networks (DNN) has served as an incentive to design and manufacture powerful and specialized hardware technologies, targeting systems from Edge devices to Cloud and supercomputers.
While ONNX, proposed as a de facto standard for AI model description, provides portability of AI models across various AI frameworks, supporting DNN models on various hardware architectures remains challenging.
SYCL provides a C++-based portable parallel programming model to target various devices. Thus, enabling a SYCL backend for an AI framework can lead to a hardware-agnostic model for heterogeneous systems.
This paper proposes a SYCL backend for ONNXRuntime as a possible solution towards the performance portability of deep learning algorithms. The proposed backend uses the existing state-of-the-art SYCL-DNN and SYCL-BLAS libraries to invoke tuned SYCL kernels for DNN operations. Our performance evaluation shows that the proposed approach can achieve performance comparable to that of state-of-the-art, optimized vendor-specific libraries.