Workshop: Eighth International Workshop on Heterogeneous High-Performance Reconfigurable Computing (H2RC 2022)
Authors: Yousef Alnaser (Fraunhofer Institute for Electronic Nano Systems); Jan Langer (Fraunhofer Institute for Electronic Nano Systems, Chemnitz University of Technology); and Martin Stoll (Chemnitz University of Technology)
Abstract: In this work, we accelerate the Kernel Ridge Regression algorithm on an adaptive computing platform, using a design approach based on high-level synthesis to achieve higher performance with shorter development time. To avoid storing the potentially huge kernel matrix in external memory, the accelerator computes the matrix on the fly in each iteration. Moreover, we overcome the memory bandwidth limitation by partitioning the kernel matrix into smaller tiles that are pre-fetched to small local memories and reused multiple times. The design is also parallelized and fully pipelined to maximize performance. The final accelerator handles large-scale data with an arbitrary number of features and is not limited by kernel matrix storage. This work is an important first step towards an FPGA library of accelerated kernel methods for machine learning applications that can be used conveniently from Python through a NumPy interface.
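
The matrix-free, tiled scheme described in the abstract can be illustrated in software with a short NumPy sketch. This is not the authors' accelerator design. It assumes a Gaussian (RBF) kernel and a conjugate gradient solver, neither of which is stated in the abstract, and the names rbf_tile, tiled_kernel_matvec, and krr_fit_cg, as well as the parameters gamma, lam, and tile, are hypothetical. The sketch only shows the idea that each iteration recomputes the kernel matrix tile by tile, so the full matrix is never stored.

```python
import numpy as np

def rbf_tile(Xa, Xb, gamma):
    """One tile of an assumed Gaussian kernel: exp(-gamma * ||a - b||^2)."""
    sq = (Xa**2).sum(1)[:, None] + (Xb**2).sum(1)[None, :] - 2.0 * Xa @ Xb.T
    # Clamp tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def tiled_kernel_matvec(X, v, gamma, lam, tile=64):
    """Compute (K + lam*I) @ v tile by tile, never materializing the full K."""
    n = X.shape[0]
    out = lam * v.astype(float)
    for i in range(0, n, tile):
        Xi = X[i:i + tile]  # row block, reused across the whole inner loop
        for j in range(0, n, tile):
            Kij = rbf_tile(Xi, X[j:j + tile], gamma)  # computed on the fly
            out[i:i + tile] += Kij @ v[j:j + tile]
    return out

def krr_fit_cg(X, y, gamma=0.5, lam=1e-2, tol=1e-10, max_iter=500, tile=64):
    """Solve (K + lam*I) alpha = y with conjugate gradient (assumed solver)."""
    alpha = np.zeros(y.shape[0], dtype=float)
    r = y.astype(float).copy()  # residual for the zero initial guess
    rs = r @ r
    if rs == 0.0:
        return alpha
    p = r.copy()
    for _ in range(max_iter):
        Ap = tiled_kernel_matvec(X, p, gamma, lam, tile)  # K rebuilt each iteration
        step = rs / (p @ Ap)
        alpha += step * p
        r -= step * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return alpha
```

In this sketch the working set per step is one tile of at most tile x tile kernel entries plus two blocks of input rows, which is the software analogue of keeping data in small local memories. The trade-off is that every kernel entry is recomputed in each solver iteration instead of being read back from external memory.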