Accelerating Kernel Ridge Regression with Conjugate Gradient Method for Large-Scale Data Using FPGA High-Level Synthesis
Description
In this work, we accelerate the Kernel Ridge Regression algorithm on an adaptive computing platform, achieving higher performance with shorter development time through a design approach based on high-level synthesis. To avoid storing the potentially huge kernel matrix in external memory, the accelerator computes the matrix on the fly in each iteration. Moreover, we overcome the memory bandwidth limitation by partitioning the kernel matrix into smaller tiles that are pre-fetched into small local memories and reused multiple times. The design is also parallelized and fully pipelined to maximize performance. The final accelerator can be applied to large-scale data without kernel matrix storage limitations and with an arbitrary number of features. This work is an important first step towards a library that accelerates different kernel methods for machine learning applications on FPGA platforms and can be used conveniently from Python through a NumPy interface.
Time
Monday, 14 November 2022, 11:30am - 12pm CST
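The matrix-free approach described in the abstract can be illustrated with a short NumPy sketch: conjugate gradient solves (K + λI)α = y while kernel tiles are recomputed on demand instead of storing K. This is only an illustrative sketch, not the accelerator's actual implementation; the RBF kernel choice, the tile size, and all function names (`rbf_tile`, `matvec`, `krr_cg`) are assumptions for demonstration.

```python
import numpy as np

def rbf_tile(Xi, Xj, gamma):
    # One tile of the RBF kernel matrix, computed on the fly
    # (illustrative kernel choice; the talk covers kernel methods generally)
    sq = ((Xi[:, None, :] - Xj[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def matvec(X, v, gamma, lam, tile=64):
    # (K + lam*I) @ v without ever materializing the full kernel matrix:
    # stream over row/column tiles, recomputing each tile when needed,
    # mirroring the tiled, on-the-fly scheme described in the abstract
    n = X.shape[0]
    out = lam * v
    for i in range(0, n, tile):
        for j in range(0, n, tile):
            out[i:i + tile] += rbf_tile(X[i:i + tile], X[j:j + tile], gamma) @ v[j:j + tile]
    return out

def krr_cg(X, y, gamma=1.0, lam=1e-3, tol=1e-8, max_iter=200):
    # Conjugate gradient for the KRR system (K + lam*I) alpha = y
    n = X.shape[0]
    alpha = np.zeros(n)
    r = y - matvec(X, alpha, gamma, lam)   # initial residual
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(X, p, gamma, lam)
        step = rs / (p @ Ap)
        alpha += step * p
        r -= step * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:          # converged on residual norm
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return alpha
```

Because the system matrix is symmetric positive definite (kernel matrix plus a ridge term), conjugate gradient only ever needs the matrix-vector product, which is exactly what makes the tiled on-the-fly evaluation viable on an FPGA with limited on-chip memory.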