Improving ML Model Portability, Productivity, and Performance with Apache TVM and the OctoML Platform
Description
There is a pressing need to bring machine learning to a diverse set of hardware devices. Current approaches typically rely on vendor-specific operator libraries and frameworks, and require significant engineering effort. In this talk, we will present an overview of the Apache TVM open source stack, which exposes graph- and operator-level optimizations to provide performance portability for machine learning workloads across diverse hardware back-ends. TVM solves compiler optimization challenges by employing a learning-based approach for rapid exploration of optimizations, saving months of engineering time and offering state-of-the-art performance in both edge and server use cases. We will discuss how TVM offers broad model coverage and makes effective use of hardware resources. We will end the talk with a peek at the OctoML Platform, which brings DevOps agility to ML deployment.
Event Type
Invited Talk
Time
Thursday, 17 November 2022, 10:30am - 11:15am CST
Location
Dallas Ballroom/Omni Hotel
TP
XO/EX
Recorded
Presenter