Role: Principal Investigator
µTVM brings the power of Apache TVM to bare-metal devices. By building a lightweight and device-agnostic runtime for interacting with microcontrollers, µTVM plugs directly into the TVM stack and provides automatic optimization of ML operators and easy deployment. The figure below gives an idea of where µTVM sits in TVM.
To provide automatic optimization, µTVM makes use of AutoTVM (depicted below). AutoTVM suggests candidate kernel implementations in C; the candidates are then compiled and loaded onto the device via JTAG. Each kernel is timed on random inputs, and the timings are fed back into AutoTVM’s search algorithm. Over time, by intelligently navigating the space of implementations, AutoTVM tailors candidate kernels to the architectural properties of the device, using only this timing information.
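The shape of this search loop can be sketched in a few lines of plain Python. This is a toy stand-in, not TVM's actual API: `time_on_device` pretends to be the compile/flash/time round trip over JTAG, the candidates are tile sizes, and the search is exhaustive where a real tuner navigates the space intelligently.

```python
# Toy model of an AutoTVM-style tuning loop (hypothetical, not TVM's API).

def time_on_device(tile_size):
    """Stand-in for compiling a candidate kernel, loading it onto the
    device via JTAG, and timing it on random inputs. In this toy cost
    model, a tile size of 8 happens to be fastest."""
    return abs(tile_size - 8) + 1.0

def tune(candidates):
    """Try each candidate configuration and keep the best timing seen.
    A real tuner would pick candidates guided by past timings instead
    of sweeping the whole space."""
    best_cfg, best_time = None, float("inf")
    for cfg in candidates:
        t = time_on_device(cfg)
        if t < best_time:
            best_cfg, best_time = cfg, t
    return best_cfg, best_time

best, cost = tune([1, 2, 4, 8, 16, 32])
print(best, cost)  # 8 1.0
```

The key property the sketch captures is that the tuner treats the device as a black box: only the measured timings flow back into the search.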
Every time you ask Amazon Alexa a question, Relay is being used.
Relay is a functional and differentiable intermediate representation for machine learning applications. It ditches the design of traditional computation-graph-based IRs and instead opts to be a full-fledged programming language. The design is surprisingly similar to SML, the key difference being a tensor-based type system with a lightweight form of dependent typing, in which tensor shapes appear in the types.
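To give a feel for what shape-carrying types buy you, here is a toy sketch in Python (an illustration only, not Relay's type checker): shapes live inside the tensor type, so an ill-shaped matrix multiply is rejected at the type level, before anything runs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TensorType:
    # Shapes are part of the type itself: a lightweight form of dependency.
    shape: tuple
    dtype: str = "float32"

def matmul_type(a: TensorType, b: TensorType) -> TensorType:
    """Compute the result type of a matrix multiply, rejecting
    mismatched inner dimensions or dtypes at type-checking time."""
    (m, k1), (k2, n) = a.shape, b.shape
    if k1 != k2 or a.dtype != b.dtype:
        raise TypeError(f"cannot multiply {a} by {b}")
    return TensorType((m, n), a.dtype)

x = TensorType((32, 64))
y = TensorType((64, 16))
print(matmul_type(x, y))  # TensorType(shape=(32, 16), dtype='float32')
```

Because shapes are known statically, the compiler can specialize and optimize kernels for them rather than discovering shapes at run time.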
Relay is tightly integrated with Apache TVM and sits on top of TVM’s low-level tensor expression IR. This split creates a separation of concerns: Relay orchestrates the high-level flow of models and calls into kernels that have been aggressively optimized in the low-level IR. With this two-level design, Relay significantly outperforms existing machine learning frameworks (shown below).
RelayBench is a framework for running language- and framework-agnostic machine learning experiments, with reproducibility as its primary goal. Once a user defines experiments, subsystems can then be defined to analyze and make use of the collected data. As a “killer app” for RelayBench, I developed a push-button evaluation for the most recent Relay paper, meaning every experiment and graph in the paper was run and generated automatically.