Checkpointing deep learning models as a dynamic analysis
Low-level Intermediate Representation (IR) for Programming Modern FPGAs
Hardware-software partition exploration with e-graphs
Parameter Server for Efficient Distributed Deep Neural Network Training for Clusters, Datacenters, and the Public Clouds
High-level IR for optimizing machine learning models
Hardware/Software Deep Learning Acceleration Stack
Fast Video Classification via Adaptive Cascading of Deep Models
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
A Scalable Tree Boosting System