Publications

Download BibTeX.

2018
PDF Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training.
Liang Luo, Jacob Nelson, Luis Ceze, Amar Phanishayee, and , Arvind Krishnamurthy.
SOCC 2018.
2018
July
PDF VTA: An Open Hardware-Software Stack for Deep Learning.
Thierry Moreau, Tianqi Chen, Ziheng Jiang, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy.
arXiv preprint.
2018
June
PDF Relay: A New IR for Machine Learning Frameworks.
Jared Roesch, Steven Lyubomirsky, Logan Weber, Josh Pollock, Marisa Kirisame, Tianqi Chen, and Zachary Tatlock.
Proceedings of the 2Nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL 2018).
2018
May
PDF Learning to Optimize Tensor Programs.
Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy.
NIPS 2018(to appear).
2018
April
PDF TVM: An Automated End-to-End Optimizing Compiler for Deep Learning.
Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy.
OSDI 2018(to appear).
2018
February
PDF MATIC: Learning Around Errors for Efficient Low-Voltage Neural Network Accelerators.
Sung Kim, Patrick Howe, Thierry Moreau, Armin Alaghi, Luis Ceze, and Visvesh Sathe.
DATE 2018.
2018
February
PDF Parameter Hub: High Performance Parameter Servers for Efficient Distributed Deep Neural Network Training.
Liang Luo, Jacob Nelson, Luis Ceze, Amar Phanishayee, and Arvind Krishnamurthy.
SysML 2018.
2017
July
PDF Fast Video Classification via Adaptive Cascading of Deep Models.
Haichen Shen, Seungyeop Han, Matthai Philipose, and Arvind Krishnamurthy.
CVPR 2017. Spotlight.
2016
August
PDF XGBoost: A Scalable Tree Boosting System.
Tianqi Chen and Carlos Guestrin.
KDD 2016.
2016
June
PDF MCDNN: An Approximation-Based Execution Framework for Deep Stream Processing Under Resource Constraints.
Seungyeop Han, Haichen Shen, Matthai Philipose, Sharad Agarwal, Alec Wolman, and Arvind Krishnamurthy.
MobiSys 2016.
2015
December
PDF MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems.
Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang.
LearningSys Workshop at NIPS 2015.
2015
February
PDF SNNAP: Approximate Computing on Programmable SoCs via Neural Acceleration.
Thierry Moreau, Mark Wyse, Jacob Nelson, Adrian Sampson, Hadi Esmaeilzadeh, Luis Ceze, and Mark Oskin.
HPCA 2015.