Publications


2021
July
Exploring the Memorization-Generalization Continuum in Deep Learning.
Ziheng Jiang, Chiyuan Zhang, Kunal Talwar, and Michael C. Mozer.
ICML 2021.
June
Pure, Low-Level Tensor Program Rewriting via Access Patterns (Representation Pearl).
Gus Henry Smith, Andrew Liu, Steven Lyubomirsky, Scott Davidson, Joseph McMahan, Michael B. Taylor, Luis Ceze, and Zachary Tatlock.
Proceedings of the 5th ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL 2021).
June
Porcupine: A Synthesizing Compiler for Vectorized Homomorphic Encryption.
Meghan Cowan, Deeksha Dangwal, Armin Alaghi, Caroline Trippel, Vincent T. Lee, and Brandon Reagen.
PLDI 2021.
June
Reticle: A Virtual Machine for Programming Modern FPGAs.
Luis Vega, Joseph McMahan, Adrian Sampson, Dan Grossman, and Luis Ceze.
PLDI 2021.
May
Dynamic Tensor Rematerialization.
Marisa Kirisame, Steven Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared Roesch, Tianqi Chen, and Zachary Tatlock.
ICLR 2021.
April
Accelerating SpMM Kernel with Cache-First Edge Sampling for Graph Neural Networks.
Chien-Yu Lin, Liang Luo, and Luis Ceze.
arXiv preprint.
April
Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference.
Haichen Shen, Jared Roesch, Zhi Chen, Wei Chen, Yong Wu, Mu Li, Vin Sharma, Zachary Tatlock, and Yida Wang.
MLSys 2021.
March
Automated Backend-Aware Post-Training Quantization.
Ziheng Jiang, Animesh Jain, Andrew Liu, Josh Fromm, Chengqian Ma, Tianqi Chen, and Luis Ceze.
arXiv preprint.
2020
November
Srift: Swift and Thrift Cloud-Based Distributed Training.
Liang Luo, Peter West, Arvind Krishnamurthy, and Luis Ceze.
arXiv preprint.
May
LastLayer: Toward Hardware and Software Continuous Integration.
Luis Vega, Jared Roesch, Joseph McMahan, and Luis Ceze.
IEEE Micro.
March
PLink: Discovering and Exploiting Locality for Accelerated Distributed Training on the Public Cloud.
Liang Luo, Peter West, Jacob Nelson, Arvind Krishnamurthy, and Luis Ceze.
MLSys 2020.
March
Riptide: Fast End-to-End Binarized Neural Networks.
Josh Fromm, Meghan Cowan, Matthai Philipose, Luis Ceze, and Shwetak Patel.
MLSys 2020.
February
Automatic Generation of High-Performance Quantized Machine Learning Kernels.
Meghan Cowan, Thierry Moreau, Tianqi Chen, James Bornholt, and Luis Ceze.
CGO 2020.
2019
April
A Hardware-Software Blueprint for Deep Learning Specialization.
Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy.
arXiv preprint.
April
Relay: A High-Level IR for Deep Learning.
Jared Roesch, Steven Lyubomirsky, Marisa Kirisame, Josh Pollock, Logan Weber, Ziheng Jiang, Tianqi Chen, Thierry Moreau, and Zachary Tatlock.
arXiv preprint.
2018
December
Learning to Optimize Tensor Programs.
Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy.
NeurIPS 2018.
November
Automating Generation of Low Precision Deep Learning Operators.
Meghan Cowan, Thierry Moreau, Tianqi Chen, and Luis Ceze.
arXiv preprint.
October
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning.
Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy.
OSDI 2018.
October
Parameter Hub: A Rack-Scale Parameter Server for Distributed Deep Neural Network Training.
Liang Luo, Jacob Nelson, Luis Ceze, Amar Phanishayee, and Arvind Krishnamurthy.
SoCC 2018.
June
Relay: A New IR for Machine Learning Frameworks.
Jared Roesch, Steven Lyubomirsky, Logan Weber, Josh Pollock, Marisa Kirisame, Tianqi Chen, and Zachary Tatlock.
Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL 2018).
February
MATIC: Learning Around Errors for Efficient Low-Voltage Neural Network Accelerators.
Sung Kim, Patrick Howe, Thierry Moreau, Armin Alaghi, Luis Ceze, and Visvesh Sathe.
DATE 2018.
February
Parameter Box: High Performance Parameter Servers for Efficient Distributed Deep Neural Network Training.
Liang Luo, Jacob Nelson, Luis Ceze, Amar Phanishayee, and Arvind Krishnamurthy.
SysML 2018.
2017
July
Fast Video Classification via Adaptive Cascading of Deep Models.
Haichen Shen, Seungyeop Han, Matthai Philipose, and Arvind Krishnamurthy.
CVPR 2017. Spotlight.
2016
August
XGBoost: A Scalable Tree Boosting System.
Tianqi Chen and Carlos Guestrin.
KDD 2016.
June
MCDNN: An Approximation-Based Execution Framework for Deep Stream Processing Under Resource Constraints.
Seungyeop Han, Haichen Shen, Matthai Philipose, Sharad Agarwal, Alec Wolman, and Arvind Krishnamurthy.
MobiSys 2016.
2015
December
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems.
Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang.
LearningSys Workshop at Neural Information Processing Systems 2015.
February
SNNAP: Approximate Computing on Programmable SoCs via Neural Acceleration.
Thierry Moreau, Mark Wyse, Jacob Nelson, Adrian Sampson, Hadi Esmaeilzadeh, Luis Ceze, and Mark Oskin.
HPCA 2015.