SAMPL is an interdisciplinary machine learning research group exploring problems that span multiple layers of the system stack, including deep learning frameworks, specialized hardware for training and inference, new intermediate representations, differentiable programming, and various applications. We are part of the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Our group is a collaboration between researchers from Sampa, Syslab, PLSE, EFESLab, and CMU Catalyst.
CPU-GPU Orchestration for Fast Inference of MoE Models
Serving Multiple LoRA-Finetuned LLMs as One
Low-bit Quantization for Efficient and Accurate LLM Serving
Kernel Library for LLM Serving
Compiler for Sparsity in Deep Learning
Checkpointing Deep Learning Models as a Dynamic Analysis
Low-level Intermediate Representation (IR) for Programming Modern FPGAs
Hardware-Software Partition Exploration with E-Graphs
Parameter Server for Efficient Distributed Deep Neural Network Training on Clusters, Datacenters, and the Public Cloud
High-Level IR for Optimizing Machine Learning Models
Hardware/Software Deep Learning Acceleration Stack
Fast Video Classification via Adaptive Cascading of Deep Models
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
XGBoost: A Scalable Tree Boosting System