Punica
Punica is our latest LLM serving framework. It can serve multiple LoRA fine-tuned LLMs as one:
Preprint:
Punica: Multi-Tenant LoRA Serving
Blog post:
Potentials of Multitenancy Fine-Tuned LLM Serving
GitHub repo:
Punica
People
Lequn Chen
Zihao Ye
Yongji Wu
Danyang Zhuo
Assistant Professor - Duke
Luis Ceze
Professor
Arvind Krishnamurthy
Professor