
Memory-Efficient KV Cache Optimization for Large Language Model Inference at the Edge

Title: Memory-Efficient KV Cache Optimization for Large Language Model Inference at the Edge
Publication Type: Conference Paper
Year of Publication: 2026
Authors: Zhang, C., H. Tan, H. Pan, Y. Xu, H. Du, L. Zhang, and X. Fu
Conference Name: IEEE INFOCOM 2026
Date Published: 05/2026
Publisher: IEEE
Conference Location: Tokyo, Japan