Breaking Memory Wall for Fast Edge LLM Inference Using Contextual Sparsity

Title: Breaking Memory Wall for Fast Edge LLM Inference Using Contextual Sparsity
Publication Type: Journal Article
Year of Publication: 2026
Authors: Wei, Z., Y. Zhou, J. Zhao, K. Li, and X. Fu
Journal: IEEE Transactions on Mobile Computing
Date Published: 06/2026
Type of Article: accepted