Sra Nips Workshop

“RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval” was accepted for an oral (spotlight) presentation at the NeurIPS 2024 ENLSP Workshop (Best Paper). The paper was later accepted to the NeurIPS 2025 Main Track in September 2025.