Sra Nips Workshop
“RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval” has been accepted for an oral (spotlight) presentation at the NeurIPS 2024 ENLSP Workshop. Best Paper. Paper.