benty-fields - Search paper

8241. A Simple and Effective Framework for Symmetric Consistent Indexing in Large-Scale Dense Retrieval

Huimu Wang, Yiming Qiu, Xingzhi Yao, Zhiguo Chen, Guoyu Tang, Songlin Wang, Sulong Xu, Mingming Li

arXiv:2512.13074v1

8242. SPAR: Session-based Pipeline for Adaptive Retrieval on Legacy File Systems

Duy A. Nguyen, Hai H. Do, Minh Doan, Minh N. Do

arXiv:2512.12938v1

8243. SignRAG: A Retrieval-Augmented System for Scalable Zero-Shot Road Sign Recognition

Minghao Zhu, Zhihao Zhang, Anmol Sidhu, Keith Redmill

arXiv:2512.12885v1

8244. Breaking the Curse of Dimensionality: On the Stability of Modern Vector Retrieval

Vihan Lakshman, Blaise Munyampirwa, Julian Shun, Benjamin Coleman

arXiv:2512.12458v1

8245. V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval

Donghyuk Kim, Sejeong Yang, Wonjin Shin, Joo-Young Kim

arXiv:2512.12284v1

Streaming video large language models (LLMs) are increasingly used for real-time multimodal tasks such as video captioning, question answering, conversational agents, and augmented reality. However, these models face fundamental memory and computational challenges because their key-value (KV) caches grow substantially with continuous streaming video input. Processing this input requires an iterative prefill stage, a feature unique to streaming video LLMs, and that stage brings significant limitations: extensive computation, substantial data transfer, and degraded accuracy. These issues are exacerbated in edge deployment, which is the primary target for these models. In this work, we propose V-Rex, the first software-hardware co-designed accelerator that comprehensively addresses both algorithmic and hardware bottlenecks in streaming video LLM inference. At its core, V-Rex introduces ReSV, a training-free dynamic KV cache retrieval algorithm. ReSV exploits temporal and spatial similarity-based token clustering to reduce excessive KV cache memory across video frames. To fully realize these algorithmic benefits, V-Rex offers a compact, low-latency hardware accelerator with a dynamic KV cache retrieval engine (DRE), featuring bit-level and early-exit based computing units. V-Rex achieves unprecedented real-time performance of 3.9-8.3 FPS and energy-efficient streaming video LLM inference on edge deployment with negligible accuracy loss. Although the DRE accounts for only 2.2% of power and 2.0% of area, the system delivers 1.9-19.7x speedup and 3.1-18.5x energy-efficiency improvement over an AGX Orin GPU. This work is the first to comprehensively tackle KV cache retrieval across algorithms and hardware, enabling real-time streaming video LLM inference on resource-constrained edge devices.
Authors' comments: 14 pages, 20 figures, conference
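
The abstract mentions similarity-based token clustering as the way ReSV shrinks the KV cache, but gives no algorithmic details. As a rough illustration of the general idea only (not the paper's ReSV algorithm), the sketch below greedily groups key vectors whose cosine similarity to an existing cluster centroid exceeds a threshold, so that near-duplicate tokens share one cache entry. The function name `cluster_tokens`, the `threshold` parameter, and the running-mean centroid update are all assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cluster_tokens(keys, threshold=0.9):
    """Greedy similarity clustering (hypothetical helper, not ReSV itself).

    Each key joins the most similar existing cluster if the similarity is
    at least `threshold`; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": [...], "members": [indices]}
    for idx, key in enumerate(keys):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(key, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": list(key), "members": [idx]})
        else:
            n = len(best["members"])
            # Running mean keeps the centroid representative of all members.
            best["centroid"] = [
                (m * n + x) / (n + 1) for m, x in zip(best["centroid"], key)
            ]
            best["members"].append(idx)
    return clusters


# Toy keys: tokens 0/1 are near-duplicates, as are tokens 2/3.
keys = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.02, 1.0]]
clusters = cluster_tokens(keys, threshold=0.9)
print([c["members"] for c in clusters])
```

In this toy setting four tokens collapse into two cache entries; a real system would also have to decide which cluster representatives to retrieve per query, which the abstract attributes to the dynamic retrieval engine.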

8246. Citation-Grounded Code Comprehension: Preventing LLM Hallucination Through Hybrid Retrieval and Graph-Augmented Context

Jahidul Arafat

arXiv:2512.12117v1

8247. FloodSQL-Bench: A Retrieval-Augmented Benchmark for Geospatially-Grounded Text-to-SQL

Hanzhou Liu, Kai Yin, Zhitong Chen, Chenyue Liu, Ali Mostafavi

arXiv:2512.12084v1

8248. Leveraging FPGAs for Homomorphic Matrix-Vector Multiplication in Oblivious Message Retrieval

Grant Bosworth, Keewoo Lee, Sunwoong Kim

arXiv:2512.11690v1

8249. LOOPRAG: Enhancing Loop Transformation Optimization with Retrieval-Augmented Large Language Models

Yijie Zhi, Yayu Cao, Jianhua Dai, Xiaoyang Han, Jingwen Pu, Qingran Wu, Sheng Cheng, Ming Cai

arXiv:2512.15766v1

Loop transformations are semantics-preserving optimization techniques, widely used to maximize objectives such as parallelism. Despite decades of research, applying the optimal composition of loop transformations remains challenging due to inherent complexities, including cost modeling for optimization objectives. Recent studies have explored the potential of Large Language Models (LLMs) for code optimization. However, our key observation is that LLMs often struggle with effective loop transformation optimization, frequently producing erroneous or suboptimal code and thereby missing opportunities for performance improvements. To bridge this gap, we propose LOOPRAG, a novel retrieval-augmented generation framework designed to guide LLMs in performing effective loop optimization on Static Control Parts (SCoPs). We introduce a parameter-driven method to harness loop properties, which trigger various loop transformations, and generate diverse yet legal example codes that serve as a demonstration source. To effectively obtain the most informative demonstrations, we propose a loop-aware algorithm based on loop features, which balances similarity and diversity for code retrieval. To enhance correct and efficient code generation, we introduce a feedback-based iterative mechanism that incorporates compilation, testing, and performance results as feedback to guide LLMs. Each optimized code undergoes mutation, coverage, and differential testing for equivalence checking. We evaluate LOOPRAG on the PolyBench, TSVC, and LORE benchmark suites, and compare it against compilers (GCC-Graphite, Clang-Polly, Perspective, and ICX) and representative LLMs (DeepSeek and GPT-4). The results demonstrate average speedups over base compilers of up to 11.20$\times$, 14.34$\times$, and 9.29$\times$ for PolyBench, TSVC, and LORE, respectively, and speedups over base LLMs of up to 11.97$\times$, 5.61$\times$, and 11.59$\times$.
Authors' comments: Accepted to ASPLOS 2026
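
The abstract says the retrieval step balances similarity and diversity but does not describe the loop-aware algorithm itself. One well-known way to make that trade-off is maximal marginal relevance (MMR), shown below purely as a stand-in for illustration; it is not LOOPRAG's algorithm. The feature vectors, the function name `mmr_select`, and the weight `lam` are assumptions for this sketch.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(query, candidates, k, lam=0.5):
    """Pick k candidate indices by maximal marginal relevance.

    lam=1.0 ranks purely by similarity to the query; lower values
    penalize candidates that resemble ones already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical loop-feature vectors: candidates 0 and 1 are near-duplicates.
query = [1.0, 0.3]
examples = [[1.0, 0.0], [0.99, 0.05], [0.6, 0.8]]
print(mmr_select(query, examples, k=2, lam=1.0))  # similarity only
print(mmr_select(query, examples, k=2, lam=0.5))  # similarity and diversity
```

With `lam=1.0` the two near-duplicate examples are both returned; with `lam=0.5` the second slot goes to the dissimilar example instead, which is the behavior a demonstration retriever would want when redundant examples add little information.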

8250. Beyond Pixels: A Training-Free, Text-to-Text Framework for Remote Sensing Image Retrieval

J. Xiao, Y. Guo, X. Zi, K. Thiyagarajan, C. Moreira, M. Prasad

arXiv:2512.10596v1

8251. Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking by Contrasting Layers

Youmin Ko, Sungjong Seo, Hyunjoon Kim

arXiv:2512.10422v1

8252. Point to Span: Zero-Shot Moment Retrieval for Navigating Unseen Hour-Long Videos

Mingyu Jeon, Jisoo Yang, Sungjin Han, Jinkwon Hwang, Sunjae Yoon, Jonghee Kim, Junyeoung Kim

arXiv:2512.10363v1

8253. MedXAI: A Retrieval-Augmented and Self-Verifying Framework for Knowledge-Guided Medical Image Analysis

Midhat Urooj, Ayan Banerjee, Farhat Shaikh, Kuntal Thakur, Sandeep Gupta

arXiv:2512.10098v1

8254. MedBioRAG: Semantic Search and Retrieval-Augmented Generation with Large Language Models for Medical and Biological QA

Seonok Kim

arXiv:2512.10996v1

8255. RouteRAG: Efficient Retrieval-Augmented Generation from Text and Graph via Reinforcement Learning

Yucan Guo, Miao Su, Saiping Guan, Zihao Sun, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng

arXiv:2512.09487v1

8256. Detecting Hallucinations in Graph Retrieval-Augmented Generation via Attention Patterns and Semantic Alignment

Shanghao Li, Jinda Han, Yibo Wang, Yuanjie Zhu, Zihe Song, Langzhou He, Kenan Kamel A Alghythee, Philip S. Yu

arXiv:2512.09148v1

8257. RAGdb: A Zero-Dependency, Embeddable Architecture for Multimodal Retrieval-Augmented Generation on the Edge

Ahmed Bin Khalid

arXiv:2602.22217v1

8258. MIRAGE: Misleading Retrieval-Augmented Generation via Black-box and Query-agnostic Poisoning Attacks

Tailun Chen, Yu He, Yan Wang, Shuo Shao, Haolun Zheng, Zhihao Liu, Jinfeng Li, Yuefeng Chen et al.

arXiv:2512.08289v1

8259. On the existence of large subspaces of $C(K)$ that perform stable phase retrieval

Enrique García-Sánchez, David de Hevia, Mitchell Taylor

arXiv:2512.08114v1

8260. PoultryTalk: A Multi-modal Retrieval-Augmented Generation (RAG) System for Intelligent Poultry Management and Decision Support

Kapalik Khanal, Biswash Khatiwada, Stephen Afrifa, Ranjan Sapkota, Sanjay Shah, Frank Bai, Ramesh Bahadur Bist

arXiv:2512.08995v1
