benty-fields - Search paper

5021. DualResearch: Entropy-Gated Dual-Graph Retrieval for Answer Reconstruction

Jinxin Shi, Zongsheng Cao, Runmin Ma, Yusong Hu, Jie Zhou, Xin Li, Lei Bai, Liang He et al.

arXiv:2510.08959v1

5022. Vector Graph-Based Repository Understanding for Issue-Driven File Retrieval

Kostiantyn Bevziuk, Andrii Fatula, Svetozar Lashin, Yaroslav Opanasenko, Anna Tukhtarova, Ashok Jallepalli, Pradeepkumar Sharma, Hritvik Shrivastava

arXiv:2510.08876v1

5023. Repository-Aware File Path Retrieval via Fine-Tuned LLMs

Vasudha Yanuganti, Ishaan Puri, Swapnil Chhatre, Mantinder Singh, Ashok Jallepalli, Hritvik Shrivastava, Pradeep Kumar Sharma

arXiv:2510.08850v1

5024. ReasonEmbed: Enhanced Text Embeddings for Reasoning-Intensive Document Retrieval

Jianlyu Chen, Junwei Lan, Chaofan Li, Defu Lian, Zheng Liu

arXiv:2510.08252v1

5025. VersionRAG: Version-Aware Retrieval-Augmented Generation for Evolving Documents

Daniel Huwiler, Kurt Stockinger, Jonathan Fürst

arXiv:2510.08109v1

5026. Multilingual Generative Retrieval via Cross-lingual Semantic Compression

Yuxin Huang, Simeng Wu, Ran Song, Yan Xiang, Yantuan Xian, Shengxiang Gao, Zhengtao Yu

arXiv:2510.07812v1

5027. HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation

Peilin Wu, Mian Zhang, Kun Wan, Wentian Zhao, Kaiyu He, Xinya Du, Zhiyu Chen

arXiv:2510.07794v1

Agentic RAG is a powerful technique for incorporating external information that LLMs lack, enabling better problem solving and question answering. However, suboptimal search behaviors are widespread, such as over-search (retrieving information already known) and under-search (failing to search when necessary), which lead to unnecessary overhead and unreliable outputs. Current training methods, which typically rely on outcome-based rewards in an RL framework, lack the fine-grained control needed to address these inefficiencies. To overcome this, we introduce Hierarchical Process Rewards for Efficient agentic RAG (HiPRAG), a training methodology that incorporates a fine-grained, knowledge-grounded process reward into the RL training. Our approach evaluates the necessity of each search decision on the fly by decomposing the agent's reasoning trajectory into discrete, parsable steps. We then apply a hierarchical reward function that provides an additional bonus based on the proportion of optimal search and non-search steps, on top of commonly used outcome and format rewards. Experiments on the Qwen2.5 and Llama-3.2 models across seven diverse QA benchmarks show that our method achieves average accuracies of 65.4% (3B) and 67.2% (7B). This is accomplished while improving search efficiency, reducing the over-search rate to just 2.3% and concurrently lowering the under-search rate. These results demonstrate the efficacy of optimizing the reasoning process itself, not just the final outcome. Further experiments and analysis demonstrate that HiPRAG generalizes well across a wide range of RL algorithms, model families, sizes, and types. This work demonstrates the importance and potential of fine-grained control through RL for improving the efficiency and optimality of reasoning for search agents.
Authors' comments: Under review
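
The reward shape described in the abstract above (a bonus proportional to the share of optimal search and non-search steps, added on top of outcome and format rewards) could be sketched roughly as follows. This is only an illustration, not the authors' implementation: the `Step` structure, the weight values, and the choice to grant the bonus only when the answer is correct and well formatted are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One parsed step of an agent trajectory (hypothetical structure)."""
    is_search: bool   # True if the step issued a search query
    is_optimal: bool  # True if the search / non-search decision was judged appropriate


def hierarchical_reward(
    steps: List[Step],
    answer_correct: bool,
    format_ok: bool,
    format_weight: float = 0.2,   # assumed value, not from the paper
    process_weight: float = 0.4,  # assumed value, not from the paper
) -> float:
    """Outcome and format rewards plus a bonus for the fraction of optimal steps."""
    reward = (1.0 if answer_correct else 0.0) + (format_weight if format_ok else 0.0)
    # Assumed gating: the process bonus is added only on top of a correct,
    # well-formatted answer, so it cannot compensate for a wrong outcome.
    if answer_correct and format_ok and steps:
        optimal_fraction = sum(s.is_optimal for s in steps) / len(steps)
        reward += process_weight * optimal_fraction
    return reward


def over_search_rate(steps: List[Step]) -> float:
    """Share of search steps judged unnecessary (0.0 if there were no searches)."""
    searches = [s for s in steps if s.is_search]
    if not searches:
        return 0.0
    return sum(not s.is_optimal for s in searches) / len(searches)
```

Under these assumptions, a trajectory with a correct, well-formatted answer and three optimal steps out of four would receive the full outcome and format reward plus three quarters of the process bonus.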

5028. Towards Reliable Retrieval in RAG Systems for Large Legal Datasets

Markus Reuter, Tobias Lingenberg, Rūta Liepiņa, Francesca Lagioia, Marco Lippi, Giovanni Sartor, Andrea Passerini, Burcu Sayin

arXiv:2510.06999v1

5029. Differentially Private Synthetic Text Generation for Retrieval-Augmented Generation (RAG)

Junki Mori, Kazuya Kakizaki, Taiki Miyagawa, Jun Sakuma

arXiv:2510.06719v1

5030. LogSTOP: Temporal Scores over Prediction Sequences for Matching and Retrieval

Avishree Khare, Hideki Okamoto, Bardh Hoxha, Georgios Fainekos, Rajeev Alur

arXiv:2510.06512v1

5031. Mixing Mechanisms: How Language Models Retrieve Bound Entities In-Context

Yoav Gur-Arieh, Mor Geva, Atticus Geiger

arXiv:2510.06182v1

5032. YpathRAG: A Retrieval-Augmented Generation Framework and Benchmark for Pathology

Deshui Yu, Yizhi Wang, Saihui Jin, Taojie Zhu, Fanyi Zeng, Wen Qian, Zirui Huang, Jingli Ouyang et al.

arXiv:2510.08603v1

5033. Personalizing Retrieval using Joint Embeddings or "the Return of Fluffy"

Bruno Korbar, Andrew Zisserman

arXiv:2510.05411v1

5034. Context Length Alone Hurts LLM Performance Despite Perfect Retrieval

Yufeng Du, Minyang Tian, Srikanth Ronanki, Subendhu Rongali, Sravan Bodapati, Aram Galstyan, Azton Wells, Roy Schwartz et al.

arXiv:2510.05381v1

5035. Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization

Omri Uzan, Asaf Yehudai, Roi Pony, Eyal Shnarch, Ariel Gera

arXiv:2510.05038v1

5036. WeatherArchive-Bench: Benchmarking Retrieval-Augmented Reasoning for Historical Weather Archives

Yongan Yu, Xianda Du, Qingchen Hu, Jiahao Liang, Jingwei Ni, Dan Qiang, Kaiyu Huang, Grant McKenzie et al.

arXiv:2510.05336v1

5037. Improving Consistency in Retrieval-Augmented Systems with Group Similarity Rewards

Faisal Hamman, Chenyang Zhu, Anoop Kumar, Xujun Peng, Sanghamitra Dutta, Daben Liu, Alfy Samuel

arXiv:2510.04392v1

5038. Equipping Retrieval-Augmented Large Language Models with Document Structure Awareness

Lingnan Xu, Chong Feng, Kaiyuan Zhang, Liu Zhengyong, Wenqiang Xu, Fanqing Meng

arXiv:2510.04293v1

5039. Investigating LLM Variability in Personalized Conversational Information Retrieval

Simon Lupart, Daniël van Dijk, Eric Langezaal, Ian van Dort, Mohammad Aliannejadi

arXiv:2510.03795v1

Personalized Conversational Information Retrieval (CIR) has seen rapid progress in recent years, driven by the development of Large Language Models (LLMs). Personalized CIR aims to enhance document retrieval by leveraging user-specific information, such as preferences, knowledge, or constraints, to tailor responses to individual needs. A key resource for this task is the TREC iKAT 2023 dataset, designed to evaluate personalization in CIR pipelines. Building on this resource, Mo et al. explored several strategies for incorporating Personal Textual Knowledge Bases (PTKB) into LLM-based query reformulation. Their findings suggested that personalization from PTKBs could be detrimental and that human annotations were often noisy. However, these conclusions were based on single-run experiments using the GPT-3.5 Turbo model, raising concerns about output variability and repeatability. In this reproducibility study, we rigorously reproduce and extend their work, focusing on LLM output variability and model generalization. We apply the original methods to the new TREC iKAT 2024 dataset and evaluate a diverse range of models, including Llama (1B-70B), Qwen-7B, and GPT-4o-mini. Our results show that human-selected PTKBs consistently enhance retrieval performance, while LLM-based selection methods do not reliably outperform manual choices. We further compare variance across datasets and observe higher variability on iKAT than on CAsT, highlighting the challenges of evaluating personalized CIR. Notably, recall-oriented metrics exhibit lower variance than precision-oriented ones, a critical insight for first-stage retrievers. Finally, we underscore the need for multi-run evaluations and variance reporting when assessing LLM-based CIR systems. By broadening evaluation across models, datasets, and metrics, our study contributes to more robust and generalizable practices for personalized CIR.
Authors' comments: 11 pages, 5 figures, SIGIR-AP'25 Proceedings of the 2025 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region (SIGIR-AP 2025), December 7-10, 2025, Xi'an, China
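
The multi-run evaluation and variance reporting that the abstract above calls for might look something like the sketch below. It is a generic illustration, not the study's evaluation code: the function name `summarize_runs`, the metric names, and all scores are invented for this example.

```python
from statistics import mean, stdev
from typing import Dict, List


def summarize_runs(scores_per_metric: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Return the mean and sample standard deviation of each metric across runs."""
    summary: Dict[str, Dict[str, float]] = {}
    for metric, scores in scores_per_metric.items():
        if len(scores) < 2:
            # A single run gives no information about run-to-run variability.
            raise ValueError(f"{metric}: need at least two runs to estimate variance")
        summary[metric] = {"mean": mean(scores), "std": stdev(scores)}
    return summary


# Made-up scores from five hypothetical runs of the same pipeline;
# these are NOT results from the paper.
example_runs = {
    "recall@100": [0.52, 0.53, 0.51, 0.52, 0.52],
    "ndcg@3": [0.21, 0.26, 0.18, 0.24, 0.20],
}

for metric, stats in summarize_runs(example_runs).items():
    print(f"{metric}: {stats['mean']:.3f} +/- {stats['std']:.3f}")
```

Reporting the spread alongside the mean makes it visible when a difference between two systems is smaller than the run-to-run variation of either one.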

5040. SEER: The Span-based Emotion Evidence Retrieval Benchmark

Aneesha Sampath, Oya Aran, Emily Mower Provost

arXiv:2510.03490v1
