benty-fields - Search paper

8201. Democratizing GraphRAG: Linear, CPU-Only Graph Retrieval for Multi-Hop QA

Qizhi Wang

2602.23372v1

8202. Contextual Biasing for LLM-Based ASR with Hotword Retrieval and Reinforcement Learning

YuXiang Kong, JunFeng Hou, Jian Tang, Bingqing Zhu, Jicheng Zhang, Shaofei Xue

2512.21828v1

8203. A Knowledge Graph and Deep Learning-Based Semantic Recommendation Database System for Advertisement Retrieval and Personalization

Tangtang Wang, Kaijie Zhang, Kuangcong Liu

2601.00833v1

8204. C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling

Jin Qin, Zihan Liao, Ziyin Zhang, Hang Yu, Peng Di, Rui Wang

2512.21332v1

8205. MultiMind at SemEval-2025 Task 7: Crosslingual Fact-Checked Claim Retrieval via Multi-Source Alignment

Mohammad Mahdi Abootorabi, Alireza Ghahramani Kure, Mohammadali Mohammadkhani, Sina Elahimanesh, Mohammad Ali Ali Panah

2512.20950v1

8206. MMSRARec: Summarization and Retrieval Augmented Sequential Recommendation Based on Multimodal Large Language Model

Haoyu Wang, Yitong Wang, Jining Wang

2512.20916v1

Recent advancements in Multimodal Large Language Models (MLLMs) have demonstrated significant potential in recommendation systems. However, the effective application of MLLMs to multimodal sequential recommendation remains unexplored: A) Existing methods primarily leverage the multimodal semantic understanding capabilities of pre-trained MLLMs to generate item embeddings or semantic IDs, thereby enhancing traditional recommendation models. These approaches generate item representations that exhibit limited interpretability and pose challenges when transferring to language model-based recommendation systems. B) Other approaches convert user behavior sequences into image-text pairs and perform recommendation through multiple MLLM inference passes, incurring prohibitive computational and time costs. C) Current MLLM-based recommendation systems generally neglect the integration of collaborative signals. To address these limitations while balancing recommendation performance, interpretability, and computational cost, this paper proposes MultiModal Summarization-and-Retrieval-Augmented Sequential Recommendation. Specifically, we first employ an MLLM to summarize items into concise keywords and fine-tune the model using rewards that incorporate summary length, information loss, and reconstruction difficulty, thereby enabling adaptive adjustment of the summarization policy. Inspired by retrieval-augmented generation, we then transform collaborative signals into corresponding keywords and integrate them as supplementary context. Finally, we apply supervised fine-tuning with multi-task learning to align the MLLM with the multimodal sequential recommendation task. Extensive evaluations on common recommendation datasets demonstrate the effectiveness of MMSRARec, showcasing its capability to efficiently and interpretably understand user behavior histories and item information for accurate recommendations.
Authors' comments: Under Review

8207. ALIVE: An Avatar-Lecture Interactive Video Engine with Content-Aware Retrieval for Real-Time Interaction

Md Zabirul Islam, Md Motaleb Hossen Manik, Ge Wang

2512.20858v1

8208. Soft Filtering: Guiding Zero-shot Composed Image Retrieval with Prescriptive and Proscriptive Constraints

Youjin Jung, Seongwoo Cho, Hyun-seok Min, Sungchul Choi

2512.20781v1

8209. Towards Natural Language-Based Document Image Retrieval: New Dataset and Benchmark

Hao Guo, Xugong Qin, Jun Jie Ou Yang, Peng Zhang, Gangyan Zeng, Yubo Li, Hailun Lin

2512.20174v1

8210. M$^3$KG-RAG: Multi-hop Multimodal Knowledge Graph-enhanced Retrieval-Augmented Generation

Hyeongcheol Park, Jiyoung Seo, Jaewon Mun, Hogun Park, Wonmin Byeon, Sung June Kim, Hyeonsoo Im, JeungSub Lee et al.

2512.20136v1

8211. A Multi-Agent Retrieval-Augmented Framework for Work-in-Progress Prediction

Yousef Mehrdad Bibalan, Behrouz Far, Mohammad Moshirpour, Bahareh Ghiyasian

2512.19841v1

8212. MaP-AVR: A Meta-Action Planner for Agents Leveraging Vision Language Models and Retrieval-Augmented Generation

Zhenglong Guo, Yiming Zhao, Feng Jiang, Heng Jin, Zongbao Feng, Jianbin Zhou, Siyuan Xu

2512.19453v1

Embodied robotic AI systems designed to manage complex daily tasks rely on a task planner to understand and decompose high-level tasks. While most research focuses on enhancing the task-understanding abilities of LLMs/VLMs through fine-tuning or chain-of-thought prompting, this paper argues that defining the planned skill set is equally crucial. To handle the complexity of daily environments, the skill set should possess a high degree of generalization ability. Empirically, more abstract expressions tend to be more generalizable. Therefore, we propose to abstract the planned result as a set of meta-actions. Each meta-action comprises three components: {move/rotate, end-effector status change, relationship with the environment}. This abstraction replaces human-centric concepts, such as grasping or pushing, with the robot's intrinsic functionalities. As a result, the planned outcomes align seamlessly with the complete range of actions that the robot is capable of performing. Furthermore, to ensure that the LLM/VLM accurately produces the desired meta-action format, we employ the Retrieval-Augmented Generation (RAG) technique, which leverages a database of human-annotated planning demonstrations to facilitate in-context learning. As the system successfully completes more tasks, the database will self-augment to continue supporting diversity. The meta-action set and its integration with RAG are two novel contributions of our planner, denoted as MaP-AVR, the meta-action planner for agents composed of VLM and RAG. To validate its efficacy, we design experiments using GPT-4o as the pre-trained LLM/VLM model and OmniGibson as our robotic platform. Our approach demonstrates promising performance compared to the current state-of-the-art method. Project page: https://map-avr.github.io/.
Authors' comments: 8 pages, 10 figures, This work was completed in December 2024

8213. From Retrieval to Reasoning: A Framework for Cyber Threat Intelligence NER with Explicit and Adaptive Instructions

Jiaren Peng, Hongda Sun, Xuan Tian, Cheng Huang, Zeqing Li, Rui Yan

2512.19414v1

8214. Scalable and Reliable Evaluation of AI Knowledge Retrieval Systems: RIKER and the Coherent Simulated Universe

JV Roig

2601.08847v1

8215. QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

Dehai Min, Kailin Zhang, Tongtong Wu, Lu Cheng

2512.19134v1

8216. Retrieving Objects from 3D Scenes with Box-Guided Open-Vocabulary Instance Segmentation

Khanh Nguyen, Dasith de Silva Edirimuni, Ghulam Mubashar Hassan, Ajmal Mian

2512.19088v1

8217. Affordance RAG: Hierarchical Multimodal Retrieval with Affordance-Aware Embodied Memory for Mobile Manipulation

Ryosuke Korekata, Quanting Xie, Yonatan Bisk, Komei Sugiura

2512.18987v1

8218. Directional Attractors in LLM Reasoning: How Similarity Retrieval Steers Iterative Summarization Based Reasoning

Cagatay Tekin, Charbel Barakat, Luis Joseph Luna Limgenco

2601.08846v1

8219. CIRR: Causal-Invariant Retrieval-Augmented Recommendation with Faithful Explanations under Distribution Shift

Sebastian Sun

2512.18683v1

8220. PMPGuard: Catching Pseudo-Matched Pairs in Remote Sensing Image-Text Retrieval

Pengxiang Ouyang, Qing Ma, Zheng Wang, Cong Bai

2512.18660v1
