benty-fields - Search paper

Retrieval-Augmented Generation (RAG) has emerged as an effective paradigm for generating contextually accurate answers by integrating Large Language Models (LLMs) with retrieval mechanisms. However, in legal contexts, users frequently reference norms by their labels or nicknames (e.g., Article 5 of the Constitution or Consumer Defense Code (CDC)) rather than by their content. Furthermore, legal texts themselves rely heavily on explicit cross-references (e.g., "pursuant to Article 34") that function as pointers. Both scenarios pose challenges for traditional RAG approaches that rely solely on semantic embeddings of text, which often fail to retrieve the necessary referenced content. This paper introduces Poly-Vector Retrieval, a method that assigns multiple distinct embeddings to each legal provision: one embedding captures the content (the full text), another captures the label (the identifier or proper name), and optional additional embeddings capture alternative denominations. Inspired by Frege's distinction between Sense and Reference, this approach treats labels, identifiers, and reference markers as rigid designators and content embeddings as carriers of semantic substance. Experiments on the Brazilian Federal Constitution demonstrate that Poly-Vector Retrieval significantly improves retrieval accuracy for label-centric queries and shows potential to resolve internal and external cross-references, without compromising performance on purely semantic queries. The study discusses the philosophical and practical implications of explicitly separating reference from content in vector embeddings and proposes future research directions for applying this approach to broader legal datasets and to other domains characterized by explicit reference identifiers.
Authors' comments: 39 pages, 5 figures
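
The multi-embedding idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the names `Provision`, `build_index`, and `search`, along with the placeholder provision texts, are assumptions made purely for illustration. The point it shows is that each provision contributes several vectors (content, label, aliases) and a query is matched against all of them.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text):
    """Toy stand-in for an embedding model: bag of lowercase word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Provision:
    pid: str                      # hypothetical internal identifier
    label: str                    # identifier or proper name
    content: str                  # full text (placeholder text here)
    aliases: list = field(default_factory=list)  # alternative denominations


def build_index(provisions):
    """Store several vectors per provision: content, label, and each alias."""
    index = []
    for p in provisions:
        index.append((embed(p.content), "content", p.pid))
        index.append((embed(p.label), "label", p.pid))
        for alias in p.aliases:
            index.append((embed(alias), "alias", p.pid))
    return index


def search(index, query, k=1):
    """Rank provisions by their best-matching vector of any kind."""
    q = embed(query)
    best = {}
    for vec, kind, pid in index:
        score = cosine(q, vec)
        if score > best.get(pid, (0.0, ""))[0]:
            best[pid] = (score, kind)
    ranked = sorted(best.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(pid, kind, score) for pid, (score, kind) in ranked[:k]]


# Illustrative placeholder data, not quoted legal text.
provisions = [
    Provision("art5", "Article 5 of the Constitution",
              "All persons are equal before the law, without distinction of any kind"),
    Provision("cdc", "Consumer Defense Code",
              "Rules protecting buyers of goods and services in market relations",
              aliases=["CDC"]),
]
index = build_index(provisions)

for query in ["CDC", "Article 5", "equal before the law"]:
    print(query, "->", search(index, query))
```

In this sketch, a label-centric query such as "CDC" is matched through the alias vector, while a content-style query is matched through the content vector, so neither kind of query depends on the other kind of embedding.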


5400. A characterization of complex stable phase retrieval in Banach lattices

Manuel Camúñez, Enrique García-Sánchez, David de Hevia

arXiv:2504.06693v1



5381. Fashion-RAG: Multimodal Fashion Image Editing via Retrieval-Augmented Generation

arXiv:2504.14011v1

5382. FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents

arXiv:2504.13128v2

5383. Towards Lossless Token Pruning in Late-Interaction Retrieval Models

arXiv:2504.12778v1

5384. Aspect-Based Summarization with Self-Aspect Retrieval Enhanced Generation

arXiv:2504.13054v1

5385. Building Russian Benchmark for Evaluation of Information Retrieval Models

arXiv:2504.12879v1

5386. ACoRN: Noise-Robust Abstractive Compression in Retrieval-Augmented Language Models

arXiv:2504.12673v1

5387. CDF-RAG: Causal Dynamic Feedback for Adaptive Retrieval-Augmented Generation

arXiv:2504.12560v1

5388. Efficient Distributed Retrieval-Augmented Generation for Enhancing Language Model Performance

arXiv:2504.11197v1

5389. Towards Efficient Partially Relevant Video Retrieval with Active Moment Discovering

arXiv:2504.10920v1

5390. MAGPIE: Multilevel-Adaptive-Guided Solver for Ptychographic Phase Retrieval

arXiv:2504.10118v3

5391. CSPLADE: Learned Sparse Retrieval with Causal Language Models

arXiv:2504.10816v2

5392. Optimal sparse phase retrieval via a quasi-Bayesian approach

arXiv:2504.09509v1

5393. HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation

arXiv:2504.12330v1

5394. Knowledge Graph-extended Retrieval Augmented Generation for Question Answering

arXiv:2504.08893v1

5395. A Reproducibility Study of Graph-Based Legal Case Retrieval

arXiv:2504.08400v1

5396. PCA-RAG: Principal Component Analysis for Efficient Retrieval-Augmented Generation

arXiv:2504.08386v1

5397. Plan-and-Refine: Diverse and Comprehensive Retrieval-Augmented Generation

arXiv:2504.07794v1

5398. REANIMATOR: Reanimate Retrieval Test Collections with Extracted and Synthetic Resources

arXiv:2504.07584v1

5399. Poly-Vector Retrieval: Reference and Content Embeddings for Legal Documents

arXiv:2504.10508v1

5400. A characterization of complex stable phase retrieval in Banach lattices

arXiv:2504.06693v1
