benty-fields - Search paper

6101. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger et al.

arXiv:2306.15626v2

Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks. LeanDojo extracts data from Lean and enables programmatic interaction with the proof environment. It contains fine-grained annotations of premises in proofs, providing valuable data for premise selection, a key bottleneck in theorem proving. Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library. It is inexpensive and needs only one GPU week of training. Our retriever leverages LeanDojo's program analysis capability to identify accessible premises and hard negative examples, which makes retrieval much more effective. Furthermore, we construct a new benchmark consisting of 98,734 theorems and proofs extracted from Lean's math library. It features a challenging data split that requires the prover to generalize to theorems relying on novel premises never used in training. We use this benchmark for training and evaluation, and experimental results demonstrate the effectiveness of ReProver over non-retrieval baselines and GPT-4. We thus provide the first set of open-source LLM-based theorem provers without any proprietary datasets and release it under a permissive MIT license to facilitate further research.
Authors' comments: Accepted to NeurIPS 2023 (Datasets and Benchmarks Track) as an oral presentation. Data, code, and models available at https://leandojo.org/
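
The abstract's idea of restricting retrieval to accessible premises can be illustrated with a small sketch. This is not LeanDojo's or ReProver's actual code or API: the function names, the premise names, and the two-dimensional toy vectors are all hypothetical, and plain cosine similarity stands in for the learned retriever described in the paper. It only shows the filtering step: premises not accessible from the current theorem are dropped before ranking.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_premises(state_vec, premise_vecs, accessible, k=2):
    """Rank only the accessible premises by similarity to the proof state.

    state_vec    -- hypothetical embedding of the current proof state
    premise_vecs -- hypothetical mapping: premise name -> embedding
    accessible   -- names of premises usable at this point in the library
    """
    scored = [
        (cosine(state_vec, vec), name)
        for name, vec in premise_vecs.items()
        if name in accessible  # drop premises the theorem cannot legally use
    ]
    # Highest similarity first; break ties by name for deterministic output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Toy data (assumed, for illustration only).
premise_vecs = {
    "add_comm": [1.0, 0.0],
    "mul_comm": [0.0, 1.0],
    "add_assoc": [0.9, 0.1],
    "later_lemma": [1.0, 0.0],  # a perfect match, but not accessible
}
accessible = {"add_comm", "mul_comm", "add_assoc"}

top = retrieve_premises([1.0, 0.0], premise_vecs, accessible, k=2)
print(top)
```

In this toy setup `later_lemma` matches the query exactly but is never returned, because it is filtered out before scoring; shrinking the candidate pool this way is the effect the abstract attributes to identifying accessible premises.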

6102. Hierarchical Matching and Reasoning for Multi-Query Image Retrieval

arXiv:2306.14460v1

6103. Retrieval of Boost Invariant Symbolic Observables via Feature Importance

arXiv:2306.13496v1

6104. Feature Mixing for Writer Retrieval and Identification on Papyri Fragments

arXiv:2306.12939v1

6105. Relationships between the Phase Retrieval Problem and Permutation Invariant Embeddings

arXiv:2306.13111v1

6106. Resources and Evaluations for Multi-Distribution Dense Information Retrieval

arXiv:2306.12601v1

6107. Explainable Recommendation with Personalized Review Retrieval and Aspect Learning

arXiv:2306.12657v1

6108. Annotation Cost Efficient Active Learning for Content Based Image Retrieval

arXiv:2306.11605v2

6109. Exoplanet Interior Retrievals: core masses and metallicities from atmospheric abundances

arXiv:2306.11354v1

6110. Multilingual Few-Shot Learning via Language Model Retrieval

arXiv:2306.10964v1

6111. Deep Reinforcement Learning with Task-Adaptive Retrieval via Hypernetwork

arXiv:2306.10698v5

6112. Universal Information Extraction with Meta-Pretrained Self-Retrieval

arXiv:2306.10444v1

6113. Crowdsourcing and Evaluating Text-Based Audio Retrieval Relevances

arXiv:2306.09820v2

6114. Graph Convolution Based Efficient Re-Ranking for Visual Retrieval

arXiv:2306.08792v1

6115. RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit

arXiv:2306.05212v1

6116. MarineVRS: Marine Video Retrieval System with Explainability via Semantic Understanding

arXiv:2306.04593v1

6117. An Overview of Challenges in Egocentric Text-Video Retrieval

arXiv:2306.04345v1

6118. Automatic retrieval of corresponding US views in longitudinal examinations

arXiv:2306.04739v1

6119. Unified Embedding Based Personalized Retrieval in Etsy Search

arXiv:2306.04833v2

6120. Unsupervised Dense Retrieval with Relevance-Aware Contrastive Pre-Training

arXiv:2306.03166v1
