benty-fields - Search paper

Composed Image Retrieval (CoIR) has recently gained popularity as a task that considers both text and image queries together, to search for relevant images in a database. Most CoIR approaches require manually annotated datasets, comprising image-text-image triplets, where the text describes a modification from the query image to the target image. However, manual curation of CoIR triplets is expensive and prevents scalability. In this work, we instead propose a scalable automatic dataset creation methodology that generates triplets given video-caption pairs, while also expanding the scope of the task to include composed video retrieval (CoVR). To this end, we mine paired videos with a similar caption from a large database, and leverage a large language model to generate the corresponding modification text. Applying this methodology to the extensive WebVid2M collection, we automatically construct our WebVid-CoVR dataset, resulting in 1.6 million triplets. Moreover, we introduce a new benchmark for CoVR with a manually annotated evaluation set, along with baseline results. We further validate that our methodology is equally applicable to image-caption pairs, by generating 3.3 million CoIR training triplets using the Conceptual Captions dataset. Our model builds on BLIP-2 pretraining, adapting it to composed video (or image) retrieval, and incorporates an additional caption retrieval loss to exploit extra supervision beyond the triplet. We provide extensive ablations to analyze the design choices on our new CoVR benchmark. Our experiments also demonstrate that training a CoVR model on our datasets effectively transfers to CoIR, leading to improved state-of-the-art performance in the zero-shot setup on the CIRR, FashionIQ, and CIRCO benchmarks. Our code, datasets, and models are publicly available at https://imagine.enpc.fr/~ventural/covr.
Authors' comments: Appears in TPAMI 2024 (DOI: 10.1109/TPAMI.2024.3463799). Journal extension of the AAAI 2024 conference paper arXiv:2308.14746v3. Project page: https://imagine.enpc.fr/~ventural/covr/

6046. Central Similarity Multi-View Hashing for Multimedia Retrieval

Jian Zhu, Wen Cheng, Yu Cui, Chang Tang, Yuyang Dai, Yong Li, Lingfang Zeng

arXiv:2308.13774v1

6047. EditSum: A Retrieve-and-Edit Framework for Source Code Summarization

Jia Li, Yongmin Li, Ge Li, Xing Hu, Xin Xia, Zhi Jin

2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE) (2021)

arXiv:2308.13775v2

Existing studies show that code summaries help developers understand and maintain source code. Unfortunately, these summaries are often missing or outdated in software projects. Code summarization aims to generate natural language descriptions automatically for source code. Code summaries are highly structured and have repetitive patterns. Besides the patternized words, a code summary also contains important keywords, which are the key to reflecting the functionality of the code. However, the state-of-the-art approaches perform poorly on predicting the keywords, which leads to the generated summaries suffering a loss in informativeness. To alleviate this problem, this paper proposes a novel retrieve-and-edit approach named EditSum for code summarization. Specifically, EditSum first retrieves a similar code snippet from a pre-defined corpus and treats its summary as a prototype summary to learn the pattern. Then, EditSum edits the prototype automatically to combine the pattern in the prototype with the semantic information of input code. Our motivation is that the retrieved prototype provides a good start-point for post-generation because the summaries of similar code snippets often have the same pattern. The post-editing process further reuses the patternized words in the prototype and generates keywords based on the semantic information of input code. We conduct experiments on a large-scale Java corpus and experimental results demonstrate that EditSum outperforms the state-of-the-art approaches by a substantial margin. The human evaluation also proves the summaries generated by EditSum are more informative and useful. We also verify that EditSum performs well on predicting the patternized words and keywords.
Authors' comments: Accepted by the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE 2021)

6048. Learning Efficient Representations for Image-Based Patent Retrieval

Hongsong Wang, Yuqi Zhang

arXiv:2308.13749v1

6049. Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval

Yuan Yuan, Yang Zhan, Zhitong Xiong

arXiv:2308.12509v1

6050. Private Information Retrieval with Private Noisy Side Information

Hassan ZivariFard, Remi A. Chou

arXiv:2308.12374v2

6051. Quantum Symmetric Private Information Retrieval with Secure Storage and Eavesdroppers

Alptug Aytekin, Mohamed Nomeir, Sajani Vithana, Sennur Ulukus

arXiv:2308.10883v1

6052. FashionNTM: Multi-turn Fashion Image Retrieval via Cascaded Memory

Anwesan Pal, Sahil Wadhwa, Ayush Jaiswal, Xu Zhang, Yue Wu, Rakesh Chada, Pradeep Natarajan, Henrik I. Christensen

arXiv:2308.10170v1

6053. Simple Baselines for Interactive Video Retrieval with Questions and Answers

Kaiqu Liang, Samuel Albanie

arXiv:2308.10402v1

6054. ControlRetriever: Harnessing the Power of Instructions for Controllable Retrieval

Kaihang Pan, Juncheng Li, Hongye Song, Hao Fei, Wei Ji, Shuo Zhang, Jun Lin, Xiaozhong Liu et al.

arXiv:2308.10025v1

6055. Phase Retrieval with Background Information: Decreased References and Efficient Methods

Ziyang Yuan, Haoxing Yang, Ningyi Leng, Hongxia Wang

arXiv:2308.08328v1

6056. Integrating Visual and Semantic Similarity Using Hierarchies for Image Retrieval

Aishwarya Venkataramanan, Martin Laviale, Cédric Pradalier

arXiv:2308.08431v1

6057. RSpell: Retrieval-augmented Framework for Domain Adaptive Chinese Spelling Check

Siqi Song, Qi Lv, Lei Geng, Ziqiang Cao, Guohong Fu

NLPCC2023 oral

arXiv:2308.08176v1

6058. Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval

Chaorui Deng, Qi Chen, Pengda Qin, Da Chen, Qi Wu

arXiv:2308.07648v1

6059. Ranking-aware Uncertainty for Text-guided Image Retrieval

Junyang Chen, Hanjiang Lai

arXiv:2308.08131v1

6060. Large Language Models for Information Retrieval: A Survey

Yutao Zhu, Huaying Yuan, Shuting Wang, Jiongnan Liu, Wenhan Liu, Chenlong Deng, Haonan Chen, Zhicheng Dou et al.

arXiv:2308.07107v3

Benty-search

6041. Deep supervised hashing for fast retrieval of radio image cubes

arXiv:2309.00932v1

6042. Zero-Shot Video Moment Retrieval from Frozen Vision-Language Models

arXiv:2309.00661v1

6043. RAMP: Retrieval-Augmented MOS Prediction via Confidence-based Dynamic Weighting

arXiv:2308.16488v1

6044. Continual Learning for Generative Retrieval over Dynamic Corpora

arXiv:2308.14968v1

6045. CoVR-2: Automatic Data Construction for Composed Video Retrieval

arXiv:2308.14746v2
