benty-fields - Search paper

6315. An Atmospheric Retrieval of the Brown Dwarf Gliese 229B

Emily Calamari, Jacqueline K. Faherty, Ben Burningham, Eileen Gonzales, Daniella Bardalez-Gagliuffi, Johanna M. Vos, Marina Gemma, Niall Whiteford et al.

arXiv:2210.13614v1

6316. Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval

Minjoon Jung, Seongho Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang

arXiv:2210.12617v1

6317. ADMM based Fourier phase retrieval with untrained generative prior

Liyuan Ma, Hongxia Wang, Ningyi Leng, Ziyang Yuan

arXiv:2210.12646v1

6318. Retrieval Augmentation for Commonsense Reasoning: A Unified Approach

Wenhao Yu, Chenguang Zhu, Zhihan Zhang, Shuohang Wang, Zhuosheng Zhang, Yuwei Fang, Meng Jiang

arXiv:2210.12887v1

6319. Decoding a Neural Retriever's Latent Space for Query Suggestion

Leonard Adolphs, Michelle Chen Huebscher, Christian Buck, Sertan Girgin, Olivier Bachem, Massimiliano Ciaramita, Thomas Hofmann

arXiv:2210.12084v1

6320. Dissecting Deep Metric Learning Losses for Image-Text Retrieval

Hong Xuan, Xi Chen

WACV2023

arXiv:2210.13188v1

Visual-Semantic Embedding (VSE) is a prevalent approach to image-text retrieval: it learns a joint embedding space between the image and language modalities in which semantic similarities are preserved. The triplet loss with hard-negative mining has become the de facto objective for most VSE methods. Inspired by recent progress in deep metric learning (DML) in the image domain, which has given rise to new loss functions that outperform the triplet loss, we revisit the problem of finding better objectives for VSE in image-text matching. Despite some attempts at designing losses based on gradient movement, most DML losses are defined empirically in the embedding space. Instead of directly applying these loss functions, which may lead to sub-optimal gradient updates of the model parameters, we present a novel Gradient-based Objective AnaLysis framework, or GOAL, to systematically analyze the combinations and reweighting of the gradients in existing DML functions. With the help of this analysis framework, we further propose a new family of objectives in the gradient space that explores different gradient combinations. When the gradients are not integrable to a valid loss function, we implement our proposed objectives so that they operate directly in the gradient space instead of on the losses in the embedding space. Comprehensive experiments demonstrate that our novel objectives consistently improve performance over baselines across different visual/text features and model frameworks. We also show the generalizability of the GOAL framework by extending it to other models that use triplet-family losses, including vision-language models with heavy cross-modal interactions, and achieve state-of-the-art results on image-text retrieval on COCO and Flickr30K.
Authors' comments: arXiv admin note: text overlap with arXiv:2201.11307


Benty-search

6301. On Negative Sampling for Contrastive Audio-Text Retrieval

arXiv:2211.04070v2

6302. Retrieval augmentation of large language models for lay language generation

arXiv:2211.03818v2

6303. Zero-shot Video Moment Retrieval With Off-the-Shelf Models

arXiv:2211.02178v1

6304. Automatic Crater Shape Retrieval using Unsupervised and Semi-Supervised Systems

arXiv:2211.01933v1

6305. Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models

arXiv:2211.00915v2

6306. Practical Phase Retrieval Using Double Deep Image Priors

arXiv:2211.00799v1

6307. Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives

arXiv:2210.17167v1

6308. FedVMR: A New Federated Learning method for Video Moment Retrieval

arXiv:2210.15977v1

6309. 3D Shape Knowledge Graph for Cross-domain 3D Shape Retrieval

arXiv:2210.15136v2

6310. Deploying a Retrieval based Response Model for Task Oriented Dialogues

arXiv:2210.14379v1

6311. Phase Retrieval of Quaternion Signal via Wirtinger Flow

arXiv:2210.14170v3

6312. Online Information Retrieval Evaluation using the STELLA Framework

arXiv:2210.13202v1

6313. Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval

arXiv:2210.13440v1

6314. Bridging the Training-Inference Gap for Dense Phrase Retrieval

arXiv:2210.13678v1

6315. An Atmospheric Retrieval of the Brown Dwarf Gliese 229B

arXiv:2210.13614v1
