Haotian Li, Yong Wang, Aoyu Wu, Huan Wei, Huamin Qu
With the wide usage of data visualizations, a huge number of Scalable Vector
Graphic (SVG)-based visualizations have been created and shared online.
Accordingly, there has been an increasing interest in exploring how to retrieve
perceptually similar visualizations from a large corpus, since it can benefit
various downstream applications such as visualization recommendation. Existing
methods mainly focus on the visual appearance of visualizations by regarding
them as bitmap images. However, the structural information intrinsically
existing in SVG-based visualizations is ignored. Such structural information
can delineate the spatial and hierarchical relationship among visual elements,
and characterize visualizations thoroughly from a new perspective. This paper
presents a structure-aware method to advance the performance of visualization
retrieval by collectively considering both the visual and structural
information. We extensively evaluated our approach through quantitative
comparisons, a user study and case studies. The results demonstrate the
effectiveness of our approach and its advantages over existing methods.
Authors' comments: Accepted to the 2022 CHI Conference on Human Factors in Computing
Systems (CHI 22)
Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky
Retrieving target videos based on text descriptions is a task of great
practical value and has received increasing attention over the past few years.
Despite recent progress, imperfect annotations in existing video retrieval
datasets have posed significant challenges on model evaluation and development.
In this paper, we tackle this issue by focusing on the less-studied setting of
multi-query video retrieval, where multiple descriptions are provided to the
model for searching over the video archive. We first show that multi-query
retrieval task effectively mitigates the dataset noise introduced by imperfect
annotations and better correlates with human judgement on evaluating retrieval
abilities of current models. We then investigate several methods which leverage
multiple queries at training time, and demonstrate that the multi-query
inspired training can lead to superior performance and better generalization.
We hope further investigation in this direction can bring new insights on
building systems that perform better in real-world video retrieval
applications.
Authors' comments: ECCV 2022
Patrick Lewis, Barlas Oğuz, Wenhan Xiong, Fabio Petroni, Wen-tau Yih, Sebastian Riedel
We propose DrBoost, a dense retrieval ensemble inspired by boosting. DrBoost is trained in stages: each component model is learned sequentially and specialized by focusing only on retrieval mistakes made by the current ensemble. The final representation is the concatenation of the output vectors of all the component models, making it a drop-in replacement for standard dense retrievers at test time. DrBoost enjoys several advantages compared to standard dense retrieval models. It produces representations which are 4x more compact, while delivering comparable retrieval results. It also performs surprisingly well under approximate search with coarse quantization, reducing latency and bandwidth needs by another 4x. In practice, this can make the difference between serving indices from disk versus from memory, paving the way for much cheaper deployments.
Petra Galuščáková, Douglas W. Oard, Suraj Nair
Two key assumptions shape the usual view of ranked retrieval: (1) that the
searcher can choose words for their query that might appear in the documents
that they wish to see, and (2) that ranking retrieved documents will suffice
because the searcher will be able to recognize those which they wished to find.
When the documents to be searched are in a language not known by the searcher,
neither assumption is true. In such cases, Cross-Language Information Retrieval
(CLIR) is needed. This chapter reviews the state of the art for CLIR and
outlines some open research questions.
Authors' comments: 49 pages, 0 figures
Jonathan J. Fortney, Joanna K. Barstow, Nikku Madhusudhan
This brief review focuses on methods and applications of modeling
exoplanetary atmospheres. We discuss various kinds of state of the art
self-consistent and retrieval models in 1D and multi-D with a focus on open
questions and short- and long-term goals in the field. Expertise previously
developed in modeling cool stellar atmospheres and in modeling solar system
planetary atmospheres has proven valuable to the field, and will continue to do
so. We described upcoming opportunities for making progress in our
understanding of atmospheres, and close with what we see as the field's
challenges.
Authors' comments: Accepted in Autumn of 2020. To appear as a book chapter in
"ExoFrontiers: Big questions in exoplanetary science", Ed. N Madhusudhan
(Bristol: IOP Publishing Ltd) AAS-IOP ebooks
https://iopscience.iop.org/bookListInfo/aas-iop-astronomy
Daniil Mamaev
A smooth function f in a neighbourhood of the unit sphere $S^{n - 1}$ is said
to admit index $\lambda$ if it can be extended to a function F in the unit ball
$B^n$ such that F has a unique critical point p and the Morse index of p is
equal to $\lambda$. It is easy to see that a function f cannot admit two
indices of different parity. We prove that for any two indices that differ by
two there exists a function f that admits both of them.
Authors' comments: 15 pages, 3 figures
Tor Lattimore, Botao Hao
We study a bandit version of phase retrieval where the learner chooses actions $(A_t)_{t=1}^n$ in the $d$-dimensional unit ball and the expected reward is $\langle A_t, \theta_\star\rangle^2$ where $\theta_\star \in \mathbb R^d$ is an unknown parameter vector. We prove that the minimax cumulative regret in this problem is $\smash{\tilde \Theta(d \sqrt{n})}$, which improves on the best known bounds by a factor of $\smash{\sqrt{d}}$. We also show that the minimax simple regret is $\smash{\tilde \Theta(d / \sqrt{n})}$ and that this is only achievable by an adaptive algorithm. Our analysis shows that an apparently convincing heuristic for guessing lower bounds can be misleading and that uniform bounds on the information ratio for information-directed sampling are not sufficient for optimal regret.
Haonan Wang, Chang Zhou, Carl Yang, Hongxia Yang, Jingrui He
In this paper, we identify and study an important problem of gradient item
retrieval. We define the problem as retrieving a sequence of items with a
gradual change on a certain attribute, given a reference item and a
modification text. For example, after a customer saw a white dress, she/he
wants to buy a similar one but more floral on it. The extent of "more floral"
is subjective, thus prompting one floral dress is hard to satisfy the
customer's needs. A better way is to present a sequence of products with
increasingly floral attributes based on the white dress, and allow the customer
to select the most satisfactory one from the sequence. Existing item retrieval
methods mainly focus on whether the target items appear at the top of the
retrieved sequence, but ignore the demand for retrieving a sequence of products
with gradual change on a certain attribute. To deal with this problem, we
propose a weakly-supervised method that can learn a disentangled item
representation from user-item interaction data and ground the semantic meaning
of attributes to dimensions of the item representation. Our method takes a
reference item and a modification as a query. During inference, we start from
the reference item and "walk" along the direction of the modification in the
item representation space to retrieve a sequence of items in a gradient manner.
We demonstrate our proposed method can achieve disentanglement through weak
supervision. Besides, we empirically show that an item sequence retrieved by
our method is gradually changed on an indicated attribute and, in the item
retrieval task, our method outperforms existing approaches on three different
datasets.
Authors' comments: Accepted by The International World Wide Web Conference (WWW), 2021
Karim Banawan, Ahmed Arafa, Sennur Ulukus
We introduce the problem of \emph{timely} private information retrieval (PIR)
from $N$ non-colluding and replicated servers. In this problem, a user desires
to retrieve a message out of $M$ messages from the servers, whose contents are
continuously updating. The retrieval process should be executed in a timely
manner such that no information is leaked about the identity of the message. To
assess the timeliness, we use the \emph{age of information} (AoI) metric.
Interestingly, the timely PIR problem reduces to an AoI minimization subject to
PIR constraints under \emph{asymmetric traffic}. We explicitly characterize the
optimal tradeoff between the PIR rate and the AoI metric (peak AoI or average
AoI) for the case of $N=2$, $M=3$. Further, we provide some structural insights
on the general problem with arbitrary $N$, $M$.
Authors' comments: Accepted for presentation in ISIT 2021
Òscar Lorente, Ian Riera, Shauryadeep Chaudhuri, Oriol Catalan, Víctor Casales
To retrieve images based on their content is one of the most studied topics in the field of computer vision. Nowadays, this problem can be addressed using modern techniques such as feature extraction using machine learning, but over the years different classical methods have been developed. In this paper, we implement a query by example retrieval system for finding paintings in a museum image collection using classic computer vision techniques. Specifically, we study the performance of the color, texture, text and feature descriptors in datasets with different perturbations in the images: noise, overlapping text boxes, color corruption and rotation. We evaluate each of the cases using the Mean Average Precision (MAP) metric, and we obtain results that vary between 0.5 and 1.0 depending on the problem conditions.
Xuyang Chang, Liheng Bian, Jun Zhang
High-throughput computational imaging requires efficient processing algorithms to retrieve multi-dimensional and multi-scale information. In computational phase imaging, phase retrieval (PR) is required to reconstruct both amplitude and phase in complex space from intensity-only measurements. The existing PR algorithms suffer from the tradeoff among low computational complexity, robustness to measurement noise and strong generalization on different modalities. In this work, we report an efficient large-scale phase retrieval technique termed as LPR. It extends the plug-and-play generalized-alternating-projection framework from real space to nonlinear complex space. The alternating projection solver and enhancing neural network are respectively derived to tackle the measurement formation and statistical prior regularization. This framework compensates the shortcomings of each operator, so as to realize high-fidelity phase retrieval with low computational complexity and strong generalization. We applied the technique for a series of computational phase imaging modalities including coherent diffraction imaging, coded diffraction pattern imaging, and Fourier ptychographic microscopy. Extensive simulations and experiments validate that the technique outperforms the existing PR algorithms with as much as 17dB enhancement on signal-to-noise ratio, and more than one order-of-magnitude increased running efficiency. Besides, we for the first time demonstrate ultra-large-scale phase retrieval at the 8K level (7680$\times$4320 pixels) in minute-level time.
David J. Brady, Timothy J. Schulz, Chengyu Wang
While characterization of coherent wavefields is essential to laser, x-ray and electron imaging, sensors measure the squared magnitude of the field, rather than the field itself. Holography or phase retrieval must be used to characterize the field. The need for a reference severely restricts the utility of holography. Phase retrieval, in contrast, is theoretically consistent with sensors that directly measure coherent or partially coherent fields with no prior assumptions. Unfortunately, phase retrieval has not yet been successfully implemented for large-scale fields. Here we show that both holography and phase retrieval are capable of quantum-limited coherent signal estimation and we describe phase retrieval strategies that approach the quantum limit for >1 megapixel fields. These strategies rely on group testing using networks of interferometers, such as might be constructed using emerging integrated photonic, plasmonic and/or metamaterial devices. Phase-sensitive sensor planes using such devices could eliminate the need both for lenses and reference signals, creating a path to large aperture diffraction limited laser imaging.
Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni
Entities are at the center of how we represent and aggregate knowledge. For
instance, Encyclopedias such as Wikipedia are structured by entities (e.g., one
per Wikipedia article). The ability to retrieve such entities given a query is
fundamental for knowledge-intensive tasks such as entity linking and
open-domain question answering. Current approaches can be understood as
classifiers among atomic labels, one for each entity. Their weight vectors are
dense entity representations produced by encoding entity meta information such
as their descriptions. This approach has several shortcomings: (i) context and
entity affinity is mainly captured through a vector dot product, potentially
missing fine-grained interactions; (ii) a large memory footprint is needed to
store dense representations when considering large entity sets; (iii) an
appropriately hard set of negative data has to be subsampled at training time.
In this work, we propose GENRE, the first system that retrieves entities by
generating their unique names, left to right, token-by-token in an
autoregressive fashion. This mitigates the aforementioned technical issues
since: (i) the autoregressive formulation directly captures relations between
context and entity name, effectively cross encoding both; (ii) the memory
footprint is greatly reduced because the parameters of our encoder-decoder
architecture scale with vocabulary size, not entity count; (iii) the softmax
loss is computed without subsampling negative data. We experiment with more
than 20 datasets on entity disambiguation, end-to-end entity linking and
document retrieval tasks, achieving new state-of-the-art or very competitive
results while using a tiny fraction of the memory footprint of competing
systems. Finally, we demonstrate that new entities can be added by simply
specifying their names. Code and pre-trained models at
https://github.com/facebookresearch/GENRE.
Authors' comments: Accepted (spotlight) at International Conference on Learning
Representations (ICLR) 2021. Code at
https://github.com/facebookresearch/GENRE. 20 pages, 9 figures, 8 tables
Martin Reiche, Peter Jung
This paper shows how data-driven deep generative models can be utilized to
solve challenging phase retrieval problems, in which one wants to reconstruct a
signal from only few intensity measurements. Classical iterative algorithms are
known to work well if initialized close to the optimum but otherwise suffer
from non-convexity and often get stuck in local minima. We therefore propose
DeepInit Phase Retrieval, which uses regularized gradient descent under a deep
generative data prior to compute a trained initialization for a fast classical
algorithm (e.g. the randomized Kaczmarz method). We empirically show that our
hybrid approach is able to deliver very high reconstruction results at low
sampling rates even when there is significant generator model error.
Conceptually, learned initializations may therefore help to overcome the
non-convexity of the problem by starting classical descent steps closer to the
global optimum. Also, our idea demonstrates superior runtime performance over
conventional gradient-based reconstruction methods. We evaluate our method for
generic measurements and show empirically that it is also applicable to
diffraction-type measurement models which are found in terahertz single-pixel
phase retrieval.
Authors' comments: 9 pages, 12 figures
Sajani Vithana, Karim Banawan, Sennur Ulukus
We investigate the problem of semantic private information retrieval
(semantic PIR). In semantic PIR, a user retrieves a message out of $K$
independent messages stored in $N$ replicated and non-colluding databases
without revealing the identity of the desired message to any individual
database. The messages come with \emph{different semantics}, i.e., the messages
are allowed to have \emph{non-uniform a priori probabilities} denoted by
$(p_i>0,\: i \in [K])$, which are a proxy for their respective popularity of
retrieval, and \emph{arbitrary message sizes} $(L_i,\: i \in [K])$. This is a
generalization of the classical private information retrieval (PIR) problem,
where messages are assumed to have equal a priori probabilities and equal
message sizes. We derive the semantic PIR capacity for general $K$, $N$. The
results show that the semantic PIR capacity depends on the number of databases
$N$, the number of messages $K$, the a priori probability distribution of
messages $p_i$, and the message sizes $L_i$. We present two achievable semantic
PIR schemes: The first one is a deterministic scheme which is based on message
asymmetry. This scheme employs non-uniform subpacketization. The second scheme
is probabilistic and is based on choosing one query set out of multiple options
at random to retrieve the required message without the need for exponential
subpacketization. We derive necessary and sufficient conditions for the
semantic PIR capacity to exceed the classical PIR capacity with equal priors
and sizes. Our results show that the semantic PIR capacity can be larger than
the classical PIR capacity when longer messages have higher popularities.
However, when messages are equal-length, the non-uniform priors cannot be
exploited to improve the retrieval rate over the classical PIR capacity.
Authors' comments: submitted for publication
Yiwei Zhang, Eitan Yaakobi, Tuvi Etzion
A \emph{private proximity retrieval} (\emph{PPR}) scheme is a protocol which allows a user to retrieve the identities of all records in a database that are within some distance $r$ from the user's record $x$. The user's \emph{privacy} at each server is given by the fraction of the record $x$ that is kept private. We assume that each server stores a copy of the database. In this paper, this research is initiated and protocols that offer trade-offs between privacy and computational complexity and storage are studied. In particular, we study the required minimum number of servers by our protocol which provides a given privacy level. Each server gets a query in the protocol and the set of queries form a code. We study the family of codes generated by the set of queries and in particular the minimum number of codewords in such a code which is the minimum number of servers required for the protocol. These codes are closely related to a family of codes known as \emph{covering designs}. We introduce several lower bounds on the sizes of such codes as well as several constructions. This work focuses on the case when the records are binary vectors together with the Hamming distance. Other metrics such as the Johnson metric are also investigated.
Marcello Balduccini, Emily LeBlanc
Information Retrieval (IR) aims at retrieving documents that are most
relevant to a query provided by a user. Traditional techniques rely mostly on
syntactic methods. In some cases, however, links at a deeper semantic level
must be considered. In this paper, we explore a type of IR task in which
documents describe sequences of events, and queries are about the state of the
world after such events. In this context, successfully matching documents and
query requires considering the events' possibly implicit, uncertain effects and
side-effects. We begin by analyzing the problem, then propose an action
language based formalization, and finally automate the corresponding IR task
using Answer Set Programming.
Authors' comments: Under consideration in Theory and Practice of Logic Programming
(TPLP)
Fabio Crestani, Stefano Mizzaro, Ivan Scagnetto
Mobile Information Retrieval (Mobile IR) is a relatively recent branch of
Information Retrieval (IR) that is concerned with enabling users to carry out,
using a mobile device, all the classical IR operations that they were used to
carry out on a desktop. This includes finding content available on local
repositories or on the web in response to a user query, interacting with the
system in an explicit or implicit way, reformulate the query and/or visualise
the content of the retrieved documents, as well as providing relevance
judgments to improve the retrieval process.
This book is structured as follows. Chapter 2 provides a very brief overview
of IR and of Mobile IR, briefly outlining what in Mobile IR is different from
IR. Chapter 3 provides the foundations of Mobile IR, looking at the
characteristics of mobile devices and what they bring to IR, but also looking
at how the concept of relevance changed from standard IR to Mobile IR. Chapter
4 presents an overview of the document collections that are searchable by a
Mobile IR system, and that are somehow different from classical IR ones;
available for experimentation, including collections of data that have become
complementary to Mobile IR. Similarly, Chapter 5 reviews mobile information
needs studies and users log analysis. Chapter 6 reviews studies aimed at
adapting and improving the users interface to the needs of Mobile IR. Chapter
7, instead, reviews work on context awareness, which studies the many aspects
of the user context that Mobile IR employs. Chapter 8 reviews some of
evaluation work done in Mobile IR, highlighting the distinctions with classical
IR from the perspectives of two main IR evaluation methodologies: users studies
and test collections. Finally, Chapter 9 reports the conclusions of this
review, highlighting briefly some trends in Mobile IR that we believe will
drive research in the next few years.
Authors' comments: 116 pages, published in 2017
Hsuan-Yin Lin, Siddhartha Kumar, Eirik Rosnes, Alexandre Graell i Amat, Eitan Yaakobi
Private information retrieval (PIR) protocols make it possible to retrieve a
file from a database without disclosing any information about the identity of
the file being retrieved. These protocols have been rigorously explored from an
information-theoretic perspective in recent years. While existing protocols
strictly impose that no information is leaked on the file's identity, this work
initiates the study of the tradeoffs that can be achieved by relaxing the
requirement of perfect privacy. In case the user is willing to leak some
information on the identity of the retrieved file, we study how the PIR rate,
as well as the upload cost and access complexity, can be improved. For the
particular case of replicated servers, we propose two weakly-private
information retrieval schemes based on two recent PIR protocols and a family of
schemes based on partitioning. Lastly, we compare the performance of the
proposed schemes.
Authors' comments: To be presented at 2019 IEEE International Symposium on Information
Theory (ISIT)
Subhadip Mukherjee, Chandra Sekhar Seelamantula
We address the problem of phase retrieval (PR) from quantized measurements. The goal is to reconstruct a signal from quadratic measurements encoded with a finite precision, which is indeed the case in many practical applications. We develop a rank-1 projection algorithm that recovers the signal subject to ensuring consistency with the measurement, that is, the recovered signal when encoded must yield the same set of measurements that one started with. The rank-1 projection stems from the idea of lifting, originally proposed in the context of PhaseLift. The consistency criterion is enforced using a one-sided quadratic cost. We also determine the probability with which different vectors lead to the same set of quantized measurements, which makes it impossible to resolve them. Naturally, this probability depends on how correlated such vectors are, and how coarsely/finely the measurements get quantized. The proposed algorithm is also capable of incorporating a sparsity constraint on the signal. An analysis of the cost function reveals that it is bounded, both above and below, by functions that are dependent on how well correlated the estimate is with the ground truth. We also derive the Cram\'er-Rao lower bound (CRB) on the achievable reconstruction accuracy. A comparison with the state-of-the- art algorithms shows that the proposed algorithm has a higher reconstruction accuracy and is about 2 to 3 dB away from the CRB. The edge, in terms of the reconstruction signal-to-noise ratio, over the competing algorithms is higher (about 5 to 6 dB) when the quantization is coarse.