Chenglong Bao, George Barbastathis, Hui Ji, Zuowei Shen, Zhengyun Zhang
The mutual intensity and its equivalent phase-space representations quantify
an optical field's state of coherence and are important tools in the study of
light propagation and dynamics, but they can only be estimated indirectly from
measurements through a process called coherence retrieval, otherwise known as
phase-space tomography. As practical considerations often rule out the
availability of a complete set of measurements, coherence retrieval is usually
a challenging high-dimensional ill-posed inverse problem. In this paper, we
propose a trace-regularized optimization model for coherence retrieval and a
provably-convergent adaptive accelerated proximal gradient algorithm for
solving the resulting problem. Applying our model and algorithm to both
simulated and experimental data, we demonstrate an improvement in
reconstruction quality over previous models as well as an increase in
convergence speed compared to existing first-order methods.
Authors' comments: 28 pages, 10 figures, accepted for publication in SIAM Journal on
Imaging Sciences
Veit Elser, Ti-Yen Lan, Tamir Bendory
In recent years, the mathematical and algorithmic aspects of the phase
retrieval problem have received considerable attention. Many papers in this
area mention crystallography as a principal application. In crystallography,
the signal to be recovered is periodic and comprised of atomic distributions
arranged homogeneously in the unit cell of the crystal. The crystallographic
problem is both the leading application and one of the hardest forms of phase
retrieval. We have constructed a graded set of benchmark problems for
evaluating algorithms that perform this type of phase retrieval. The data,
publicly available online, is provided in an easily interpretable format. We
also propose a simple and unambiguous success/failure criterion based on the
actual needs in crystallography. Baseline runtimes were obtained with an
iterative algorithm that is similar but more transparent than those used in
crystallography. Empirically, the runtimes grow exponentially with respect to a
new hardness parameter: the sparsity of the signal autocorrelation. We also
review the algorithms used by the leading software packages. This set of
benchmark problems, we hope, will encourage the development of new algorithms
for the phase retrieval problem in general, and crystallography in particular.
Authors' comments: 27 pages, 10 figures, new references, modified appendix, improved
presentation
Federico Nanni, Bhaskar Mitra, Matt Magnusson, Laura Dietz
Retrieving paragraphs to populate a Wikipedia article is a challenging task. The new TREC Complex Answer Retrieval (TREC CAR) track introduces a comprehensive dataset that targets this retrieval scenario. We present early results from a variety of approaches -- from standard information retrieval methods (e.g., tf-idf) to complex systems that using query expansion using knowledge bases and deep neural networks. The goal is to offer future participants of this track an overview of some promising approaches to tackle this problem.
Bhaskar Mitra, Nick Craswell
Neural ranking models for information retrieval (IR) use shallow or deep neural networks to rank search results in response to a query. Traditional learning to rank models employ machine learning techniques over hand-crafted IR features. By contrast, neural models learn representations of language from raw text that can bridge the gap between query and document vocabulary. Unlike classical IR models, these new machine learning based approaches are data-hungry, requiring large scale training data before they can be deployed. This tutorial introduces basic concepts and intuitions behind neural IR models, and places them in the context of traditional retrieval models. We begin by introducing fundamental concepts of IR and different neural and non-neural approaches to learning vector representations of text. We then review shallow neural IR methods that employ pre-trained neural term embeddings without learning the IR task end-to-end. We introduce deep neural networks next, discussing popular deep architectures. Finally, we review the current DNN models for information retrieval. We conclude with a discussion on potential future directions for neural IR.
Christina Lioma, Birger Larsen, Wei Lu
Typically, every part in most coherent text has some plausible reason for its presence, some function that it performs to the overall semantics of the text. Rhetorical relations, e.g. contrast, cause, explanation, describe how the parts of a text are linked to each other. Knowledge about this socalled discourse structure has been applied successfully to several natural language processing tasks. This work studies the use of rhetorical relations for Information Retrieval (IR): Is there a correlation between certain rhetorical relations and retrieval performance? Can knowledge about a document's rhetorical relations be useful to IR? We present a language model modification that considers rhetorical relations when estimating the relevance of a document to a query. Empirical evaluation of different versions of our model on TREC settings shows that certain rhetorical relations can benefit retrieval effectiveness notably (> 10% in mean average precision over a state-of-the-art baseline).
Walid Shalaby, Wlodek Zadrozny
With the ever increasing number of filed patent applications every year, the need for effective and efficient systems for managing such tremendous amounts of data becomes inevitably important. Patent Retrieval (PR) is considered the pillar of almost all patent analysis tasks. PR is a subfield of Information Retrieval (IR) which is concerned with developing techniques and methods that effectively and efficiently retrieve relevant patent documents in response to a given search request. In this paper we present a comprehensive review on PR methods and approaches. It is clear that, recent successes and maturity in IR applications such as Web search cannot be transferred directly to PR without deliberate domain adaptation and customization. Furthermore, state-of-the-art performance in automatic PR is still around average in terms of recall. These observations motivate the need for interactive search tools which provide cognitive assistance to patent professionals with minimal effort. These tools must also be developed in hand with patent professionals considering their practices and expectations. We additionally touch on related tasks to PR such as patent valuation, litigation, licensing, and highlight potential opportunities and open directions for computational scientists in these domains.
Joani Mitro
This paper functions as a tutorial for individuals interested to enter the
field of information retrieval but wouldn't know where to begin from. It
describes two fundamental yet efficient image retrieval techniques, the first
being k - nearest neighbors (knn) and the second support vector machines(svm).
The goal is to provide the reader with both the theoretical and practical
aspects in order to acquire a better understanding. Along with this tutorial we
have also developed the equivalent software1 using the MATLAB environment in
order to illustrate the techniques, so that the reader can have a hands-on
experience.
Authors' comments: Technical Report
P. Chen, A. Fannjiang, G. Liu
The null vector method, based on a simple linear algebraic concept, is proposed as a solution to the phase retrieval problem. In the case with complex Gaussian random measurement matrices, a non-asymptotic error bound is derived, yielding an asymptotic regime of accurate approximation comparable to that for the spectral vector method.
Antonia Creswell, Anil Anthony Bharath
Generative Adversarial Networks (GAN) are able to learn excellent
representations for unlabelled data which can be applied to image generation
and scene classification. Representations learned by GANs have not yet been
applied to retrieval. In this paper, we show that the representations learned
by GANs can indeed be used for retrieval. We consider heritage documents that
contain unlabelled Merchant Marks, sketch-like symbols that are similar to
hieroglyphs. We introduce a novel GAN architecture with design features that
make it suitable for sketch retrieval. The performance of this sketch-GAN is
compared to a modified version of the original GAN architecture with respect to
simple invariance properties. Experiments suggest that sketch-GANs learn
representations that are suitable for retrieval and which also have increased
stability to rotation, scale and translation compared to the standard GAN
architecture.
Authors' comments: Accepted to ECCV2016 VisArt Workshop
Kalpa Gunaratna
Document retrieval has been an important research problem over many years in the information retrieval community. State-of-the-art techniques utilize various methods in matching documents to a given document including keywords, phrases, and annotations. In this paper, we propose a new approach for document retrieval that utilizes predications (subject-predicate-object triples) extracted from the documents. We represent documents as sets of predications. We measure the similarity between predications to compute the similarity between documents. Our approach utilizes the hierarchical information available in ontologies in computing concept-concept similarity, making the approach flexible. Predication-based document similarity is more precise and forms the basis for a semantically aware document retrieval system. We show that the approach is competitive with an existing state-of-the-art related document retrieval technique in the biomedical domain.
Kinjalk Lochan, Sumanta Chakraborty, T. Padmanabhan
It is generally believed that, when matter collapses to form a black hole,
the complete information about the initial state of the matter cannot be
retrieved by future asymptotic observers, through local measurements. This is
contrary to the expectation from a unitary evolution in quantum theory and
leads to (a version of) the black hole information paradox. Classically,
nothing else, apart from mass, charge and angular momentum is expected to be
revealed to such asymptotic observers after the formation of a black hole.
Semi-classically, black holes evaporate after their formation through the
Hawking radiation. The dominant part of the radiation is expected to be thermal
and hence one cannot know anything about the initial data from the resultant
radiation. However, there can be sources of distortions which make the
radiation non-thermal. Although the distortions are not strong enough to make
the evolution unitary, these distortions carry some part of information
regarding the in-state. In this work, we show how one can decipher the
information about the in-state of the field from these distortions. We show
that the distortions of a particular kind --- which we call {\it non-vacuum
distortions} --- can be used to \emph{fully} reconstruct the initial data. The
asymptotic observer can do this operationally by measuring certain well-defined
observables of the quantum field at late times. We demonstrate that a general
class of in-states encode all their information content in the correlation of
late time out-going modes. Further, using a $1+1$ dimensional CGHS model to
accommodate back-reaction self-consistently, we show that observers can also
infer and track the information content about the initial data, during the
course of evaporation, unambiguously. Implications of such information
extraction are discussed.
Authors' comments: 23 pages; 2 figures
Maura B. Paterson, Douglas R. Stinson, Jalaj Upadhyay
There has been considerable recent interest in "cloud storage" wherein a user
asks a server to store a large file. One issue is whether the user can verify
that the server is actually storing the file, and typically a
challenge-response protocol is employed to convince the user that the file is
indeed being stored correctly. The security of these schemes is phrased in
terms of an extractor which will recover the file given any "proving algorithm"
that has a sufficiently high success probability. This forms the basis of
\emph{proof-of-retrievability} ($\mathsf{PoR}$) systems.
In this paper, we study multiple server $\mathsf{PoR}$ systems. We formalize
security definitions for two possible scenarios: (i) when a threshold of
servers succeed with high enough probability (worst-case) and (ii) when the
average of the success probability of all the servers is above a threshold
(average-case). We also motivate the study of confidentiality of the outsourced
message. We give $\mathsf{M}\mbox{-}\mathsf{PoR}$ schemes which are secure
under both these security definitions and provide reasonable confidentiality
guarantees even when there is no restriction on the computational power of the
servers. We also show how classical statistical techniques used by Paterson,
Stinson and Upadhyay (Journal of Mathematical Cryptology: 7(3)) can be extended
to evaluate whether the responses of the provers are accurate enough to permit
successful extraction. We also look at one specific instantiation of our
construction when instantiated with the unconditionally secure version of the
Shacham-Waters scheme (Asiacrypt, 2008). This scheme gives reasonable security
and privacy guarantee. We show that, in the multi-server setting with
computationally unbounded provers, one can overcome the limitation that the
verifier needs to store as much secret information as the provers.
Authors' comments: 22 pages
Veit Elser
Bit retrieval is the problem of reconstructing a binary sequence from its
periodic autocorrelation, with applications in cryptography and x-ray
crystallography. After defining the problem, with and without noise, we
describe and compare various algorithms for solving it. A geometrical
constraint satisfaction algorithm, relaxed-reflect-reflect, is currently the
best algorithm for noisy bit retrieval.
Authors' comments: 42 pages, 15 figures
Irène Waldspurger
We describe a new algorithm to solve a particular phase retrieval problem,
that has wide applications in audio processing: the reconstruction of a
function from its scalogram, that is from the modulus of its wavelet transform.
It is a multiscale iterative algorithm. To efficiently propagate phase
information from low to high frequencies, it uses an equivalent formulation of
the phase retrieval problem that involves the holomorphic extension of the
wavelet transform. Our numerical results, on audio and non-audio signals, show
that reconstruction is precise and stable to noise. The algorithm has a linear
complexity in the size of the signal, up to logarithmic factors, and can thus
be used on large signals.
Authors' comments: Analysis of the algorithm for Cauchy wavelets + various updates
Emad Elabd, Eissa Alshari, Hatem Abdulkader
Arabic language is one of the most widely spoken languages. This language has
a complex morphological structure and is considered as one of the most prolific
languages in terms of article linguistic. Therefore, Arabic Information
Retrieval (AIR) models need specific techniques to deal with this complex
morphological structure. This paper aims to develop an integrate AIR
frameworks. It lists and analysis the different Information Retrieval (IR)
methods and techniques such as query processing, stemming and indexing which
are used in AIR systems. We conclude that AIR frameworks have a weakness to
deal with semantic in term of indexing, Boolean model, Latent Semantic Analysis
(LSA), Latent Semantic Index (LSI) and semantic ranking. Therefore, semantic
Boolean IR framework is proposed in this paper. This model is implemented and
the precision, recall and run time are measured and compared with the
traditional IR model.
Authors' comments: in The International Arab Journal of Information Technology, Vol. 12,
No. 3, May 2015
Eissa M. Alshari
The continuous increasing in the amount of the published and stored information requires a special Information Retrieval (IR) frameworks to search and get information accurately and speedily. Currently, keywords-based techniques are commonly used in information retrieval. However, a major drawback of the keywords approach is its inability of handling the polysemy and synonymy phenomenon of the natural language. For instance, the meanings of words and understanding of concepts differ in different communities. Same word use for different concepts (polysemy) or use different words for the same concept (synonymy). Most of information retrieval frameworks have a weakness to deal with the semantics of the words in term of (indexing, Boolean model, Latent Semantic Analysis (LSA) , Latent semantic Index (LSI) and semantic ranking, etc.). Traditional Arabic Information Retrieval (AIR) models performance insufficient with semantic queries, which deal with not only the keywords but also with the context of these keywords. Therefore, there is a need for a semantic information retrieval model with a semantic index structure and ranking algorithm based on semantic index.
Sara Botelho-Andrade, Peter G. Casazza, Hanh Van Nguyen, Janet C. Tremain
In 2006, Balan/Casazza/Edidin \cite{BCE} introduced the frame theoretic study of phaseless reconstruction. Since then, this has turned into a very active area of research. Over the years, many people have replaced the term {\it phaseless reconstruction} with {\it phase retrieval}. Casazza then asked: {\it Are these really the same?} In this paper, we will show that phase retrieval is equivalent to phaseless reconstruction. We then show, more generally, that phase retrieval by projections is equivalent to phaseless reconstruction by projections. Finally, we study {\it weak phase retrieval} and discover that it is very different from phaseless reconstruction.
Yunchao Wei, Yao Zhao, Zhenfeng Zhu, Shikui Wei, Yanhui Xiao, Jiashi Feng, Shuicheng Yan
In this paper, we investigate the cross-media retrieval between images and
text, i.e., using image to search text (I2T) and using text to search images
(T2I). Existing cross-media retrieval methods usually learn one couple of
projections, by which the original features of images and text can be projected
into a common latent space to measure the content similarity. However, using
the same projections for the two different retrieval tasks (I2T and T2I) may
lead to a tradeoff between their respective performances, rather than their
best performances. Different from previous works, we propose a
modality-dependent cross-media retrieval (MDCR) model, where two couples of
projections are learned for different cross-media retrieval tasks instead of
one couple of projections. Specifically, by jointly optimizing the correlation
between images and text and the linear regression from one modal space (image
or text) to the semantic space, two couples of mappings are learned to project
images and text from their original feature spaces into two common latent
subspaces (one for I2T and the other for T2I). Extensive experiments show the
superiority of the proposed MDCR compared with other methods. In particular,
based the 4,096 dimensional convolutional neural network (CNN) visual feature
and 100 dimensional LDA textual feature, the mAP of the proposed method
achieves 41.5\%, which is a new state-of-the-art performance on the Wikipedia
dataset.
Authors' comments: in ACM Transactions on Intelligent Systems and Technology
Irena Bojarovska, Axel Flinth
Compressed sensing investigates the recovery of sparse signals from linear measurements. But often, in a wide range of applications, one is given only the absolute values (squared) of the linear measurements. Recovering such signals (not necessarily sparse) is known as the phase retrieval problem. We consider this problem in the case when the measurements are time-frequency shifts of a suitably chosen generator, i.e. coming from a Gabor frame. We prove an easily checkable injectivity condition for recovery of any signal from all $N^2$ time-frequency shifts, and for recovery of sparse signals, when only some of those measurements are given.
Gil Ben-Artzi, Michael Werman, Shmuel Peleg
We introduce a simple and effective method for retrieval of videos showing a specific event, even when the videos of that event were captured from significantly different viewpoints. Appearance-based methods fail in such cases, as appearances change with large changes of viewpoints. Our method is based on a pixel-based feature, "motion barcode", which records the existence/non-existence of motion as a function of time. While appearance, motion magnitude, and motion direction can vary greatly between disparate viewpoints, the existence of motion is viewpoint invariant. Based on the motion barcode, a similarity measure is developed for videos of the same event taken from very different viewpoints. This measure is robust to occlusions common under different viewpoints, and can be computed efficiently. Event retrieval is demonstrated using challenging videos from stationary and hand held cameras.