Dani Kogan, Yonina C. Eldar, Dan Oron
The recovery of a signal from the magnitude of its Fourier transform, also known as phase retrieval, is of fundamental importance in many scientific fields. It is well known that due to the loss of Fourier phase the problem in 1D is ill-posed. Without further constraints, there is no unique solution to the problem. In contrast, uniqueness up to trivial ambiguities very often exists in higher dimensions, with mild constraints on the input. In this paper we focus on the 2D phase retrieval problem and provide insight into this uniqueness property by exploring the connection between the 2D and 1D formulations. In particular, we show that 2D phase retrieval can be cast as a 1D problem with additional constraints, which limit the solution space. We then prove that only one additional constraint is sufficient to reduce the many feasible solutions in the 1D setting to a unique solution for almost all signals. These results allow to obtain an analytical approach (with combinatorial complexity) to solve the 2D phase retrieval problem when it is unique.
Mina Nouredanesh, Hamid R. Tizhoosh, Ershad Banijamali
In recent years, advances in medical imaging have led to the emergence of
massive databases, containing images from a diverse range of modalities. This
has significantly heightened the need for automated annotation of the images on
one side, and fast and memory-efficient content-based image retrieval systems
on the other side. Binary descriptors have recently gained more attention as a
potential vehicle to achieve these goals. One of the recently introduced binary
descriptors for tagging of medical images are Radon barcodes (RBCs) that are
driven from Radon transform via local thresholding. Gabor transform is also a
powerful transform to extract texture-based information. Gabor features have
exhibited robustness against rotation, scale, and also photometric
disturbances, such as illumination changes and image noise in many
applications. This paper introduces Gabor Barcodes (GBCs), as a novel framework
for the image annotation. To find the most discriminative GBC for a given query
image, the effects of employing Gabor filters with different parameters, i.e.,
different sets of scales and orientations, are investigated, resulting in
different barcode lengths and retrieval performances. The proposed method has
been evaluated on the IRMA dataset with 193 classes comprising of 12,677 x-ray
images for indexing, and 1,733 x-rays images for testing. A total error score
as low as $351$ ($\approx 80\%$ accuracy for the first hit) was achieved.
Authors' comments: To appear in proceedings of The 2016 IEEE International Conference on
Image Processing (ICIP 2016), Sep 25-28, 2016, Phoenix, Arizona, USA
Felix Krahmer, Yi-Kai Liu
In the context of the phase retrieval problem, it is known that certain
natural classes of measurements, such as Fourier measurements and random
Bernoulli measurements, do not lead to the unique reconstruction of all
possible signals, even in combination with certain practically feasible random
masks. To avoid this difficulty, the analysis is often restricted to
measurement ensembles (or masks) that satisfy a small-ball probability
condition, in order to ensure that the reconstruction is unique.
This paper shows a complementary result: for random Bernoulli measurements,
there is still a large class of signals that can be reconstructed uniquely,
namely those signals that are "non-peaky." In fact, this result is much more
general: it holds for random measurements sampled from any subgaussian
distribution D, without any small-ball conditions. This is demonstrated in two
ways: first, a proof of stability and uniqueness, and second, a uniform
recovery guarantee for the PhaseLift algorithm. In all of these cases, the
number of measurements m approaches the information-theoretic lower bound.
Finally, for random Bernoulli measurements with erasures, it is shown that
PhaseLift achieves uniform recovery of all signals (including peaky ones).
Authors' comments: 15 pages; v3: to appear in IEEE Trans. Info. Theory; v2: minor
revisions and clarifications; presented in part at the 2015 SampTA
Conference, see http://doi.org/10.1109/SAMPTA.2015.7148923 and
http://doi.org/10.1109/SAMPTA.2015.7148966
Raphael R. Toledo, George Danezis, Ian Goldberg
Private Information Retrieval (PIR), despite being well studied, is computationally costly and arduous to scale. We explore lower-cost relaxations of information-theoretic PIR, based on dummy queries, sparse vectors, and compositions with an anonymity system. We prove the security of each scheme using a flexible differentially private definition for private queries that can capture notions of imperfect privacy. We show that basic schemes are weak, but some of them can be made arbitrarily safe by composing them with large anonymity systems.
Qingqun Ning, Jianke Zhu, Zhiyuan Zhong, Steven C. H. Hoi, Chun Chen
Fast Approximate Nearest Neighbor (ANN) search technique for high-dimensional
feature indexing and retrieval is the crux of large-scale image retrieval. A
recent promising technique is Product Quantization, which attempts to index
high-dimensional image features by decomposing the feature space into a
Cartesian product of low dimensional subspaces and quantizing each of them
separately. Despite the promising results reported, their quantization approach
follows the typical hard assignment of traditional quantization methods, which
may result in large quantization errors and thus inferior search performance.
Unlike the existing approaches, in this paper, we propose a novel approach
called Sparse Product Quantization (SPQ) to encoding the high-dimensional
feature vectors into sparse representation. We optimize the sparse
representations of the feature vectors by minimizing their quantization errors,
making the resulting representation is essentially close to the original data
in practice. Experiments show that the proposed SPQ technique is not only able
to compress data, but also an effective encoding technique. We obtain
state-of-the-art results for ANN search on four public image datasets and the
promising results of content-based image retrieval further validate the
efficacy of our proposed method.
Authors' comments: 12 pages
Tanaka Kanji
Change detection, i.e., anomaly detection from local maps built by a mobile
robot at multiple different times, is a challenging problem to solve in
practice. Most previous work either cannot be applied to scenarios where the
size of the map collection is large, or simply assumed that the robot
self-location is globally known. In this paper, we tackle the problem of
simultaneous self-localization and change detection, by reformulating the
problem as a map retrieval problem, and propose a local map descriptor with a
compressed bag-of-words (BoW) structure as a scalable solution. We make the
following contributions. (1) To enable a direct comparison of the spatial
layout of visual features between different local maps, the origin of the local
map coordinate (termed "viewpoint") is planned by scene parsing and determined
by our "viewpoint planner" to be invariant against small variations in
self-location and changes, aiming at providing similar viewpoints for similar
scenes (i.e., the relevant map pair). (2) We extend the BoW model to enable the
use of not only the appearance (e.g., polestar) but also the spatial layout
(e.g., spatial pyramid) of visual features with respect to the planned
viewpoint. The key observation is that the planned viewpoint (i.e., the origin
of local map coordinate) acts as a pseudo viewpoint that is usually required by
spatial BoW (e.g., SPM) and also by anomaly detection (e.g., NN-d, LOF). (3)
Experimental results on a challenging "loop-closing" scenario show that the
proposed method outperforms previous BoW methods in self-localization, and
furthermore, that the use of both appearance and pose information in change
detection produces better results than the use of either information alone.
Authors' comments: 8 pages, 7 figures, Draft of a paper submitted to an International
Conference
Hua Sun, Syed A. Jafar
In the private information retrieval (PIR) problem a user wishes to retrieve, as efficiently as possible, one out of $K$ messages from $N$ non-communicating databases (each holds all $K$ messages) while revealing nothing about the identity of the desired message index to any individual database. The information theoretic capacity of PIR is the maximum number of bits of desired information that can be privately retrieved per bit of downloaded information. For $K$ messages and $N$ databases, we show that the PIR capacity is $(1+1/N+1/N^2+\cdots+1/N^{K-1})^{-1}$. A remarkable feature of the capacity achieving scheme is that if we eliminate any subset of messages (by setting the message symbols to zero), the resulting scheme also achieves the PIR capacity for the remaining subset of messages.
Ju Sun, Qing Qu, John Wright
Can we recover a complex signal from its Fourier magnitudes? More generally,
given a set of $m$ measurements, $y_k = |\mathbf a_k^* \mathbf x|$ for $k = 1,
\dots, m$, is it possible to recover $\mathbf x \in \mathbb{C}^n$ (i.e.,
length-$n$ complex vector)? This **generalized phase retrieval** (GPR) problem
is a fundamental task in various disciplines, and has been the subject of much
recent investigation. Natural nonconvex heuristics often work remarkably well
for GPR in practice, but lack clear theoretical explanations. In this paper, we
take a step towards bridging this gap. We prove that when the measurement
vectors $\mathbf a_k$'s are generic (i.i.d. complex Gaussian) and the number of
measurements is large enough ($m \ge C n \log^3 n$), with high probability, a
natural least-squares formulation for GPR has the following benign geometric
structure: (1) there are no spurious local minimizers, and all global
minimizers are equal to the target signal $\mathbf x$, up to a global phase;
and (2) the objective function has a negative curvature around each saddle
point. This structure allows a number of iterative optimization methods to
efficiently find a global minimizer, without special initialization. To
corroborate the claim, we describe and analyze a second-order trust-region
algorithm.
Authors' comments: 61 pages, 5 figures. A short version can be found here
http://sunju.org/docs/PR_G4_16.pdf . Revised according to reviewers' feedback
Sreyasi Nag Chowdhury, Mateusz Malinowski, Andreas Bulling, Mario Fritz
The widespread integration of cameras in hand-held and head-worn devices as
well as the ability to share content online enables a large and diverse visual
capture of the world that millions of users build up collectively every day. We
envision these images as well as associated meta information, such as GPS
coordinates and timestamps, to form a collective visual memory that can be
queried while automatically taking the ever-changing context of mobile users
into account. As a first step towards this vision, in this work we present
Xplore-M-Ego: a novel media retrieval system that allows users to query a
dynamic database of images and videos using spatio-temporal natural language
queries. We evaluate our system using a new dataset of real user queries as
well as through a usability study. One key finding is that there is a
considerable amount of inter-user variability, for example in the resolution of
spatial relations in natural language utterances. We show that our retrieval
system can cope with this variability using personalisation through an online
learning-based retrieval formulation.
Authors' comments: 8 pages, 9 figures, 1 table
Andreas M. Tillmann, Yonina C. Eldar, Julien Mairal
We propose a new algorithm to learn a dictionary for reconstructing and sparsely encoding signals from measurements without phase. Specifically, we consider the task of estimating a two-dimensional image from squared-magnitude measurements of a complex-valued linear transformation of the original image. Several recent phase retrieval algorithms exploit underlying sparsity of the unknown signal in order to improve recovery performance. In this work, we consider such a sparse signal prior in the context of phase retrieval, when the sparsifying dictionary is not known in advance. Our algorithm jointly reconstructs the unknown signal - possibly corrupted by noise - and learns a dictionary such that each patch of the estimated image can be sparsely represented. Numerical experiments demonstrate that our approach can obtain significantly better reconstructions for phase retrieval problems with noise than methods that cannot exploit such "hidden" sparsity. Moreover, on the theoretical side, we provide a convergence result for our method.
Hua Sun, Syed A. Jafar
Blind interference alignment (BIA) refers to interference alignment schemes that are designed only based on channel coherence pattern knowledge at the transmitters (the "blind" transmitters do not know the exact channel values). Private information retrieval (PIR) refers to the problem where a user retrieves one out of K messages from N non-communicating databases (each holds all K messages) without revealing anything about the identity of the desired message index to any individual database. In this paper, we identify an intriguing connection between PIR and BIA. Inspired by this connection, we characterize the information theoretic optimal download cost of PIR, when we have K = 2 messages and the number of databases, N, is arbitrary.
Jameson Cahill, Peter G. Casazza, Ingrid Daubechies
The main result of this paper states that phase retrieval in infinite-dimensional Hilbert spaces is never uniformly stable, in sharp contrast to the finite dimensional setting in which phase retrieval is always stable. This leads us to derive stability results for signals depending on how well they are approximated by finite expansions.
Marc Sloan, Jun Wang
Theoretical frameworks like the Probability Ranking Principle and its more
recent Interactive Information Retrieval variant have guided the development of
ranking and retrieval algorithms for decades, yet they are not capable of
helping us model problems in Dynamic Information Retrieval which exhibit the
following three properties; an observable user signal, retrieval over multiple
stages and an overall search intent. In this paper a new theoretical framework
for retrieval in these scenarios is proposed. We derive a general dynamic
utility function for optimizing over these types of tasks, that takes into
account the utility of each stage and the probability of observing user
feedback. We apply our framework to experiments over TREC data in the dynamic
multi page search scenario as a practical demonstration of its effectiveness
and to frame the discussion of its use, its limitations and to compare it
against the existing frameworks.
Authors' comments: ACM SIGIR International Conference on the Theory of Information
Retrieval (ICTIR), 10 pages, 4 figures, 2 algorithms, 3 tables. in
Proceedings of the 2015 International Conference on The Theory of Information
Retrieval
Tao Lei, Hrishikesh Joshi, Regina Barzilay, Tommi Jaakkola, Katerina Tymoshenko, Alessandro Moschitti, Lluis Marquez
Question answering forums are rapidly growing in size with no effective
automated ability to refer to and reuse answers already available for previous
posted questions. In this paper, we develop a methodology for finding
semantically related questions. The task is difficult since 1) key pieces of
information are often buried in extraneous details in the question body and 2)
available annotations on similar questions are scarce and fragmented. We design
a recurrent and convolutional model (gated convolution) to effectively map
questions to their semantic representations. The models are pre-trained within
an encoder-decoder framework (from body to title) on the basis of the entire
raw corpus, and fine-tuned discriminatively from limited annotations. Our
evaluation demonstrates that our model yields substantial gains over a standard
IR baseline and various neural network architectures (including CNNs, LSTMs and
GRUs).
Authors' comments: NAACL 2016
Philipp Walk, Henning Becker, Peter Jung
Pilot-aided channel estimation is nowadays a standard component in each
wireless receiver enabling coherent transmission of complex-valued
constellations, only affected by noise and interference. Whenever these
disturbances are sufficiently small and long data frames are used, high data
rates can be achieved and the resource overhead due to the pilots vanishes
asymptotically. On the other, it is expected that for the next generation of
mobile networks not only data rate is in the main focus but also low latency,
short and sporadic messages, massive connectivity, distributed & adhoc
processing and robustness with respect to asynchronism. Therefore a review of
several well-established principles in communication has been started already.
A particular implication when using complex-valued pilots is that these values
have to be known at the receiver and therefore these resources can not be used
simultaneously for user data. For an OFDM-like multicarrier scheme this means
that pilot tones (usually placed equidistantly according to the Nyquist
theorem) are allocated with globally known amplitudes and phases to reconstruct
the channel impulse response. Phases are designed and allocated globally which
is in contrast to a distributed infrastructure. In this work we present
therefore a new phaseless pilot scheme where only pilot amplitudes need to be
known at the receiver, i.e., phases are available again and can be used for
various other purposes. The idea is based on a phase retrieval result for
symmetrized and zero-padded magnitude Fourier measurements obtained by two of
the authors. The phases on the pilot tones can now be used to carry additional
user-specific data or compensate for other signal characteristics, like the
PAPR.
Authors' comments: 8 pages, 2 figures, presented on Asilomar 2015
Tianyu Qiu, Prabhu Babu, Daniel P. Palomar
This paper considers the phase retrieval problem in which measurements consist of only the magnitude of several linear measurements of the unknown, e.g., spectral components of a time sequence. We develop low-complexity algorithms with superior performance based on the majorization-minimization (MM) framework. The proposed algorithms are referred to as PRIME: Phase Retrieval vIa the Majorization-minimization techniquE. They are preferred to existing benchmark methods since at each iteration a simple surrogate problem is solved with a closed-form solution that monotonically decreases the original objective function. In total, four algorithms are proposed using different majorization-minimization techniques. Experimental results validate that our algorithms outperform existing methods in terms of successful recovery and mean square error under various settings.
Shelby Kimmel, Yi-Kai Liu
We consider a variant of the phase retrieval problem, where vectors are
replaced by unitary matrices, i.e., the unknown signal is a unitary matrix U,
and the measurements consist of squared inner products |Tr(C*U)|^2 with unitary
matrices C that are chosen by the observer. This problem has applications to
quantum process tomography, when the unknown process is a unitary operation.
We show that PhaseLift, a convex programming algorithm for phase retrieval,
can be adapted to this matrix setting, using measurements that are sampled from
unitary 4- and 2-designs. In the case of unitary 4-design measurements, we show
that PhaseLift can reconstruct all unitary matrices, using a near-optimal
number of measurements. This extends previous work on PhaseLift using spherical
4-designs.
In the case of unitary 2-design measurements, we show that PhaseLift still
works pretty well on average: it recovers almost all signals, up to a constant
additive error, using a near-optimal number of measurements. These 2-design
measurements are convenient for quantum process tomography, as they can be
implemented via randomized benchmarking techniques. This is the first positive
result on PhaseLift using 2-designs.
Authors' comments: 21 pages; v3: minor revisions, to appear at SampTA 2017; v2:
rewritten to focus on phase retrieval, with new title, improved error bounds,
and numerics; v1: original version, titled "Quantum Compressed Sensing Using
2-Designs"
Kishore Jaganathan, Yonina C. Eldar, Babak Hassibi
The problem of phase retrieval is a classic one in optics and arises when one is interested in recovering an unknown signal from the magnitude (intensity) of its Fourier transform. While there have existed quite a few approaches to phase retrieval, recent developments in compressed sensing and convex optimization-based signal recovery have inspired a host of new ones. This work presents an overview of these approaches. Since phase retrieval, by its very nature, is ill-posed, to make the problem meaningful one needs to either assume prior structure on the signal (e.g., sparsity) or obtain additional measurements (e.g., masks, structured illuminations). For both the cases, we review conditions for the identifiability of the signal, as well as practical algorithms for signal recovery. In particular, we demonstrate that it is possible to robustly and efficiently identify an unknown signal solely from phaseless Fourier measurements, a fact with potentially far-reaching implications.
Artem Babenko, Victor Lempitsky
Several recent works have shown that image descriptors produced by deep
convolutional neural networks provide state-of-the-art performance for image
classification and retrieval problems. It has also been shown that the
activations from the convolutional layers can be interpreted as local features
describing particular image regions. These local features can be aggregated
using aggregation approaches developed for local features (e.g. Fisher
vectors), thus providing new powerful global descriptors.
In this paper we investigate possible ways to aggregate local deep features
to produce compact global descriptors for image retrieval. First, we show that
deep features and traditional hand-engineered features have quite different
distributions of pairwise similarities, hence existing aggregation methods have
to be carefully re-evaluated. Such re-evaluation reveals that in contrast to
shallow features, the simple aggregation method based on sum pooling provides
arguably the best performance for deep convolutional features. This method is
efficient, has few parameters, and bears little risk of overfitting when e.g.
learning the PCA matrix. Overall, the new compact global descriptor improves
the state-of-the-art on four common benchmarks considerably.
Authors' comments: accepted for ICCV 2015
Yusuke Matsui, Kota Ito, Yuji Aramaki, Toshihiko Yamasaki, Kiyoharu Aizawa
Manga (Japanese comics) are popular worldwide. However, current e-manga
archives offer very limited search support, including keyword-based search by
title or author, or tag-based categorization. To make the manga search
experience more intuitive, efficient, and enjoyable, we propose a content-based
manga retrieval system. First, we propose a manga-specific image-describing
framework. It consists of efficient margin labeling, edge orientation histogram
feature description, and approximate nearest-neighbor search using product
quantization. Second, we propose a sketch-based interface as a natural way to
interact with manga content. The interface provides sketch-based querying,
relevance feedback, and query retouch. For evaluation, we built a novel dataset
of manga images, Manga109, which consists of 109 comic books of 21,142 pages
drawn by professional manga artists. To the best of our knowledge, Manga109 is
currently the biggest dataset of manga images available for research. We
conducted a comparative study, a localization evaluation, and a large-scale
qualitative study. From the experiments, we verified that: (1) the retrieval
accuracy of the proposed method is higher than those of previous methods; (2)
the proposed method can localize an object instance with reasonable runtime and
accuracy; and (3) sketch querying is useful for manga search.
Authors' comments: 13 pages