Yunchao Wei, Yao Zhao, Zhenfeng Zhu, Shikui Wei, Yanhui Xiao, Jiashi Feng, Shuicheng Yan
In this paper, we investigate the cross-media retrieval between images and
text, i.e., using image to search text (I2T) and using text to search images
(T2I). Existing cross-media retrieval methods usually learn one couple of
projections, by which the original features of images and text can be projected
into a common latent space to measure the content similarity. However, using
the same projections for the two different retrieval tasks (I2T and T2I) may
lead to a tradeoff between their respective performances, rather than their
best performances. Different from previous works, we propose a
modality-dependent cross-media retrieval (MDCR) model, where two couples of
projections are learned for different cross-media retrieval tasks instead of
one couple of projections. Specifically, by jointly optimizing the correlation
between images and text and the linear regression from one modal space (image
or text) to the semantic space, two couples of mappings are learned to project
images and text from their original feature spaces into two common latent
subspaces (one for I2T and the other for T2I). Extensive experiments show the
superiority of the proposed MDCR compared with other methods. In particular,
based the 4,096 dimensional convolutional neural network (CNN) visual feature
and 100 dimensional LDA textual feature, the mAP of the proposed method
achieves 41.5\%, which is a new state-of-the-art performance on the Wikipedia
dataset.
Authors' comments: in ACM Transactions on Intelligent Systems and Technology
Irena Bojarovska, Axel Flinth
Compressed sensing investigates the recovery of sparse signals from linear measurements. But often, in a wide range of applications, one is given only the absolute values (squared) of the linear measurements. Recovering such signals (not necessarily sparse) is known as the phase retrieval problem. We consider this problem in the case when the measurements are time-frequency shifts of a suitably chosen generator, i.e. coming from a Gabor frame. We prove an easily checkable injectivity condition for recovery of any signal from all $N^2$ time-frequency shifts, and for recovery of sparse signals, when only some of those measurements are given.
Gil Ben-Artzi, Michael Werman, Shmuel Peleg
We introduce a simple and effective method for retrieval of videos showing a specific event, even when the videos of that event were captured from significantly different viewpoints. Appearance-based methods fail in such cases, as appearances change with large changes of viewpoints. Our method is based on a pixel-based feature, "motion barcode", which records the existence/non-existence of motion as a function of time. While appearance, motion magnitude, and motion direction can vary greatly between disparate viewpoints, the existence of motion is viewpoint invariant. Based on the motion barcode, a similarity measure is developed for videos of the same event taken from very different viewpoints. This measure is robust to occlusions common under different viewpoints, and can be computed efficiently. Event retrieval is demonstrated using challenging videos from stationary and hand held cameras.
Daniel S. Weller, Ayelet Pnueli, Gilad Divon, Ori Radzyner, Yonina C. Eldar, Jeffrey A. Fessler
We propose a general framework for reconstructing transform-sparse images
from undersampled (squared)-magnitude data corrupted with outliers. This
framework is implemented using a multi-layered approach, combining multiple
initializations (to address the nonconvexity of the phase retrieval problem),
repeated minimization of a convex majorizer (surrogate for a nonconvex
objective function), and iterative optimization using the alternating
directions method of multipliers. Exploiting the generality of this framework,
we investigate using a Laplace measurement noise model better adapted to
outliers present in the data than the conventional Gaussian noise model. Using
simulations, we explore the sensitivity of the method to both the
regularization and penalty parameters. We include 1D Monte Carlo and 2D image
reconstruction comparisons with alternative phase retrieval algorithms. The
results suggest the proposed method, with the Laplace noise model, both
increases the likelihood of correct support recovery and reduces the mean
squared error from measurements containing outliers. We also describe exciting
extensions made possible by the generality of the proposed framework, including
regularization using analysis-form sparsity priors that are incompatible with
many existing approaches.
Authors' comments: 11 pages, 9 figures
Luis Marujo, José Portêlo, David Martins de Matos, João P. Neto, Anatole Gershman, Jaime Carbonell, Isabel Trancoso, Bhiksha Raj
State-of-the-art important passage retrieval methods obtain very good
results, but do not take into account privacy issues. In this paper, we present
a privacy preserving method that relies on creating secure representations of
documents. Our approach allows for third parties to retrieve important passages
from documents without learning anything regarding their content. We use a
hashing scheme known as Secure Binary Embeddings to convert a key phrase and
bag-of-words representation to bit strings in a way that allows the computation
of approximate distances, instead of exact ones. Experiments show that our
secure system yield similar results to its non-private counterpart on both
clean text and noisy speech recognized text.
Authors' comments: Secure Passage Retrieval, Important Passage Retrieval, KP-Centrality,
Secure Binary Embeddings, Data Privacy, Automatic Key Phrase Extraction,
Proceedings of SIGIR 2014 Workshop Privacy-Preserving IR: When Information
Retrieval Meets Privacy and Security
Gonzalo Navarro, Simon J. Puglisi, Jouni Sirén
Document retrieval aims at finding the most important documents where a
pattern appears in a collection of strings. Traditional pattern-matching
techniques yield brute-force document retrieval solutions, which has motivated
the research on tailored indexes that offer near-optimal performance. However,
an experimental study establishing which alternatives are actually better than
brute force, and which perform best depending on the collection
characteristics, has not been carried out. In this paper we address this
shortcoming by exploring the relationship between the nature of the underlying
collection and the performance of current methods. Via extensive experiments we
show that established solutions are often beaten in practice by brute-force
alternatives. We also design new methods that offer superior time/space
trade-offs, particularly on repetitive collections.
Authors' comments: Accepted to ESA 2014. Implementation and experiments at
http://www.cs.helsinki.fi/group/suds/rlcsa/
Artem Babenko, Anton Slesarev, Alexandr Chigorin, Victor Lempitsky
It has been shown that the activations invoked by an image within the top
layers of a large convolutional neural network provide a high-level descriptor
of the visual content of the image. In this paper, we investigate the use of
such descriptors (neural codes) within the image retrieval application. In the
experiments with several standard retrieval benchmarks, we establish that
neural codes perform competitively even when the convolutional neural network
has been trained for an unrelated classification task (e.g.\ Image-Net). We
also evaluate the improvement in the retrieval performance of neural codes,
when the network is retrained on a dataset of images that are similar to images
encountered at test time.
We further evaluate the performance of the compressed neural codes and show
that a simple PCA compression provides very good short codes that give
state-of-the-art accuracy on a number of datasets. In general, neural codes
turn out to be much more resilient to such compression in comparison other
state-of-the-art descriptors. Finally, we show that discriminative
dimensionality reduction trained on a dataset of pairs of matched photographs
improves the performance of PCA-compressed neural codes even further. Overall,
our quantitative experiments demonstrate the promise of neural codes as visual
descriptors for image retrieval.
Authors' comments: to appear at ECCV 2014
Dustin G. Mixon
Consider a scenario in which an unknown signal is transformed by a known
linear operator, and then the pointwise absolute value of the unknown output
function is reported. This scenario appears in several applications, and the
goal is to recover the unknown signal -- this is called phase retrieval. Phase
retrieval has been a popular subject of research in the last few years, both in
determining whether complete information is available with a given linear
operator, and in finding efficient and stable phase retrieval algorithms in the
cases where complete information is available. Interestingly, there are a few
ways to measure information completeness, and each way appears to be governed
by a phase transition of sorts. This chapter will survey the state of the art
with some of these phase transitions, and identify a few open problems for
further research.
Authors' comments: Book chapter, survey of recent literature, submitted to Excursions in
Harmonic Analysis: The February Fourier Talks at the Norbert Wiener Center
Yang Wang, Zhiqiang Xu
The aim of this paper is to build up the theoretical framework for the
recovery of sparse signals from the magnitude of the measurement. We first
investigate the minimal number of measurements for the success of the recovery
of sparse signals without the phase information. We completely settle the
minimality question for the real case and give a lower bound for the complex
case. We then study the recovery performance of the $\ell_1$ minimization. In
particular, we present the null space property which, to our knowledge, is the
first sufficient and necessary condition for the success of $\ell_1$
minimization for $k$-sparse phase retrievable.
Authors' comments: 14 pages
Radu Balan
In this paper we study the property of phase retrievability by redundant
sysems of vectors under perturbations of the frame set. Specifically we show
that if a set $\fc$ of $m$ vectors in the complex Hilbert space of dimension n
allows for vector reconstruction from magnitudes of its coefficients, then
there is a perturbation bound $\rho$ so that any frame set within $\rho$ from
$\fc$ has the same property. In particular this proves the recent construction
in \cite{BH13} is stable under perturbations. By the same token we reduce the
critical cardinality conjectured in \cite{BCMN13a} to proving a stability
result for non phase-retrievable frames.
Authors' comments: 13 pages, presented at SPIE 2013 conference
Gonzalo Navarro, Yakov Nekrich
Let $\mathcal{D}$ be a collection of $D$ documents, which are strings over an alphabet of size $\sigma$, of total length $n$. We describe a data structure that uses linear space and and reports $k$ most relevant documents that contain a query pattern $P$, which is a string of length $p$, in time $O(p/\log_\sigma n+k)$, which is optimal in the RAM model in the general case where $\lg D = \Theta(\log n)$, and involves a novel RAM-optimal suffix tree search. Our construction supports an ample set of important relevance measures... [clip] When $\lg D = o(\log n)$, we show how to reduce the space of the data structure from $O(n\log n)$ to $O(n(\log\sigma+\log D+\log\log n))$ bits... [clip] We also consider the dynamic scenario, where documents can be inserted and deleted from the collection. We obtain linear space and query time $O(p(\log\log n)^2/\log_\sigma n+\log n + k\log\log k)$, whereas insertions and deletions require $O(\log^{1+\epsilon} n)$ time per symbol, for any constant $\epsilon>0$. Finally, we consider an extended static scenario where an extra parameter $par(P,d)$ is defined, and the query must retrieve only documents $d$ such that $par(P,d)\in [\tau_1,\tau_2]$, where this range is specified at query time. We solve these queries using linear space and $O(p/\log_\sigma n + \log^{1+\epsilon} n + k\log^\epsilon n)$ time, for any constant $\epsilon>0$. Our technique is to translate these top-$k$ problems into multidimensional geometric search problems. As an additional bonus, we describe some improvements to those problems.
Praneeth Netrapalli, Prateek Jain, Sujay Sanghavi
Phase retrieval problems involve solving linear equations, but with missing
sign (or phase, for complex numbers) information. More than four decades after
it was first proposed, the seminal error reduction algorithm of (Gerchberg and
Saxton 1972) and (Fienup 1982) is still the popular choice for solving many
variants of this problem. The algorithm is based on alternating minimization;
i.e. it alternates between estimating the missing phase information, and the
candidate solution. Despite its wide usage in practice, no global convergence
guarantees for this algorithm are known. In this paper, we show that a
(resampling) variant of this approach converges geometrically to the solution
of one such problem -- finding a vector $\mathbf{x}$ from
$\mathbf{y},\mathbf{A}$, where $\mathbf{y} =
\left|\mathbf{A}^{\top}\mathbf{x}\right|$ and $|\mathbf{z}|$ denotes a vector
of element-wise magnitudes of $\mathbf{z}$ -- under the assumption that
$\mathbf{A}$ is Gaussian.
Empirically, we demonstrate that alternating minimization performs similar to
recently proposed convex techniques for this problem (which are based on
"lifting" to a convex matrix problem) in sample complexity and robustness to
noise. However, it is much more efficient and can scale to large problems.
Analytically, for a resampling version of alternating minimization, we show
geometric convergence to the solution, and sample complexity that is off by log
factors from obvious lower bounds. We also establish close to optimal scaling
for the case when the unknown vector is sparse. Our work represents the first
theoretical guarantee for alternating minimization (albeit with resampling) for
any variant of phase retrieval problems in the non-convex setting.
Authors' comments: Accepted for publication in IEEE Transactions on Signal Processing
Fajwel Fogel, Irène Waldspurger, Alexandre d'Aspremont
We study convex relaxation algorithms for phase retrieval on imaging
problems. We show that structural assumptions on the signal and the
observations, such as sparsity, smoothness or positivity, can be exploited to
both speed-up convergence and improve recovery performance. We detail
experimental results in molecular imaging problems simulated from PDB data.
Authors' comments: Significantly expanded experiments
Stefano Marchesini
I show that the power iteration method applied to the phase retrieval problem
converges under special conditions. One is given the relative phases between
small non-overlapping groups of pixels of a recorded intensity pattern, but no
information on the phase between the groups of pixels. Numerical tests show
that the inverse block iteration recovers the solution in 1 iteration.
Authors' comments: 4 pages, 6 figures
Elena Agliari, Adriano Barra, Andrea De Antoni, Andrea Galluzzi
In this work, we first revise some extensions of the standard Hopfield model in the low storage limit, namely the correlated attractor case and the multitasking case recently introduced by the authors. The former case is based on a modification of the Hebbian prescription, which induces a coupling between consecutive patterns and this effect is tuned by a parameter $a$. In the latter case, dilution is introduced in pattern entries, in such a way that a fraction $d$ of them is blank. Then, we merge these two extensions to obtain a system able to retrieve several patterns in parallel and the quality of retrieval, encoded by the set of Mattis magnetizations ${m^{\mu}}$, is reminiscent of the correlation among patterns. By tuning the parameters $d$ and $a$, qualitatively different outputs emerge, ranging from highly hierarchical, to symmetric. The investigations are accomplished by means of both numerical simulations and statistical mechanics analysis, properly adapting a novel technique originally developed for spin glasses, i.e. the Hamilton-Jacobi interpolation, with excellent agreement. Finally, we show the thermodynamical equivalence of this associative network with a (restricted) Boltzmann machine and study its stochastic dynamics to obtain even a dynamical picture, perfectly consistent with the static scenario earlier discussed.
B V Patel, B B Meshram
With the development of multimedia data types and available bandwidth there
is huge demand of video retrieval systems, as users shift from text based
retrieval systems to content based retrieval systems. Selection of extracted
features play an important role in content based video retrieval regardless of
video attributes being under consideration. These features are intended for
selecting, indexing and ranking according to their potential interest to the
user. Good features selection also allows the time and space costs of the
retrieval process to be reduced. This survey reviews the interesting features
that can be extracted from video data for indexing and retrieval along with
similarity measurement methods. We also identify present research issues in
area of content based video retrieval systems.
Authors' comments: 18 Pages
Eliyahu Osherovich
In this work we consider the problem of reconstruction of a signal from the
magnitude of its Fourier transform, also known as phase retrieval. The problem
arises in many areas of astronomy, crystallography, optics, and coherent
diffraction imaging (CDI). Our main goal is to develop an efficient
reconstruction method based on continuous optimization techniques. Unlike
current reconstruction methods, which are based on alternating projections, our
approach leads to a much faster and more robust method. However, all previous
attempts to employ continuous optimization methods, such as Newton-type
algorithms, to the phase retrieval problem failed. In this work we provide an
explanation for this failure, and based on this explanation we devise a
sufficient condition that allows development of new reconstruction
methods---approximately known Fourier phase. We demonstrate that a rough (up to
$\pi/2$ radians) Fourier phase estimate practically guarantees successful
reconstruction by any reasonable method. We also present a new reconstruction
method whose reconstruction time is orders of magnitude faster than that of the
current method-of-choice in phase retrieval---Hybrid Input-Output (HIO).
Moreover, our method is capable of successful reconstruction even in the
situations where HIO is known to fail. We also extended our method to other
applications: Fourier domain holography, and interferometry. Additionally we
developed a new sparsity-based method for sub-wavelength CDI. Using this method
we demonstrated experimental resolution exceeding several times the physical
limit imposed by the diffraction light properties (so called diffraction
limit).
Authors' comments: PhD. Thesis
Emmanuel J. Candes, Yonina Eldar, Thomas Strohmer, Vlad Voroninski
This paper develops a novel framework for phase retrieval, a problem which arises in X-ray crystallography, diffraction imaging, astronomical imaging and many other applications. Our approach combines multiple structured illuminations together with ideas from convex programming to recover the phase from intensity measurements, typically from the modulus of the diffracted wave. We demonstrate empirically that any complex-valued object can be recovered from the knowledge of the magnitude of just a few diffracted patterns by solving a simple convex optimization problem inspired by the recent literature on matrix completion. More importantly, we also demonstrate that our noise-aware algorithms are stable in the sense that the reconstruction degrades gracefully as the signal-to-noise ratio decreases. Finally, we introduce some theory showing that one can design very simple structured illumination patterns such that three diffracted figures uniquely determine the phase of the object we wish to recover.
Venkata Ravinder Paruchuri
It is known that humans can easily read words where the letters have been
jumbled in a certain way. This paper examines this problem by associating a
distance measure with the jumbling process. Modifications to text were
generated according to the Damerau-Levenshtein distance and it was checked if
the users are able to read it. Graphical representations of the results are
provided.
Authors' comments: 11 pages
A. Aharony, S. Gurvitz, O. Entin-Wohlman, S. Dattagupta
The time evolution of a qubit, consisting of two single-level quantum dots,
is studied in the presence of telegraph noise. The dots are connected by two
tunneling paths, with an Aharonov-Bohm flux enclosed between them. Under
special symmetry conditions, which can be achieved by tuning gate voltages,
there develops partial decoherence: at long times, the off-diagonal element of
the reduced density matrix (in the basis of the two dot states) approaches a
non-zero value, generating a circulating current around the loop. The flux
dependence of this current contains full information on the initial quantum
state of the qubit, even at infinite time. Small deviations from this symmetry
yield a very slow exponential decay towards the fully-decoherent limit.
However, the amplitudes of this decay also contain the full information on the
initial qubit state, measurable either via the current or via the occupations
of the qubit dots.
Authors' comments: 10 pages, 4 figures