Wen-Jun Zeng, H. C. So
Phase retrieval aims at recovering a complex-valued signal from magnitude-only measurements, which attracts much attention since it has numerous applications in many disciplines. However, phase recovery involves solving a system of quadratic equations, indicating that it is a challenging nonconvex optimization problem. To tackle phase retrieval in an effective and efficient manner, we apply coordinate descent (CD) such that a single unknown is solved at each iteration while all other variables are kept fixed. As a result, only minimization of a univariate quartic polynomial is needed which is easily achieved by finding the closed-form roots of a cubic equation. Three computationally simple algorithms referred to as cyclic, randomized and greedy CDs, based on different updating rules, are devised. It is proved that the three CDs globally converge to a stationary point of the nonconvex problem, and specifically, the randomized CD locally converges to the global minimum and attains exact recovery at a geometric rate with high probability if the sample size is large enough. The cyclic and randomized CDs are also modified via minimization of the $\ell_1$-regularized quartic polynomial for phase retrieval of sparse signals. Furthermore, a novel application of the three CDs, namely, blind equalization in digital communications, is proposed. It is demonstrated that the CD methodology is superior to the state-of-the-art techniques in terms of computational efficiency and/or recovery performance.
Zhao Yan, Duyu Tang, Nan Duan, Junwei Bao, Yuanhua Lv, Ming Zhou, Zhoujun Li
Understanding the connections between unstructured text and semi-structured table is an important yet neglected problem in natural language processing. In this work, we focus on content-based table retrieval. Given a query, the task is to find the most relevant table from a collection of tables. Further progress towards improving this area requires powerful models of semantic matching and richer training and evaluation resources. To remedy this, we present a ranking based approach, and implement both carefully designed features and neural network architectures to measure the relevance between a query and the content of a table. Furthermore, we release an open-domain dataset that includes 21,113 web queries for 273,816 tables. We conduct comprehensive experiments on both real world and synthetic datasets. Results verify the effectiveness of our approach and present the challenges for this task.
Tamir Bendory, Robert Beinert, Yonina C. Eldar
The problem of recovering a signal from its phaseless Fourier transform measurements, called Fourier phase retrieval, arises in many applications in engineering and science. Fourier phase retrieval poses fundamental theoretical and algorithmic challenges. In general, there is no unique mapping between a one-dimensional signal and its Fourier magnitude and therefore the problem is ill-posed. Additionally, while almost all multidimensional signals are uniquely mapped to their Fourier magnitude, the performance of existing algorithms is generally not well-understood. In this chapter we survey methods to guarantee uniqueness in Fourier phase retrieval. We then present different algorithmic approaches to retrieve the signal in practice. We conclude by outlining some of the main open questions in this field.
Tao Zhou, Muhao Chen, Jie Yu, Demetri Terzopoulos
Following the recent progress in image classification and captioning using
deep learning, we develop a novel natural language person retrieval system
based on an attention mechanism. More specifically, given the description of a
person, the goal is to localize the person in an image. To this end, we first
construct a benchmark dataset for natural language person retrieval. To do so,
we generate bounding boxes for persons in a public image dataset from the
segmentation masks, which are then annotated with descriptions and attributes
using the Amazon Mechanical Turk. We then adopt a region proposal network in
Faster R-CNN as a candidate region generator. The cropped images based on the
region proposals as well as the whole images with attention weights are fed
into Convolutional Neural Networks for visual feature extraction, while the
natural language expression and attributes are input to Bidirectional Long
Short- Term Memory (BLSTM) models for text feature extraction. The visual and
text features are integrated to score region proposals, and the one with the
highest score is retrieved as the output of our system. The experimental
results show significant improvement over the state-of-the-art method for
generic object retrieval and this line of research promises to benefit search
in surveillance video footage.
Authors' comments: CVPR 2017 Workshop (vision meets cognition)
Bálint Zoltán Daróczy
In this thesis we examined several multimodal feature extraction and learning
methods for retrieval and classification purposes. We reread briefly some
theoretical results of learning in Section 2 and reviewed several generative
and discriminative models in Section 3 while we described the similarity kernel
in Section 4. We examined different aspects of the multimodal image retrieval
and classification in Section 5 and suggested methods for identifying quality
assessments of Web documents in Section 6. In our last problem we proposed
similarity kernel for time-series based classification. The experiments were
carried over publicly available datasets and source codes for the most
essential parts are either open source or released. Since the used similarity
graphs (Section 4.2) are greatly constrained for computational purposes, we
would like to continue work with more complex, evolving and capable graphs and
apply for different problems such as capturing the rapid change in the
distribution (e.g. session based recommendation) or complex graphs of the
literature work. The similarity kernel with the proper metrics reaches and in
many cases improves over the state-of-the-art. Hence we may conclude generative
models based on instance similarities with multiple modes is a generally
applicable model for classification and regression tasks ranging over various
domains, including but not limited to the ones presented in this thesis. More
generally, the Fisher kernel is not only unique in many ways but one of the
most powerful kernel functions. Therefore we may exploit the Fisher kernel in
the future over widely used generative models, such as Boltzmann Machines
[Hinton et al., 1984], a particular subset, the Restricted Boltzmann Machines
and Deep Belief Networks [Hinton et al., 2006]), Latent Dirichlet Allocation
[Blei et al., 2003] or Hidden Markov Models [Baum and Petrie, 1966] to name a
few.
Authors' comments: doctoral thesis, 2016
Ziyang Yuan, Qi Wang, Hongxia Wang
Phase retrieval(PR) problem is a kind of ill-condition inverse problem which can be found in various of applications. Utilizing the sparse priority, an algorithm called SWF(Sparse Wirtinger Flow) is proposed in this paper to deal with sparse PR problem based on the Wirtinger flow method. SWF firstly recovers the support of the signal and then updates the evaluation by hard thresholding method with an elaborate initialization. Theoretical analyses show that SWF has a geometric convergence for any $k$ sparse $n$ length signal with the sampling complexity $\mathcal{O}(k^2\mathrm{log}n)$. To get $\varepsilon$ accuracy, the computational complexity of SWF is $\mathcal{O}(k^3n\mathrm{log}n\mathrm{log}\frac{1}{\varepsilon})$. Numerical tests also demonstrate that SWF performs better than state-of-the-art methods especially when we have no priori knowledge about sparsity $k$. Moreover, SWF is also robust to the noise
Zhifang Zhang, Jingke Xu
Suppose there are $N$ distributed databases each storing a full set of $M$
independent files. A user wants to retrieve $r$ out of the $M$ files without
revealing the identity of the $r$ files. When $r=1$ it is the classic problem
of private information retrieval (PIR). In this paper we study the problem of
private multi-file retrieval (PMFR) which covers the case of general $r$. We
first prove an upper bound on the capacity of PMFR schemes which indicates the
minimum possible download size per unit of retrieved files. Then we design a
general PMFR scheme which happens to attain the upper bound when
$r\geq\frac{M}{2}$, thus achieving the optimal communication cost. As $r$ goes
down we show the trivial approach of executing $r$ independent PIR instances
achieves the near optimal communication cost. Comparing with the
capacity-achieving PIR schemes, our PMFR scheme reduces the number of
subpackages needed for each file from $N^M$ to $N^2$, which implies a great
reduction of implementation complexity.
Authors' comments: This paper has been covered by another paper
Besnik Fetahu, Ujwal Gadiraju, Stefan Dietze
The increasing amount of data on the Web, in particular of Linked Data, has led to a diverse landscape of datasets, which make entity retrieval a challenging task. Explicit cross-dataset links, for instance to indicate co-references or related entities can significantly improve entity retrieval. However, only a small fraction of entities are interlinked through explicit statements. In this paper, we propose a two-fold entity retrieval approach. In a first, offline preprocessing step, we cluster entities based on the \emph{x--means} and \emph{spectral} clustering algorithms. In the second step, we propose an optimized retrieval model which takes advantage of our precomputed clusters. For a given set of entities retrieved by the BM25F retrieval approach and a given user query, we further expand the result set with relevant entities by considering features of the queries, entities and the precomputed clusters. Finally, we re-rank the expanded result set with respect to the relevance to the query. We perform a thorough experimental evaluation on the Billions Triple Challenge (BTC12) dataset. The proposed approach shows significant improvements compared to the baseline and state of the art approaches.
Shanshan Huang, Yichao Xiong, Ya Zhang, Jia Wang
Hashing has played a pivotal role in large-scale image retrieval. With the development of Convolutional Neural Network (CNN), hashing learning has shown great promise. But existing methods are mostly tuned for classification, which are not optimized for retrieval tasks, especially for instance-level retrieval. In this study, we propose a novel hashing method for large-scale image retrieval. Considering the difficulty in obtaining labeled datasets for image retrieval task in large scale, we propose a novel CNN-based unsupervised hashing method, namely Unsupervised Triplet Hashing (UTH). The unsupervised hashing network is designed under the following three principles: 1) more discriminative representations for image retrieval; 2) minimum quantization loss between the original real-valued feature descriptors and the learned hash codes; 3) maximum information entropy for the learned hash codes. Extensive experiments on CIFAR-10, MNIST and In-shop datasets have shown that UTH outperforms several state-of-the-art unsupervised hashing methods in terms of retrieval accuracy.
Edouard Pauwels, Amir Beck, Yonina C. Eldar, Shoham Sabach
Alternating minimization, or Fienup methods, have a long history in phase retrieval. We provide new insights related to the empirical and theoretical analysis of these algorithms when used with Fourier measurements and combined with convex priors. In particular, we show that Fienup methods can be viewed as performing alternating minimization on a regularized nonconvex least-squares problem with respect to amplitude measurements. We then prove that under mild additional structural assumptions on the prior (semi-algebraicity), the sequence of signal estimates has a smooth convergent behaviour towards a critical point of the nonconvex regularized least-squares objective. Finally, we propose an extension to Fienup techniques, based on a projected gradient descent interpretation and acceleration using inertial terms. We demonstrate experimentally that this modification combined with an $\ell_1$ prior constitutes a competitive approach for sparse phase retrieval.
Y Qian, E Vazquez, B Sengupta
Comparing images to recommend items from an image-inventory is a subject of
continued interest. Added with the scalability of deep-learning architectures
the once `manual' job of hand-crafting features have been largely alleviated,
and images can be compared according to features generated from a deep
convolutional neural network. In this paper, we compare distance metrics (and
divergences) to rank features generated from a neural network, for
content-based image retrieval. Specifically, after modelling individual images
using approximations of mixture models or sparse covariance estimators, we
resort to their information-theoretic and Riemann geometric comparisons. We
show that using approximations of mixture models enable us to compute a
distance measure based on the Wasserstein metric that requires less effort than
other computationally intensive optimal transport plans; finally, an affine
invariant metric is used to compare the optimal transport metric to its Riemann
geometric counterpart -- we conclude that although expensive, retrieval metric
based on Wasserstein geometry is more suitable than information theoretic
comparison of images. In short, we combine GPU scalability in learning deep
feature vectors with statistically efficient metrics that we foresee being
utilised in a commercial setting.
Authors' comments: 5th ICDM Workshop on High Dimensional Data Mining (HDM 2017)
Eduardo X. Miqueles, Nathaly L. Archilha, Marcelo R. Dos Anjos, Harry Westfahl, Elias S. Helou
It was recently shown that the phase retrieval imaging of a sample can be modeled as a simple convolution process. Sometimes, such a convolution depends on physical parameters of the sample which are difficult to estimate a priori. In this case, a blind choice for those parameters usually lead to wrong results, e.g., in posterior image segmentation processing. In this manuscript, we propose a simple connection between phase-retrieval algorithms and optimization strategies, which lead us to ways of numerically determining the physical parameters
Xue Jiang, H. C. So, X. Liu
An outlier-resistance phase retrieval algorithm based on alternating direction method of multipliers (ADMM) is devised in this letter. Instead of the widely used least squares criterion that is only optimal for Gaussian noise environment, we adopt the least absolute deviation criterion to enhance the robustness against outliers. Considering both intensity- and amplitude-based observation models, the framework of ADMM is developed to solve the resulting non-differentiable optimization problems. It is demonstrated that the core subproblem of ADMM is the proximity operator of the L1-norm, which can be computed efficiently by soft-thresholding in each iteration. Simulation results are provided to validate the accuracy and efficiency of the proposed approach compared to the existing schemes.
Ziyang Yuan, Hongxia Wang
Phase retrieval(PR) problem is a kind of ill-condition inverse problem which
is arising in various of applications. Based on the Wirtinger flow(WF) method,
a reweighted Wirtinger flow(RWF) method is proposed to deal with PR problem.
RWF finds the global optimum by solving a series of sub-PR problems with
changing weights. Theoretical analyses illustrate that the RWF has a geometric
convergence from a deliberate initialization when the weights are bounded by 1
and $\frac{10}{9}$. Numerical testing shows RWF has a lower sampling complexity
compared with WF. As an essentially adaptive truncated Wirtinger flow(TWF)
method, RWF performs better than TWF especially when the ratio between sampling
number $m$ and length of signal $n$ is small.
Authors' comments: 21 pages, 11 figures
Sara Botello-Andrade, Peter G. Casazza, Dorsa Ghoreishi, Shani Jose, Janet C. Tremain
Phase retrieval and phaseless reconstruction for Hilbert space frames is a very active area of research. Recently, it was shown that these concepts are equivalent. In this paper, we make a detailed study of a weakening of these concepts to weak phase retrieval and weak phaseless reconstruction. We will give several necessary and/or sufficient conditions for frames to have these weak properties. We will prove three surprising results: (1) Weak phaseless reconstruction is equivalent to phaseless reconstruction. I.e. It never was "weak"; (2) Weak phase retrieval is not equivalent to weak phaseless reconstruction; (3) Weak phase retrieval requires at least $2m-2$ vectors in an m-dimensional Hilbert space. We also gives several examples illustrating the relationship between these concepts.
Paul Hand, Vladislav Voroninski
We consider the problem of phase retrieval from corrupted magnitude observations. In particular we show that a fixed $x_0 \in \mathbb{R}^n$ can be recovered exactly from corrupted magnitude measurements $|\langle a_i, x_0 \rangle | + \eta_i, \quad i =1,2\ldots m$ with high probability for $m = O(n)$, where $a_i \in \mathbb{R}^n$ are i.i.d standard Gaussian and $\eta \in \mathbb{R}^m$ has fixed sparse support and is otherwise arbitrary, by using a version of the PhaseMax algorithm augmented with slack variables subject to a penalty. This linear programming formulation, which we call RobustPhaseMax, operates in the natural parameter space, and our proofs rely on a direct analysis of the optimality conditions using concentration inequalities.
Gang Wang, Liang Zhang, Georgios B. Giannakis, Mehmet Akcakaya, Jie Chen
This paper develops a novel algorithm, termed \emph{SPARse Truncated
Amplitude flow} (SPARTA), to reconstruct a sparse signal from a small number of
magnitude-only measurements. It deals with what is also known as sparse phase
retrieval (PR), which is \emph{NP-hard} in general and emerges in many science
and engineering applications. Upon formulating sparse PR as an amplitude-based
nonconvex optimization task, SPARTA works iteratively in two stages: In stage
one, the support of the underlying sparse signal is recovered using an
analytically well-justified rule, and subsequently, a sparse
orthogonality-promoting initialization is obtained via power iterations
restricted on the support; and, in the second stage, the initialization is
successively refined by means of hard thresholding based gradient-type
iterations. SPARTA is a simple yet effective, scalable, and fast sparse PR
solver. On the theoretical side, for any $n$-dimensional $k$-sparse ($k\ll n$)
signal $\bm{x}$ with minimum (in modulus) nonzero entries on the order of
$(1/\sqrt{k})\|\bm{x}\|_2$, SPARTA recovers the signal exactly (up to a global
unimodular constant) from about $k^2\log n$ random Gaussian measurements with
high probability. Furthermore, SPARTA incurs computational complexity on the
order of $k^2n\log n$ with total runtime proportional to the time required to
read the data, which improves upon the state-of-the-art by at least a factor of
$k$. Finally, SPARTA is robust against additive noise of bounded support.
Extensive numerical tests corroborate markedly improved recovery performance
and speedups of SPARTA relative to existing alternatives.
Authors' comments: 23 pages; 5 figures
Ye Zhang, Md Mustafizur Rahman, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert et al.
A recent "third wave" of Neural Network (NN) approaches now delivers
state-of-the-art performance in many machine learning tasks, spanning speech
recognition, computer vision, and natural language processing. Because these
modern NNs often comprise multiple interconnected layers, this new NN research
is often referred to as deep learning. Stemming from this tide of NN work, a
number of researchers have recently begun to investigate NN approaches to
Information Retrieval (IR). While deep NNs have yet to achieve the same level
of success in IR as seen in other areas, the recent surge of interest and work
in NNs for IR suggest that this state of affairs may be quickly changing. In
this work, we survey the current landscape of Neural IR research, paying
special attention to the use of learned representations of queries and
documents (i.e., neural embeddings). We highlight the successes of neural IR
thus far, catalog obstacles to its wider adoption, and suggest potentially
promising directions for future research.
Authors' comments: 44 pages
Tom Goldstein, Christoph Studer
We consider the recovery of a (real- or complex-valued) signal from magnitude-only measurements, known as phase retrieval. We formulate phase retrieval as a convex optimization problem, which we call PhaseMax. Unlike other convex methods that use semidefinite relaxation and lift the phase retrieval problem to a higher dimension, PhaseMax is a "non-lifting" relaxation that operates in the original signal dimension. We show that the dual problem to PhaseMax is Basis Pursuit, which implies that phase retrieval can be performed using algorithms initially designed for sparse signal recovery. We develop sharp lower bounds on the success probability of PhaseMax for a broad range of random measurement ensembles, and we analyze the impact of measurement noise on the solution accuracy. We use numerical results to demonstrate the accuracy of our recovery guarantees, and we showcase the efficacy and limits of PhaseMax in practice.
Tongfei Chen, Benjamin Van Durme
We propose a framework for discriminative Information Retrieval (IR) atop linguistic features, trained to improve the recall of tasks such as answer candidate passage retrieval, the initial step in text-based Question Answering (QA). We formalize this as an instance of linear feature-based IR (Metzler and Croft, 2007), illustrating how a variety of knowledge discovery tasks are captured under this approach, leading to a 44% improvement in recall for candidate triage for QA.