Deng Cai, Yan Wang, Victoria Bi, Zhaopeng Tu, Xiaojiang Liu, Wai Lam, Shuming Shi
For dialogue response generation, traditional generative models generate
responses solely from input queries. Such models rely on insufficient
information for generating a specific response since a certain query could be
answered in multiple ways. Consequently, those models tend to output generic
and dull responses, impeding the generation of informative utterances.
Recently, researchers have attempted to fill the information gap by exploiting
information retrieval techniques. When generating a response for a current
query, similar dialogues retrieved from the entire training data are considered
as an additional knowledge source. While this may harvest massive information,
the generative models could be overwhelmed, leading to undesirable performance.
In this paper, we propose a new framework which exploits retrieval results via
a skeleton-then-response paradigm. At first, a skeleton is generated by
revising the retrieved responses. Then, a novel generative model uses both the
generated skeleton and the original query for response generation. Experimental
results show that our approaches significantly improve the diversity and
informativeness of the generated responses.
Authors' comments: Accepted to NAACL 2019
Tatiana Latychevskaia
In this work, issues in phase retrieval in the coherent diffractive imaging (CDI) technique, from discussion on parameters for setting up a CDI experiment to evaluation of the goodness of the final reconstruction, are discussed. The distribution of objects under study by CDI often cannot be cross-validated by another imaging technique. It is therefore important to make sure that the developed CDI procedure delivers an artifact-free object reconstruction. Critical issues that can lead to artifacts are presented and recipes on how to avoid them are provided.
Michal Sedlák, Alessandro Bisio, Mário Ziman
We address the question of a quantum memory storage of quantum dynamics. In
particular, we design an optimal protocol for $N\to 1$ probabilistic
storage-and-retrieval of unitary channels on $d$-dimensional quantum systems.
If we may access the unknown unitary gate only $N$ times, the optimal success
probability of perfect retrieval of its single use is $N/(N-1+d^2)$. The
derived size of the memory system exponentially improves the known upper bound
on the size of the program register needed for probabilistic programmable
quantum processors. Our results are closely related to probabilistic perfect
alignment of reference frames and probabilistic port-based teleportation.
Authors' comments: 5+8 pages, 4 figures
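As a quick numerical illustration of the success probability quoted in the abstract above, the following sketch evaluates $N/(N-1+d^2)$ exactly for a few values of $N$. The function name `success_probability` and the input checks are assumptions made for this example; they do not come from the paper.

```python
from fractions import Fraction


def success_probability(n_uses: int, dim: int) -> Fraction:
    """Evaluate N / (N - 1 + d^2), the formula quoted in the abstract.

    n_uses is N (number of accesses to the unknown gate) and dim is d
    (dimension of the quantum system). Both names are hypothetical.
    """
    if n_uses < 1 or dim < 1:
        raise ValueError("expects N >= 1 and d >= 1")
    return Fraction(n_uses, n_uses - 1 + dim * dim)


if __name__ == "__main__":
    # For a qubit (d = 2), the probability approaches 1 as N grows.
    for n in (1, 2, 10, 100):
        print(n, success_probability(n, 2))
```

For a fixed dimension $d$ the value tends to 1 as $N$ increases, and for fixed $N$ it decreases as $d$ grows.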
Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
Dialogue systems are usually built on either generation-based or retrieval-based approaches, yet they do not benefit from the advantages of different models. In this paper, we propose a Retrieval-Enhanced Adversarial Training (REAT) method for neural response generation. Distinct from existing approaches, the REAT method leverages an encoder-decoder framework within an adversarial training paradigm, while taking advantage of N-best response candidates from a retrieval-based system to construct the discriminator. An empirical study on a large-scale publicly available benchmark dataset shows that the REAT method significantly outperforms the vanilla Seq2Seq model as well as the conventional adversarial training approach.
Wengu Chen, Peng Li, Qiyu Sun
In this paper, we consider compressive/sparse affine phase retrieval proposed
in [B. Gao, Q. Sun, Y. Wang and Z. Xu, Adv. in Appl. Math., 93 (2018),
121-141]. Using the lifting technique, with the nuclear norm as a heuristic convex
relaxation of rank and the $\ell_1$ norm as a convex relaxation of sparsity, we
establish convex models, which we call compressive affine phase retrieval
via lifting (CAPRL). To solve these models, we develop an inertial
proximal ADMM for multiple separated operators and provide its
convergence analysis. Our numerical experiments with the proposed algorithm show
that sparse signals can be exactly and stably recovered via CAPRL. We also list
some other applications of our proposed algorithm.
Authors' comments: This manuscript also needs to be extensively modified
Björn Barz, Christoph Käding, Joachim Denzler
We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode
active learning method for binary classification, and apply it for acquiring
meaningful user feedback in the context of content-based image retrieval.
Instead of combining different heuristics such as uncertainty, diversity, or
density, our method is based on maximizing the mutual information between the
predicted relevance of the images and the expected user feedback regarding the
selected batch. We propose suitable approximations to this computationally
demanding problem and also integrate an explicit model of user behavior that
accounts for possible incorrect labels and unnameable instances. Furthermore,
our approach takes into account not only the structure of the data but also the
expected model output change caused by the user feedback. In contrast to
other methods, ITAL turns out to be highly flexible and provides
state-of-the-art performance across various datasets, such as MIRFLICKR and
ImageNet.
Authors' comments: GCPR 2018 paper (14 pages text + 2 pages references + 6 pages
appendix)
Zhedong Zheng, Liang Zheng, Yi Yang, Fei Wu
Most existing work on adversarial samples focuses on attacking image
recognition models, while little attention is paid to the image retrieval task.
In this paper, we identify two inherent challenges in applying prevailing image
recognition attack methods to image retrieval. First, image retrieval demands
discriminative visual features, which is significantly different from the
one-hot class prediction in image recognition. Second, due to the disjoint and
potentially unrelated classes between the training and test set in image
retrieval, predicting the query category from predefined training classes is
not accurate and leads to a sub-optimal adversarial gradient. To address these
limitations, we propose a new white-box attack approach, Opposite-Direction
Feature Attack (ODFA), to generate adversarial queries. Opposite-Direction
Feature Attack (ODFA) effectively exploits feature-level adversarial gradients
and takes advantage of feature distance in the representation space. To our
knowledge, this is among the early attempts to design an attack method
specifically for image retrieval. When we deploy an attacked image as the
query, the true matches are prone to receive low ranks. We demonstrate through
extensive experiments that (1) only crafting adversarial queries is sufficient
to fool the state-of-the-art retrieval systems; (2) the proposed attack method,
ODFA, leads to a higher attack success rate than classification attack methods,
validating the necessity of leveraging characteristics of image retrieval; (3)
the adversarial queries generated by our method have good transferability to
other retrieval models without accessing their parameters, i.e., the black-box
setting.
Authors' comments: 12 pages, 9 figures, 3 tables
Lin Zhang, Xv Li, Tingting Xue
A new collective behavior of resonant synchronization is discovered and the
ability to retrieve information from brain memory is proposed based on this
mechanism. We use a modified Kuramoto phase oscillator to simulate the dynamics
of a single neuron in self-oscillation state, and investigate the collective
responses of a neural network, which is composed of $N$ globally coupled
Kuramoto oscillators, to the external stimulus signals in a critical state just
below the synchronization threshold of the Kuramoto model. The input signals at
different driving frequencies, which are used to denote different neural
stimuli, can drive the coupled oscillators into different synchronized groups
locked to the same effective frequencies and recover different synchronized
patterns that emerge from their collective dynamics closely related to the
predetermined frequency distributions of the oscillators (memory). This model
is used to explain how the brain stores and retrieves information by the
synchronized patterns emerging in the neural network stimulated by the external
inputs.
Authors' comments: 7 pages, 5 figures
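To make the setting in the abstract above more concrete, here is a minimal sketch of $N$ globally coupled Kuramoto oscillators driven by a periodic external signal. It uses the standard Kuramoto model in mean-field form plus a simple sinusoidal forcing term; the abstract does not specify the paper's actual modification, so the forcing term, the Gaussian frequency distribution, the Euler integrator, and all parameter names and values are assumptions for illustration only.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Return (r, psi): magnitude and phase of the mean of exp(i * theta)."""
    z = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(z), cmath.phase(z)


def simulate(n=100, coupling=1.5, drive_amp=0.3, drive_freq=1.0,
             dt=0.01, steps=2000, seed=0):
    """Euler-integrate a driven Kuramoto network and return the final r.

    Assumed model (not taken from the paper):
        dtheta_i/dt = omega_i + K * r * sin(psi - theta_i)
                      + F * sin(Omega * t - theta_i)
    For unit-variance Gaussian natural frequencies the classical threshold
    is K_c = 2 / (pi * g(0)), roughly 1.6, so the default K = 1.5 sits
    slightly below it.
    """
    rng = random.Random(seed)
    omegas = [rng.gauss(0.0, 1.0) for _ in range(n)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for step in range(steps):
        t = step * dt
        r, psi = order_parameter(phases)
        phases = [
            theta + dt * (omega
                          + coupling * r * math.sin(psi - theta)
                          + drive_amp * math.sin(drive_freq * t - theta))
            for theta, omega in zip(phases, omegas)
        ]
    return order_parameter(phases)[0]


if __name__ == "__main__":
    # Compare the coherence reached under two hypothetical driving frequencies.
    for freq in (0.0, 1.0):
        print(freq, simulate(drive_freq=freq))
```

The order parameter $r$ lies between 0 (incoherent) and 1 (fully synchronized), so comparing it across driving frequencies gives a rough view of how the stimulus changes the collective response.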
Jukka Ruohonen, Ville Leppänen
This paper presents a preliminary validation of common textual information
retrieval techniques for mapping unstructured software vulnerability
information to distinct software weaknesses. The validation is carried out with
a dataset compiled from four software repositories tracked in the Snyk
vulnerability database. According to the results, the information retrieval
techniques used perform unsatisfactorily compared to regular expression
searches. Although the results vary from one repository to another, the
preliminary validation presented indicates that explicit referencing of
vulnerability and weakness identifiers is preferable for concrete vulnerability
tracking. Such referencing allows the use of keyword-based searches, which
currently seem to yield more consistent results compared to information
retrieval techniques. Further validation work is required for improving the
precision of the techniques, however.
Authors' comments: Proceedings of the 29th International Conference on Database and
Expert Systems Applications (DEXA 2018), Regensburg, Springer, pp.~265--277
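The regular expression searches mentioned in the abstract above can be illustrated with a short sketch that extracts explicitly referenced identifiers from free text. The patterns follow the common `CVE-YYYY-NNNN` and `CWE-NNN` identifier formats; the function name, the sample text, and the identifiers in it are made up for this example and are not the expressions or data used in the paper.

```python
import re

# CVE identifiers: "CVE-", a four-digit year, then four or more digits.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b")
# CWE identifiers: "CWE-" followed by one or more digits.
CWE_PATTERN = re.compile(r"\bCWE-\d+\b")


def extract_identifiers(text):
    """Return the unique CVE and CWE identifiers found in text, sorted."""
    return {
        "cve": sorted(set(CVE_PATTERN.findall(text))),
        "cwe": sorted(set(CWE_PATTERN.findall(text))),
    }


if __name__ == "__main__":
    # Hypothetical advisory text, invented for illustration.
    sample = ("Fixes CVE-2018-1000001 (CWE-79: cross-site scripting); "
              "see also CWE-89 and CWE-79.")
    print(extract_identifiers(sample))
```

Such a search only works when the text names the identifiers explicitly, which is the kind of referencing the abstract recommends.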
R. Y. Teh, S. Kiesewetter, P. D. Drummond, M. D. Reid
We analyze a method for the creation, storage and retrieval of optomechanical Schrödinger cat states, in which there is a quantum superposition of two distinct macroscopic states of a mechanical oscillator. In the proposal, an optical cat state is first prepared in an optical cavity, then transferred to the mechanical mode, where it is stored and later retrieved using control fields. We carry out numerical simulations for the quantum memory protocol for optomechanical cat states using the positive-P phase space representation. This has a compact, positive representation for a cat state, thus allowing a probabilistic simulation of this highly non-classical quantum system. To verify the effectiveness of the cat-state quantum memory, we consider several cat-state signatures and show how they can be computed. We also investigate the effects of decoherence on a cat state by solving the standard master equation for a simplified model analytically, allowing us to compare with the numerical results. Focusing on the negativity of the Wigner function as a signature of the cat state, we evaluate analytically an upper bound on the time taken for the negativity to vanish, for a given temperature of the environment of the mechanical oscillator. We show consistency with the numerical methods. These provide exact solutions, allowing a full treatment of decoherence in an experiment that involves creating, storing and retrieving mechanical cat states using temporally mode-matched input and output pulses. Our analysis treats the internal optical and mechanical modes of an optomechanical oscillator, and the complete set of input and output field modes which become entangled with the internal modes. The model includes decoherence due to thermal effects in the mechanical reservoirs, as well as optical and mechanical losses.
Petr Sojka, Michal Růžička, Vít Novotný
Digital mathematical libraries (DMLs) such as arXiv, Numdam, and EuDML
contain mainly documents from STEM fields, where mathematical formulae are
often more important than text for understanding. Conventional information
retrieval (IR) systems are unable to represent formulae and they are therefore
ill-suited for math information retrieval (MIR). To fill the gap, we have
developed and open-sourced the MIaS MIR system. MIaS is based on the full-text
search engine Apache Lucene. On top of text retrieval, MIaS also incorporates a
set of tools for preprocessing mathematical formulae. We describe the design of
the system and present speed and quality evaluation results. We show that MIaS
is both efficient and effective, as evidenced by our victory in the NTCIR-11
Math-2 task.
Authors' comments: This is the author's version of the work. It is posted here for your
personal use. Not for redistribution. The definitive Version of Record was
published in The 27th ACM International Conference on Information and
Knowledge Management (CIKM '18), October 22-26, 2018, Torino, Italy,
https://doi.org/10.1145/3269206.3269233
Niluthpol Chowdhury Mithun, Rameswar Panda, Evangelos E. Papalexakis, Amit K. Roy-Chowdhury
Cross-modal retrieval between visual data and natural language description
remains a long-standing challenge in multimedia. While recent image-text
retrieval methods offer great promise by learning deep representations aligned
across modalities, most of these methods are plagued by the issue of training
with small-scale datasets covering a limited number of images with ground-truth
sentences. Moreover, creating a larger dataset by annotating millions of
images with sentences is extremely expensive and may lead to a biased model.
Inspired by the recent success of webly supervised learning in deep neural
networks, we capitalize on readily-available web images with noisy annotations
to learn robust image-text joint representation. Specifically, our main idea is
to leverage web images and corresponding tags, along with fully annotated
datasets, in training for learning the visual-semantic joint embedding. We
propose a two-stage approach for the task that can augment a typical supervised
pair-wise ranking loss based formulation with weakly-annotated web images to
learn a more robust visual-semantic embedding. Experiments on two standard
benchmark datasets demonstrate that our method achieves a significant
performance gain in image-text retrieval compared to state-of-the-art
approaches.
Authors' comments: ACM Multimedia 2018
Lei Zhu, Jun Long, Chengyuan Zhang, Ruipeng Chen, Xinpan Yuan, Zhan Yang
Due to the rapid development of mobile Internet techniques and cloud computing,
and the popularity of online social networking and location-based services, a
massive amount of multimedia data with geographical information is generated and
uploaded to the Internet. In this paper, we propose a novel type of cross-modal
multimedia retrieval called geo-multimedia cross-modal retrieval which aims to
search out a set of geo-multimedia objects based on geographical distance
proximity and semantic similarity between different modalities. Previous
studies for cross-modal retrieval and spatial keyword search cannot address
this problem effectively because they do not consider multimedia data with
geo-tags and do not focus on this type of query. To address this
problem efficiently, we present the definition of the $k$NN geo-multimedia
cross-modal query for the first time and introduce relevant concepts such as
cross-modal semantic representation space. To bridge the semantic gap between
different modalities, we propose a method named cross-modal semantic matching,
which contains two important components, i.e., CorrProj and LogsTran, and aims
to construct a common semantic representation space for cross-modal semantic
similarity measurement. In addition, we design a framework based on deep learning
techniques to implement common semantic representation space construction.
Furthermore, a novel hybrid indexing structure named GMR-Tree, combining
geo-multimedia data and R-Tree, is presented and an efficient $k$NN search
algorithm called $k$GMCMS is designed. Comprehensive experimental evaluation on
real and synthetic datasets clearly demonstrates that our solution outperforms
state-of-the-art methods.
Authors' comments: 27 pages
Su Li, Michael Gastpar
We study the problem of single-server multi-message private information
retrieval with side information. One user wants to recover $N$ out of $K$
independent messages which are stored at a single server. The user initially
possesses a subset of $M$ messages as side information. The goal of the user is
to download the $N$ demand messages while not leaking any information about the
indices of these messages to the server. In this paper, we characterize the
minimum number of required transmissions. We also present the optimal linear
coding scheme which enables the user to download the demand messages and
preserves the privacy of their indices. Moreover, we show that the trivial MDS
coding scheme with $K-M$ transmissions is optimal if $N>M$ or $N^2+N \ge K-M$.
This means if one wishes to privately download more than the square-root of the
number of files in the database, then one must effectively download the full
database (minus the side information), irrespective of the amount of side
information one has available.
Authors' comments: 12 pages, submitted to the 56th Allerton conference
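The sufficient condition stated in the abstract above ($N>M$ or $N^2+N \ge K-M$) is easy to check numerically. The sketch below does only that; the function and parameter names are hypothetical, and a `False` result means merely that the quoted condition is not met, not that the MDS scheme is suboptimal, since the abstract does not say what happens in that case.

```python
def meets_mds_condition(n_demand: int, m_side: int, k_total: int) -> bool:
    """Check the condition N > M or N^2 + N >= K - M quoted in the abstract.

    n_demand is N (demanded messages), m_side is M (side-information
    messages) and k_total is K (messages stored at the server).
    """
    return n_demand > m_side or n_demand ** 2 + n_demand >= k_total - m_side


def mds_transmissions(m_side: int, k_total: int) -> int:
    """Number of transmissions of the trivial MDS scheme, K - M."""
    return k_total - m_side


if __name__ == "__main__":
    # Illustrative (N, M, K) triples, chosen arbitrarily.
    for n, m, k in [(3, 1, 10), (2, 5, 100), (9, 20, 100)]:
        print((n, m, k), meets_mds_condition(n, m, k), mds_transmissions(m, k))
```

With $K=100$ and $M=20$, for instance, the condition already holds at $N=9$ because $9^2+9=90 \ge 80$, which matches the square-root remark in the abstract.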
Somdip Dey, Asoke Nath, Shalabh Agarwal
The security and authenticity of data is now a major challenge. To address this problem, we propose an innovative method to authenticate digital documents. In this method, the marks obtained by a candidate are also encoded in a QR Code™ in encrypted form, so that an intruder who tries to change the marks in the mark sheet cannot make the same change in the QR Code™, because the encryption key is unknown to them. We encrypt the mark sheet data using the TTJSA encryption algorithm. The encrypted marks are embedded in the QR code, which is printed together with the original data of the mark sheet. The marks can then be retrieved from the QR code, decrypted using the TTJSA decryption algorithm, and verified against the marks already on the mark sheet.
Fahad Shamshad, Ali Ahmed
This paper proposes a new framework to regularize the highly ill-posed and
non-linear phase retrieval problem through deep generative priors using a simple
gradient descent algorithm. We experimentally show the effectiveness of the
proposed algorithm for random Gaussian measurements (practically relevant in
imaging through scattering media) and Fourier-friendly measurements (relevant in
optical setups). We demonstrate that the proposed approach achieves impressive
results when compared with traditional hand-engineered priors, including
sparsity and denoising frameworks, in terms of number of measurements and
robustness against noise. Finally, we show the effectiveness of the proposed approach on a
real transmission matrix dataset in an actual application of multiple
scattering media imaging.
Authors' comments: Preprint. Work in progress
Leulseged Tesfaye Alemu, Marcello Pelillo
Aggregating different image features for image retrieval has recently shown its effectiveness. Although such aggregation is highly effective, the question of how to uplift the impact of the best features for a specific query image persists as an open computer vision problem. In this paper, we propose a computationally efficient approach to fuse several hand-crafted and deep features, based on the probabilistic distribution of a given membership score of a constrained cluster in an unsupervised manner. First, we introduce an incremental nearest neighbor (NN) selection method, whereby we dynamically select k-NN to the query. We then build several graphs from the obtained NN sets and employ constrained dominant sets (CDS) on each graph G to assign edge weights which consider the intrinsic manifold structure of the graph, and detect false matches to the query. Finally, we elaborate the computation of feature positive-impact weight (PIW) based on the dispersive degree of the characteristics vector. To this end, we exploit the entropy of a cluster membership-score distribution. In addition, the final NN set bypasses a heuristic voting scheme. Experiments on several retrieval benchmark datasets show that our method can improve the state-of-the-art result.
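The entropy of a membership-score distribution mentioned in the abstract above can be sketched as follows. The code normalizes non-negative scores into a probability distribution and returns its Shannon entropy; how the paper maps this quantity to the positive-impact weight (PIW) is not given in the abstract, so the sketch stops at the entropy, and the function name is hypothetical.

```python
import math


def membership_entropy(scores):
    """Shannon entropy (in nats) of non-negative scores after normalization.

    A peaked score vector gives low entropy; an evenly spread (dispersed)
    one gives high entropy, up to log(len(scores)).
    """
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    probs = [s / total for s in scores]
    # Terms with p == 0 contribute nothing, by the convention 0 * log 0 = 0.
    return -sum(p * math.log(p) for p in probs if p > 0)


if __name__ == "__main__":
    # Two made-up score vectors: one concentrated, one dispersed.
    print(membership_entropy([0.9, 0.05, 0.05]))
    print(membership_entropy([0.25, 0.25, 0.25, 0.25]))
```

A feature whose membership scores are concentrated on a few neighbors yields a lower entropy than one whose scores are spread evenly, which is the sense of "dispersive degree" assumed here.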
Biel Roig-Solvas, Lee Makowski, Dana H. Brooks
Proximal algorithms have gained popularity in recent years in large-scale and distributed optimization problems. One such problem is the phase retrieval problem, for which proximal operators have been proposed recently. The phase retrieval problem commonly refers to the task of recovering a target signal based on the magnitude of linear projections of that signal onto known vectors, usually under the presence of noise. A more general problem is the multispectral phase retrieval problem, where sums of these magnitudes are observed instead. In this paper we study the proximal operator for this problem, which appears in applications like X-ray solution scattering. We show that despite its non-convexity, all local minimizers are global minimizers, guaranteeing the optimality of simple descent techniques. An efficient linear time exact Newton method is proposed based on the structure of the problem's Hessian. Initialization criteria are discussed and the computational performance of the proposed algorithm is compared to that of traditional descent methods. The studied proximal operator can be used in distributed and parallel scenarios using an ADMM scheme and allows for exploiting the spectral characteristics of the problem's measurement matrices, known in many physical sensing applications, in a way that is not possible with non-split optimization algorithms. The dependency of the proximal operator on the rank of these matrices, instead of their dimension, can greatly reduce the memory and computation requirements for problems of moderate to large size (N>10000) when these measurement matrices admit a low-rank representation.
Jason Weston, Emily Dinan, Alexander H. Miller
Sequence generation models for dialogue are known to have several problems: they tend to produce short, generic sentences that are uninformative and unengaging. Retrieval models on the other hand can surface interesting responses, but are restricted to the given retrieval set leading to erroneous replies that cannot be tuned to the specific context. In this work we develop a model that combines the two approaches to avoid both their deficiencies: first retrieve a response and then refine it -- the final sequence generator treating the retrieval as additional context. We show on the recent CONVAI2 challenge task our approach produces responses superior to both standard retrieval and generation models in human evaluations.
Sasi Kiran Yelamarthi, Shiva Krishna Reddy, Ashish Mishra, Anurag Mittal
Sketch-based image retrieval (SBIR) is the task of retrieving images from a
natural image database that correspond to a given hand-drawn sketch. Ideally,
an SBIR model should learn to associate components in the sketch (say, feet,
tail, etc.) with the corresponding components in the image having similar shape
characteristics. However, current evaluation methods focus only on
coarse-grained evaluation, where the goal is to retrieve images that belong
to the same class as the sketch but do not necessarily have the same shape
characteristics as the sketch. As a result, existing methods simply learn to
associate sketches with classes seen during training and hence fail to
generalize to unseen classes. In this paper, we propose a new benchmark for
zero-shot SBIR where the model is evaluated on novel classes that are not seen
during training. We show through extensive experiments that existing models for
SBIR that are trained in a discriminative setting learn only class specific
mappings and fail to generalize to the proposed zero-shot setting. To
circumvent this, we propose a generative approach for the SBIR task by
proposing deep conditional generative models that take the sketch as an input
and fill the missing information stochastically. Experiments on this new
benchmark created from the "Sketchy" dataset, which is a large-scale database
of sketch-photo pairs, demonstrate that the performance of these generative
models is significantly better than several state-of-the-art approaches in the
proposed zero-shot framework of the coarse-grained SBIR task.
Authors' comments: Accepted at ECCV 2018, Munich, Germany