Jens Dörpinghaus, Andreas Stefan
Knowledge graphs have been shown to play an important role in recent knowledge mining and discovery, for example in the field of life sciences or bioinformatics. Although a lot of research has been done on the field of query optimization, query transformation and of course in storing and retrieving large scale knowledge graphs the field of algorithmic optimization is still a major challenge and a vital factor in using graph databases. Few researchers have addressed the problem of optimizing algorithms on large scale labeled property graphs. Here, we present two optimization approaches and compare them with a naive approach of directly querying the graph database. The aim of our work is to determine limiting factors of graph databases like Neo4j and we describe a novel solution to tackle these challenges. For this, we suggest a classification schema to differ between the complexity of a problem on a graph database. We evaluate our optimization approaches on a test system containing a knowledge graph derived biomedical publication data enriched with text mining data. This dense graph has more than 71M nodes and 850M relationships. The results are very encouraging and - depending on the problem - we were able to show a speedup of a factor between 44 and 3839.
Li Weng, Lingzhi Ye, Jiangmin Tian, Jiuwen Cao, Jianzhong Wang
Image hash algorithms generate compact binary representations that can be
quickly matched by Hamming distance, thus become an efficient solution for
large-scale image retrieval. This paper proposes RV-SSDH, a deep image hash
algorithm that incorporates the classical VLAD (vector of locally aggregated
descriptors) architecture into neural networks. Specifically, a novel neural
network component is formed by coupling a random VLAD layer with a latent hash
layer through a transform layer. This component can be combined with
convolutional layers to realize a hash algorithm. We implement RV-SSDH as a
point-wise algorithm that can be efficiently trained by minimizing
classification error and quantization loss. Comprehensive experiments show this
new architecture significantly outperforms baselines such as NetVLAD and SSDH,
and offers a cost-effective trade-off in the state-of-the-art. In addition, the
proposed random VLAD layer leads to satisfactory accuracy with low complexity,
thus shows promising potentials as an alternative to NetVLAD.
Authors' comments: 10 pages, 17 figures, submitted to IEEE Transactions on Image
Processing
Yanming Che, Clemens Gneiting, Tao Liu, Franco Nori
The discovery of topological features of quantum states plays an important
role in modern condensed matter physics and various artificial systems. Due to
the absence of local order parameters, the detection of topological quantum
phase transitions remains a challenge. Machine learning may provide effective
methods for identifying topological features. In this work, we show that the
unsupervised manifold learning can successfully retrieve topological quantum
phase transitions in momentum and real space. Our results show that the
Chebyshev distance between two data points sharpens the characteristic features
of topological quantum phase transitions in momentum space, while the widely
used Euclidean distance is in general suboptimal. Then a diffusion map or
isometric map can be applied to implement the dimensionality reduction, and to
learn about topological quantum phase transitions in an unsupervised manner. We
demonstrate this method on the prototypical Su-Schrieffer-Heeger (SSH) model,
the Qi-Wu-Zhang (QWZ) model, and the quenched SSH model in momentum space, and
further provide implications and demonstrations for learning in real space,
where the topological invariants could be unknown or hard to compute. The
interpretable good performance of our approach shows the capability of manifold
learning, when equipped with a suitable distance metric, in exploring
topological quantum phase transitions.
Authors' comments: Accepted by Physical Review B (2020); Updated version including
implications and demonstrations for learning in real space, and with
appendices; Typos were corrected; 14 pages, 12 figures
Osman Tursun, Simon Denman, Sridha Sridharan, Clinton Fookes
Off-the-shelf convolutional neural network features achieve outstanding results in many image retrieval tasks. However, their invariance to target data is pre-defined by the network architecture and training data. Existing image retrieval approaches require fine-tuning or modification of pre-trained networks to adapt to variations unique to the target data. In contrast, our method enhances the invariance of off-the-shelf features by aggregating features extracted from images augmented at test-time, with augmentations guided by a policy learned through reinforcement learning. The learned policy assigns different magnitudes and weights to the selected transformations, which are selected from a list of image transformations. Policies are evaluated using a metric learning protocol to learn the optimal policy. The model converges quickly and the cost of each policy iteration is minimal as we propose an off-line caching technique to greatly reduce the computational cost of extracting features from augmented images. Experimental results on large trademark retrieval (METU trademark dataset) and landmark retrieval (ROxford5k and RParis6k scene datasets) tasks show that the learned ensemble of transformations is highly effective for improving performance, and is practical, and transferable.
Sebastian Arnold, Betty van Aken, Paul Grundmann, Felix A. Gers, Alexander Löser
We present Contextual Discourse Vectors (CDV), a distributed document
representation for efficient answer retrieval from long healthcare documents.
Our approach is based on structured query tuples of entities and aspects from
free text and medical taxonomies. Our model leverages a dual encoder
architecture with hierarchical LSTM layers and multi-task training to encode
the position of clinical entities and aspects alongside the document discourse.
We use our continuous representations to resolve queries with short latency
using approximate nearest neighbor search on sentence level. We apply the CDV
model for retrieving coherent answer passages from nine English public health
resources from the Web, addressing both patients and medical professionals.
Because there is no end-to-end training data available for all application
scenarios, we train our model with self-supervised data from Wikipedia. We show
that our generalized model significantly outperforms several state-of-the-art
baselines for healthcare passage ranking and is able to adapt to heterogeneous
domains without additional fine-tuning.
Authors' comments: The Web Conference 2020 (WWW '20)
Chenggang Yan, Biao Gong, Yuxuan Wei, Yue Gao
Hashing is an efficient method for nearest neighbor search in large-scale data space by embedding high-dimensional feature descriptors into a similarity preserving Hamming space with a low dimension. However, large-scale high-speed retrieval through binary code has a certain degree of reduction in retrieval accuracy compared to traditional retrieval methods. We have noticed that multi-view methods can well preserve the diverse characteristics of data. Therefore, we try to introduce the multi-view deep neural network into the hash learning field, and design an efficient and innovative retrieval model, which has achieved a significant improvement in retrieval performance. In this paper, we propose a supervised multi-view hash model which can enhance the multi-view information through neural networks. This is a completely new hash learning method that combines multi-view and deep learning methods. The proposed method utilizes an effective view stability evaluation method to actively explore the relationship among views, which will affect the optimization direction of the entire network. We have also designed a variety of multi-data fusion methods in the Hamming space to preserve the advantages of both convolution and multi-view. In order to avoid excessive computing resources on the enhancement procedure during retrieval, we set up a separate structure called memory network which participates in training together. The proposed method is systematically evaluated on the CIFAR-10, NUS-WIDE and MS-COCO datasets, and the results show that our method significantly outperforms the state-of-the-art single-view and multi-view hashing methods.
Shuo Zhang, Krisztian Balog
Tables are a powerful and popular tool for organizing and manipulating data.
A vast number of tables can be found on the Web, which represents a valuable
knowledge resource. The objective of this survey is to synthesize and present
two decades of research on web tables. In particular, we organize existing
literature into six main categories of information access tasks: table
extraction, table interpretation, table search, question answering, knowledge
base augmentation, and table augmentation. For each of these tasks, we identify
and describe seminal approaches, present relevant resources, and point out
interdependencies among the different tasks.
Authors' comments: ACM Transactions on Intelligent Systems and Technology. 11(2):
Article 13, January 2020
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
We introduce TV show Retrieval (TVR), a new multimodal retrieval dataset. TVR
requires systems to understand both videos and their associated subtitle
(dialogue) texts, making it more realistic. The dataset contains 109K queries
collected on 21.8K videos from 6 TV shows of diverse genres, where each query
is associated with a tight temporal window. The queries are also labeled with
query types that indicate whether each of them is more related to video or
subtitle or both, allowing for in-depth analysis of the dataset and the methods
that built on top of it. Strict qualification and post-annotation verification
tests are applied to ensure the quality of the collected data. Further, we
present several baselines and a novel Cross-modal Moment Localization (XML )
network for multimodal moment retrieval tasks. The proposed XML model uses a
late fusion design with a novel Convolutional Start-End detector (ConvSE),
surpassing baselines by a large margin and with better efficiency, providing a
strong starting point for future work. We have also collected additional
descriptions for each annotated moment in TVR to form a new multimodal
captioning dataset with 262K captions, named TV show Caption (TVC). Both
datasets are publicly available. TVR: https://tvr.cs.unc.edu, TVC:
https://tvr.cs.unc.edu/tvc.html.
Authors' comments: ECCV 2020 (extended version, with TVC dataset+models; 35 pages)
Tony Ng, Vassileios Balntas, Yurun Tian, Krystian Mikolajczyk
Recent works in deep-learning have shown that second-order information is
beneficial in many computer-vision tasks. Second-order information can be
enforced both in the spatial context and the abstract feature dimensions. In
this work, we explore two second-order components. One is focused on
second-order spatial information to increase the performance of image
descriptors, both local and global. It is used to re-weight feature maps, and
thus emphasise salient image locations that are subsequently used for
description. The second component is concerned with a second-order similarity
(SOS) loss, that we extend to global descriptors for image retrieval, and is
used to enhance the triplet loss with hard-negative mining. We validate our
approach on two different tasks and datasets for image retrieval and image
matching. The results show that our two second-order components complement each
other, bringing significant performance improvements in both tasks and lead to
state-of-the-art results across the public benchmarks. Code available at:
http://github.com/tonyngjichun/SOLAR
Authors' comments: ECCV 2020
Xiaodong Wang, Zhedong Zheng, Yang He, Fei Yan, Zhiqiang Zeng, Yi Yang
This paper focuses on network pruning for image retrieval acceleration.
Prevailing image retrieval works target at the discriminative feature learning,
while little attention is paid to how to accelerate the model inference, which
should be taken into consideration in real-world practice. The challenge of
pruning image retrieval models is that the middle-level feature should be
preserved as much as possible. Such different requirements of the retrieval and
classification model make the traditional pruning methods not that suitable for
our task. To solve the problem, we propose a new Progressive Local Filter
Pruning (PLFP) method for image retrieval acceleration. Specifically, layer by
layer, we analyze the local geometric properties of each filter and select the
one that can be replaced by the neighbors. Then we progressively prune the
filter by gradually changing the filter weights. In this way, the
representation ability of the model is preserved. To verify this, we evaluate
our method on two widely-used image retrieval datasets,i.e., Oxford5k and
Paris6K, and one person re-identification dataset,i.e., Market-1501. The
proposed method arrives with superior performance to the conventional pruning
methods, suggesting the effectiveness of the proposed method for image
retrieval.
Authors' comments: 7 pages, 5 figures
Hsuan-Yin Lin, Siddhartha Kumar, Eirik Rosnes, Alexandre Graell i Amat, Eitan Yaakobi
A private information retrieval (PIR) protocol guarantees that a user can
privately retrieve files stored in a database without revealing any information
about the identity of the requested file. Existing information-theoretic PIR
protocols ensure perfect privacy, i.e., zero information leakage to the servers
storing the database, but at the cost of high download. In this work, we
present weakly-private information retrieval (WPIR) schemes that trade off
perfect privacy to improve the download cost when the database is stored on a
single server. We study the tradeoff between the download cost and information
leakage in terms of mutual information (MI) and maximal leakage (MaxL) privacy
metrics. By relating the WPIR problem to rate-distortion theory, the
download-leakage function, which is defined as the minimum required download
cost of all single-server WPIR schemes for a given level of information leakage
and a fixed file size, is introduced. By characterizing the download-leakage
function for the MI and MaxL metrics, the capacity of single-server WPIR is
fully described.
Authors' comments: To appear in IEEE Journal of Selected Areas in Information Theory
(JSAIT), Special Issue on Privacy and Security of Information Systems, 2021
Guillaume Cabanac, Ingo Frommholz, Philipp Mayr
The Bibliometric-enhanced Information Retrieval workshop series (BIR) was
launched at ECIR in 2014 \cite{MayrEtAl2014} and it was held at ECIR each year
since then. This year we organize the 10th iteration of BIR. The workshop
series at ECIR and JCDL/SIGIR tackles issues related to academic search, at the
crossroads between Information Retrieval, Natural Language Processing and
Bibliometrics. In this overview paper, we summarize the past workshops, present
the workshop topics for 2020 and reflect on some future steps for this workshop
series.
Authors' comments: Overview paper submitted to ECIR 2020, Lisbon, PT. arXiv admin note:
substantial text overlap with arXiv:1909.04954
Lukas Holzbaur, Camilla Hollanti, Antonia Wachter-Zeh
A new computational private information retrieval (PIR) scheme based on random linear codes is presented. A matrix of messages from a McEliece scheme is used to query the server with carefully chosen errors. The server responds with the sum of the scalar multiple of the rows of the query matrix and the files. The user recovers the desired file by erasure decoding the response. Contrary to code-based cryptographic systems, the scheme presented here enables to use truly random codes, not only codes disguised as such. Further, we show the relation to the so-called error subspace search problem and quotient error search problem, which we assume to be difficult, and show that the scheme is secure against attacks based on solving these problems.
Anubha Pandey, Ashish Mishra, Vinay Kumar Verma, Anurag Mittal, Hema A. Murthy
Conventional approaches to Sketch-Based Image Retrieval (SBIR) assume that
the data of all the classes are available during training. The assumption may
not always be practical since the data of a few classes may be unavailable, or
the classes may not appear at the time of training. Zero-Shot Sketch-Based
Image Retrieval (ZS-SBIR) relaxes this constraint and allows the algorithm to
handle previously unseen classes during the test. This paper proposes a
generative approach based on the Stacked Adversarial Network (SAN) and the
advantage of Siamese Network (SN) for ZS-SBIR. While SAN generates a
high-quality sample, SN learns a better distance metric compared to that of the
nearest neighbor search. The capability of the generative model to synthesize
image features based on the sketch reduces the SBIR problem to that of an
image-to-image retrieval problem. We evaluate the efficacy of our proposed
approach on TU-Berlin, and Sketchy database in both standard ZSL and
generalized ZSL setting. The proposed method yields a significant improvement
in standard ZSL as well as in a more challenging generalized ZSL setting (GZSL)
for SBIR.
Authors' comments: Accepted in WACV'2020
Matteo Allaix, Lukas Holzbaur, Tefjol Pllaha, Camilla Hollanti
In the classical private information retrieval (PIR) setup, a user wants to retrieve a file from a database or a distributed storage system (DSS) without revealing the file identity to the servers holding the data. In the quantum PIR (QPIR) setting, a user privately retrieves a classical file by receiving quantum information from the servers. The QPIR problem has been treated by Song \emph{et al.} in the case of replicated servers, both without collusion and with all but one servers colluding. In this paper, the QPIR setting is extended to account for maximum distance separable (MDS) coded servers. The proposed protocol works for any $[n,k]$-MDS code and $t$-collusion with $t=n-k$. Similarly to the previous cases, the rates achieved are better than those known or conjectured in the classical counterparts. Further, it is demonstrated how the protocol can adapted to achieve significantly higher retrieval rates from DSSs encoded with a locally repairable code (LRC) with disjoint repair groups, each of which is an MDS code.
Seunghoan Song, Masahito Hayashi
Quantum private information retrieval (QPIR) is a protocol in which a user retrieves one of multiple files from $\mathsf{n}$ non-communicating servers by downloading quantum systems without revealing which file is retrieved. As variants of QPIR with stronger security requirements, symmetric QPIR is a protocol in which no other files than the target file are leaked to the user, and $\mathsf{t}$-private QPIR is a protocol in which the identity of the target file is kept secret even if at most $\mathsf{t}$ servers may collude to reveal the identity. The QPIR capacity is the maximum ratio of the file size to the size of downloaded quantum systems, and we prove that the symmetric $\mathsf{t}$-private QPIR capacity is $\min\{1,2(\mathsf{n}-\mathsf{t})/\mathsf{n}\}$ for any $1\leq \mathsf{t}< \mathsf{n}$. We construct a capacity-achieving QPIR protocol by the stabilizer formalism and prove the optimality of our protocol. The proposed capacity is greater than the classical counterpart.
Luca Papariello, Alexandros Bampoulidis, Mihai Lupu
We replicate recent experiments attempting to demonstrate an attractive hypothesis about the use of the Fisher kernel framework and mixture models for aggregating word embeddings towards document representations and the use of these representations in document classification, clustering, and retrieval. Specifically, the hypothesis was that the use of a mixture model of von Mises-Fisher (VMF) distributions instead of Gaussian distributions would be beneficial because of the focus on cosine distances of both VMF and the vector space model traditionally used in information retrieval. Previous experiments had validated this hypothesis. Our replication was not able to validate it, despite a large parameter scan space.
Lu Wang, Jie Yang
Due to the superiority in similarity computation and database storage for
large-scale multiple modalities data, cross-modal hashing methods have
attracted extensive attention in similarity retrieval across the heterogeneous
modalities. However, there are still some limitations to be further taken into
account: (1) most current CMH methods transform real-valued data points into
discrete compact binary codes under the binary constraints, limiting the
capability of representation for original data on account of abundant loss of
information and producing suboptimal hash codes; (2) the discrete binary
constraint learning model is hard to solve, where the retrieval performance may
greatly reduce by relaxing the binary constraints for large quantization error;
(3) handling the learning problem of CMH in a symmetric framework, leading to
difficult and complex optimization objective. To address above challenges, in
this paper, a novel Asymmetric Correlation Quantization Hashing (ACQH) method
is proposed. Specifically, ACQH learns the projection matrixs of heterogeneous
modalities data points for transforming query into a low-dimensional
real-valued vector in latent semantic space and constructs the stacked
compositional quantization embedding in a coarse-to-fine manner for indicating
database points by a series of learnt real-valued codeword in the codebook with
the help of pointwise label information regression simultaneously. Besides, the
unified hash codes across modalities can be directly obtained by the discrete
iterative optimization framework devised in the paper. Comprehensive
experiments on diverse three benchmark datasets have shown the effectiveness
and rationality of ACQH.
Authors' comments: 12 pages
Xinyu Yao, Nan Liu, Wei Kang
We study the private information retrieval (PIR) problem under arbitrary
collusion pattern for replicated databases. We find its capacity, which is the
same as the capacity of the original PIR problem with the number of databases
$N$ replaced by a number $S^*$. The number $S^*$ is the optimal solution to a
linear programming problem that is a function of the collusion pattern. Hence,
the collusion pattern affects the capacity of the PIR problem only through the
number $S^*$.
Authors' comments: This is a full version of the paper submitted to ISIT 2020
Bariscan Yonel, Birsen Yazici
In this paper, we analyze the non-convex framework of Wirtinger Flow (WF) for
phase retrieval and identify a novel sufficient condition for universal exact
recovery through the lens of low rank matrix recovery theory. Via a perspective
in the lifted domain, we show that the convergence of the WF iterates to a true
solution is attained geometrically under a single condition on the lifted
forward model. As a result, a deterministic relationship between the accuracy
of spectral initialization and the validity of {the regularity condition} is
derived. In particular, we determine that a certain concentration property on
the spectral matrix must hold uniformly with a sufficiently tight constant.
This culminates into a sufficient condition that is equivalent to a restricted
isometry-type property over rank-1, positive semi-definite matrices, and
amounts to a less stringent requirement on the lifted forward model than those
of prominent low-rank-matrix-recovery methods in the literature. We
characterize the performance limits of our framework in terms of the tightness
of the concentration property via novel bounds on the convergence rate and on
the signal-to-noise ratio such that the theoretical guarantees are valid using
the spectral initialization at the proper sample complexity.
Authors' comments: In Revision for IEEE Transactions on Signal Processing