Jinbiao Wu, Yi Peng, Zaiming Liu
In this paper, we analyze a retrial queueing system with Batch Markovian
Arrival Processes and two types of customers. The rate of individual repeated
attempts from the orbit is modulated according to a Markov Modulated Poisson
Process. Using the theory of multi-dimensional asymptotically quasi-Toeplitz
Markov chain, we obtain the stability condition and the algorithm for
calculating the stationary state distribution of the system. Main performance
measures are presented. Furthermore, we investigate some optimization problems.
The algorithm for determining the optimal number of guard servers and total
servers is elaborated. Finally, this queueing system is applied to the cellular
wireless network. Numerical results to illustrate the optimization problems and
the impact of retrial on performance measures are provided. We find that the
performance measures are mainly affected by the two types of customers'
arrivals and service patterns, but the retrial rate plays a less crucial role.
Authors' comments: 28 pages with 3 figures
T. E. Gureyev, Ya. I. Nesterets
We show that an arbitrary spatial distribution of complex refractive index inside an object can be exactly represented as a sum of two "monomorphous" complex distributions, i.e. the distributions with the ratios of the real part to the imaginary part being constant throughout the object. A priori knowledge of constituent materials can be used to estimate the global lower and upper boundaries for this ratio. This approach can be viewed as an extension of the successful phase-retrieval method, based on the Transport of Intensity equation, that was previously developed for monomorphous (homogeneous) objects, such as e.g. objects consisting of a single material. We demonstrate that the monomorphous decomposition can lead to more stable methods for phase retrieval using the Transport of Intensity Equation. Such methods may find application in quantitative in-line phase-contrast imaging and phase-contrast tomography.
Francisco Raposo, Ricardo Ribeiro, David Martins de Matos
In order to satisfy processing time constraints, many MIR tasks process only
a segment of the whole music signal. This practice may lead to decreasing
performance, since the most important information for the tasks may not be in
those processed segments. In this paper, we leverage generic summarization
algorithms, previously applied to text and speech summarization, to summarize
items in music datasets. These algorithms build summaries, that are both
concise and diverse, by selecting appropriate segments from the input signal
which makes them good candidates to summarize music as well. We evaluate the
summarization process on binary and multiclass music genre classification
tasks, by comparing the performance obtained using summarized datasets against
the performances obtained using continuous segments (which is the traditional
method used for addressing the previously mentioned time constraints) and full
songs of the same original dataset. We show that GRASSHOPPER, LexRank, LSA,
MMR, and a Support Sets-based Centrality model improve classification
performance when compared to selected 30-second baselines. We also show that
summarized datasets lead to a classification performance whose difference is
not statistically significant from using full songs. Furthermore, we make an
argument stating the advantages of sharing summarized datasets for future MIR
research.
Authors' comments: 24 pages, 10 tables; Submitted to IEEE/ACM Transactions on Audio,
Speech and Language Processing
Prabhjot Singh, Sumit Dhawan, Shubham Agarwal, Narina Thakur
This paper exemplifies the implementation of an efficient Information
Retrieval (IR) System to compute the similarity between a dataset and a query
using Fuzzy Logic. TREC dataset has been used for the same purpose. The dataset
is parsed to generate keywords index which is used for the similarity
comparison with the user query. Each query is assigned a score value based on
its fuzzy similarity with the index keywords. The relevant documents are
retrieved based on the score value. The performance and accuracy of the
proposed fuzzy similarity model is compared with Cosine similarity model using
Precision-Recall curves. The results prove the dominance of Fuzzy Similarity
based IR system.
Authors' comments: arXiv admin note: substantial text overlap with
http://ntz-develop.blogspot.in/ ,
http://www.micsymposium.org/mics2012/submissions/mics2012_submission_8.pdf ,
http://www.slideshare.net/JeffreyStricklandPhD/predictive-modeling-and-analytics-selectchapters-41304405
by other authors
Dmytro Filatov, Taras Filatov
In computer interfaces in general, especially in information retrieval tasks, it is important to be able to quickly find and retrieve information. State of the art approach, used, for example, in search engines, is not effective as it introduces losses of meanings due to context to keywords back and forth translation. Authors argue it increases the time and reduces the accuracy of information retrieval compared to what it could be in the system that employs modern information retrieval and text mining methods while presenting results in an adaptive human- computer interface where system effectively learns what operator needs through iterative interaction. In current work, a combination of adaptive navigational interface and real time collaborative feedback analysis for documents relevance weighting is proposed as an viable alternative to prevailing "telegraphic" approach in information retrieval systems. Adaptive navigation is provided through a dynamic links panel controlled by an evolutionary algorithm. Documents relevance is initially established with standard information retrieval techniques and is further refined in real time through interaction of users with the system. Introduced concepts of multidimensional Knowledge Map and Weighted Point of Interest allow finding relevant documents and users with common interests through a trivial calculation. Browsing search approach, the ability of the algorithm to adapt navigation to users interests, collaborative refinement and the self-organising features of the system are the main factors making such architecture effective in various fields where non-structured knowledge shall be represented to the users.
Seung-Hoon Na
The standard approach for term frequency normalization is based only on the
document length. However, it does not distinguish the verbosity from the scope,
these being the two main factors determining the document length. Because the
verbosity and scope have largely different effects on the increase in term
frequency, the standard approach can easily suffer from insufficient or
excessive penalization depending on the specific type of long document. To
overcome these problems, this paper proposes two-stage normalization by
performing verbosity and scope normalization separately, and by employing
different penalization functions. In verbosity normalization, each document is
pre-normalized by dividing the term frequency by the verbosity of the document.
In scope normalization, an existing retrieval model is applied in a
straightforward manner to the pre-normalized document, finally leading us to
formulate our proposed verbosity normalized (VN) retrieval model. Experimental
results carried out on standard TREC collections demonstrate that the VN model
leads to marginal but statistically significant improvements over standard
retrieval models.
Authors' comments: 40 pages (to appear in ACM TOIS)
Changqing Zou, Zhe Huang, Rynson W. H. Lau, Jianzhuang Liu, Hongbo Fu
We present a multi-scale approach to sketch-based shape retrieval. It is based on a novel multi-scale shape descriptor called Pyramidof- Parts, which encodes the features and spatial relationship of the semantic parts of query sketches. The same descriptor can also be used to represent 2D projected views of 3D shapes, allowing effective matching of query sketches with 3D shapes across multiple scales. Experimental results show that the proposed method outperforms the state-of-the-art method, whether the sketch segmentation information is obtained manually or automatically by considering each stroke as a semantic part.
Zhenzhong Lan, Xuanchong Li, Ming Lin, Alexander G. Hauptmann
We propose a method for representing motion information for video
classification and retrieval. We improve upon local descriptor based methods
that have been among the most popular and successful models for representing
videos. The desired local descriptors need to satisfy two requirements: 1) to
be representative, 2) to be discriminative. Therefore, they need to occur
frequently enough in the videos and to be be able to tell the difference among
different types of motions. To generate such local descriptors, the video
blocks they are based on must contain just the right amount of motion
information. However, current state-of-the-art local descriptor methods use
video blocks with a single fixed size, which is insufficient for covering
actions with varying speeds. In this paper, we introduce a long-short term
motion feature that generates descriptors from video blocks with multiple
lengths, thus covering motions with large speed variance. Experimental results
show that, albeit simple, our model achieves state-of-the-arts results on
several benchmark datasets.
Authors' comments: arXiv admin note: text overlap with arXiv:1411.6660
Ronan Cummins, Jiaul Hoque Paik, Yuanhua Lv
The multinomial language model has been one of the most effective models of
retrieval for over a decade. However, the multinomial distribution does not
model one important linguistic phenomenon relating to term-dependency, that is
the tendency of a term to repeat itself within a document (i.e. word
burstiness). In this article, we model document generation as a random process
with reinforcement (a multivariate Polya process) and develop a Dirichlet
compound multinomial language model that captures word burstiness directly.
We show that the new reinforced language model can be computed as efficiently
as current retrieval models, and with experiments on an extensive set of TREC
collections, we show that it significantly outperforms the state-of-the-art
language model for a number of standard effectiveness metrics. Experiments also
show that the tuning parameter in the proposed model is more robust than in the
multinomial language model. Furthermore, we develop a constraint for the
verbosity hypothesis and show that the proposed model adheres to the
constraint. Finally, we show that the new language model essentially introduces
a measure closely related to idf which gives theoretical justification for
combining the term and document event spaces in tf-idf type schemes.
Authors' comments: 37 page journal submission (accepted for publication in TOIS)
Xionghao Liu, Wei Yang, Liang Lin, Qing Wang, Zhaoquan Cai, Jianhuang Lai
This article investigates a data-driven approach for semantically scene
understanding, without pixelwise annotation and classifier training. Our
framework parses a target image with two steps: (i) retrieving its exemplars
(i.e. references) from an image database, where all images are unsegmented but
annotated with tags; (ii) recovering its pixel labels by propagating semantics
from the references. We present a novel framework making the two steps mutually
conditional and bootstrapped under the probabilistic Expectation-Maximization
(EM) formulation. In the first step, the references are selected by jointly
matching their appearances with the target as well as the semantics (i.e. the
assigned labels of the target and the references). We process the second step
via a combinatorial graphical representation, in which the vertices are
superpixels extracted from the target and its selected references. Then we
derive the potentials of assigning labels to one vertex of the target, which
depend upon the graph edges that connect the vertex to its spatial neighbors of
the target and to its similar vertices of the references. Besides, the proposed
framework can be naturally applied to perform image annotation on new test
images. In the experiments, we validate our approach on two public databases,
and demonstrate superior performances over the state-of-the-art methods in both
semantic segmentation and image annotation tasks.
Authors' comments: 8 pages, 5 figures
Fang Zhao, Yongzhen Huang, Liang Wang, Tieniu Tan
With the rapid growth of web images, hashing has received increasing
interests in large scale image retrieval. Research efforts have been devoted to
learning compact binary codes that preserve semantic similarity based on
labels. However, most of these hashing methods are designed to handle simple
binary similarity. The complex multilevel semantic structure of images
associated with multiple labels have not yet been well explored. Here we
propose a deep semantic ranking based method for learning hash functions that
preserve multilevel semantic similarity between multi-label images. In our
approach, deep convolutional neural network is incorporated into hash functions
to jointly learn feature representations and mappings from them to hash codes,
which avoids the limitation of semantic representation power of hand-crafted
features. Meanwhile, a ranking list that encodes the multilevel similarity
information is employed to guide the learning of such deep hash functions. An
effective scheme based on surrogate loss is used to solve the intractable
optimization problem of nonsmooth and multivariate ranking measures involved in
the learning procedure. Experimental results show the superiority of our
proposed approach over several state-of-the-art hashing methods in term of
ranking evaluation metrics when tested on multi-label image datasets.
Authors' comments: CVPR 2015
Murtuza Ali Abidini, Onno Boxma, Jacques Resing
We study a vacation-type queueing model, and a single-server multi-queue
polling model, with the special feature of retrials. Just before the server
arrives at a station there is some deterministic glue period. Customers (both
new arrivals and retrials) arriving at the station during this glue period will
be served during the visit of the server. Customers arriving in any other
period leave immediately and will retry after an exponentially distributed
time. Our main focus is on queue length analysis, both at embedded time points
(beginnings of glue periods, visit periods and switch- or vacation periods) and
at arbitrary time points.
Authors' comments: Keywords: vacation queue, polling model, retrials Submitted for
review to Performance evaluation journal, as an extended version of 'Vacation
and polling models with retrials', by Onno Boxma and Jacques Resing
Philipp Mayr, Ingo Frommholz, Andrea Scharnhorst, Peter Mutschke
This workshop brings together experts of communities which often have been
perceived as different once: bibliometrics / scientometrics / informetrics on
the one side and information retrieval on the other. Our motivation as
organizers of the workshop started from the observation that main discourses in
both fields are different, that communities are only partly overlapping and
from the belief that a knowledge transfer would be profitable for both sides.
Bibliometric techniques are not yet widely used to enhance retrieval processes
in digital libraries, although they offer value-added effects for users. On the
other side, more and more information professionals, working in libraries and
archives are confronted with applying bibliometric techniques in their
services. This way knowledge exchange becomes more urgent. The first workshop
set the research agenda, by introducing in each other methods, reporting about
current research problems and brainstorming about common interests. This
follow-up workshop continues the overall communication, but also puts one
problem into the focus. In particular, we will explore how statistical
modelling of scholarship can improve retrieval services for specific
communities, as well as for large, cross-domain collections like Mendeley or
ResearchGate. This second BIR workshop continues to raise awareness of the
missing link between Information Retrieval (IR) and bibliometrics and
contributes to create a common ground for the incorporation of
bibliometric-enhanced services into retrieval at the scholarly search engine
interface.
Authors' comments: 4 pages, 37th European Conference on Information Retrieval, BIR
workshop
Yin Sun, Zizhan Zheng, C. Emre Koksal, Kyu-Han Kim, Ness B. Shroff
One key requirement for storage clouds is to be able to retrieve data
quickly. Recent system measurements have shown that the data retrieving delay
in storage clouds is highly variable, which may result in a long latency tail.
One crucial idea to improve the delay performance is to retrieve multiple data
copies by using parallel downloading threads. However, how to optimally
schedule these downloading threads to minimize the data retrieving delay
remains to be an important open problem. In this paper, we develop
low-complexity thread scheduling policies for several important classes of data
downloading time distributions, and prove that these policies are either
delay-optimal or within a constant gap from the optimum delay performance.
These theoretical results hold for an arbitrary arrival process of read
requests that may contain finite or infinite read requests, and for
heterogeneous MDS storage codes that can support diverse storage redundancy and
reliability requirements for different data files. Our numerical results show
that the delay performance of the proposed policies is significantly better
than that of First-Come- First-Served (FCFS) policies considered in prior work.
Authors' comments: 17 pages, 4 figures. This is the technical report for a conference
paper accepted by IEEE INFOCOM 2015
H. Palangi, L. Deng, Y. Shen, J. Gao, X. He, J. Chen, X. Song, R. Ward
In this paper we address the following problem in web document and information retrieval (IR): How can we use long-term context information to gain better IR performance? Unlike common IR methods that use bag of words representation for queries and documents, we treat them as a sequence of words and use long short term memory (LSTM) to capture contextual dependencies. To the best of our knowledge, this is the first time that LSTM is applied to information retrieval tasks. Unlike training traditional LSTMs, the training strategy is different due to the special nature of information retrieval problem. Experimental evaluation on an IR task derived from the Bing web search demonstrates the ability of the proposed method in addressing both lexical mismatch and long-term context modelling issues, thereby, significantly outperforming existing state of the art methods for web document retrieval task.
Johannes Niedermayer, Peer Kröger
To increase the computational efficiency of interest-point based object retrieval, researchers have put remarkable research efforts into improving the efficiency of kNN-based feature matching, pursuing to match thousands of features against a database within fractions of a second. However, due to the high-dimensional nature of image features that reduces the effectivity of index structures (curse of dimensionality), due to the vast amount of features stored in image databases (images are often represented by up to several thousand features), this ultimate goal demanded to trade query runtimes for query precision. In this paper we address an approach complementary to indexing in order to improve the runtimes of retrieval by querying only the most promising keypoint descriptors, as this affects matching runtimes linearly and can therefore lead to increased efficiency. As this reduction of kNN queries reduces the number of tentative correspondences, a loss of query precision is minimized by an additional image-level correspondence generation stage with a computational performance independent of the underlying indexing structure. We evaluate such an adaption of the standard recognition pipeline on a variety of datasets using both SIFT and state-of-the-art binary descriptors. Our results suggest that decreasing the number of queried descriptors does not necessarily imply a reduction in the result quality as long as alternative ways of increasing query recall (by thoroughly selecting k) and MAP (using image-level correspondence generation) are considered.
Ramtin Pedarsani, Kangwook Lee, Kannan Ramchandran
In this paper, we tackle the general compressive phase retrieval problem. The
problem is to recover a K-sparse complex vector of length n, $x\in
\mathbb{C}^n$, from the magnitudes of m linear measurements, $y=|Ax|$, where $A
\in \mathbb{C}^{m \times n}$ can be designed, and the magnitudes are taken
component-wise for vector $Ax\in \mathbb{C}^m$. We propose a variant of the
PhaseCode algorithm, and show that, using an irregular left-degree sparse-graph
code construction, the algorithm can recover almost all the K non-zero signal
components using only slightly more than 4K measurements under some mild
assumptions, with optimal time and memory complexity of ${\cal O}(K)$. It is
known that the fundamental limit for the number of measurements in compressive
phase retrieval problem is $4K - o(K)$. To the best of our knowledge, this is
the first constructive capacity-approaching compressive phase retrieval
algorithm. As a second contribution, we propose another variant of the
PhaseCode algorithm that is based on a Compressive Sensing framework involving
sparse-graph codes. Our proposed algorithm is an instance of a more powerful
"separation" architecture that can be used to address the compressive
phase-retrieval problem in general. This modular design features a compressive
sensing outer layer, and a trigonometric-based phase-retrieval inner layer. The
compressive-sensing layer operates as a linear phase-aware compressive
measurement subsystem, while the trig-based phase-retrieval layer provides the
desired abstraction between the actually targeted nonlinear phase-retrieval
problem and the induced linear compressive-sensing problem. Invoking this
architecture based on the use of sparse-graph codes for the compressive sensing
layer, we show that we can exactly recover a signal from only the magnitudes of
its linear measurements using only slightly more than 6K measurements.
Authors' comments: arXiv admin note: text overlap with arXiv:1408.0034
Lu Yu, Junming Huang, Chuang Liu, Zike Zhang
Interactions between search and recommendation have recently attracted
significant attention, and several studies have shown that many potential
applications involve with a joint problem of producing recommendations to users
with respect to a given query, termed $Collaborative$ $Retrieval$ (CR).
Successful algorithms designed for CR should be potentially flexible at dealing
with the sparsity challenges since the setup of collaborative retrieval
associates with a given $query$ $\times$ $user$ $\times$ $item$ tensor instead
of traditional $user$ $\times$ $item$ matrix. Recently, several works are
proposed to study CR task from users' perspective. In this paper, we aim to
sufficiently explore the sophisticated relationship of each $query$ $\times$
$user$ $\times$ $item$ triple from items' perspective. By integrating
item-based collaborative information for this joint task, we present an
alternative factorized model that could better evaluate the ranks of those
items with sparse information for the given query-user pair. In addition, we
suggest to employ a recently proposed scalable ranking learning algorithm,
namely BPR, to optimize the state-of-the-art approach, $Latent$ $Collaborative$
$Retrieval$ model, instead of the original learning algorithm. The experimental
results on two real-world datasets, (i.e. \emph{Last.fm}, \emph{Yelp}),
demonstrate the efficiency and effectiveness of our proposed approach.
Authors' comments: 10 pages, conference
Simon Birkholz, Günter Steinmeyer, Sebastian Koke, Daniel Gerth, Steven Bürger, Bernd Hofmann
A novel variant of spectral phase interferometry for direct electric-field reconstruction (SPIDER) is introduced and experimentally demonstrated. Other than most previously demonstrated variants of SPIDER, our method is based on a third-order nonlinear optical effect, namely self-diffraction, rather than the second-order effect of sum-frequency generation. On one hand, self-diffraction (SD) substantially simplifies phase-matching capabilities for multi-octave spectra that cannot be hosted by second-order processes, given manufacturing limitations of crystal lengths in the few-micrometer range. On the other hand, however, SD SPIDER imposes an additional constraint as it effectively measures the spectral phase of a self-convolved spectrum rather than immediately measuring the fundamental phase. Reconstruction of the latter from the measured phase and the spectral amplitude of the fundamental turns out to be an ill-posed problem, which we address by a regularization approach. We discuss the numerical implementation in detail and apply it to measured data from a Ti:sapphire amplifier system. Our experimental demonstration used 40-fs pulses and a 500 $\mu$m thick BaF${}_2$ crystal to show that the SD SPIDER signal is sufficiently strong to be separable from stray light. Extrapolating these measurements to the thinnest conceivable nonlinear media, we predict that bandwidths well above two optical octaves can be measured by a suitably adapted SD SPIDER apparatus, enabling the direct characterization of pulses down to single-femtosecond pulse durations. Such characteristics appear out of range for any currently established pulse measurement technique.
Hamid Izadinia, Ali Farhadi, Aaron Hertzmann, Matthew D. Hoffman
This paper proposes direct learning of image classification from user-supplied tags, without filtering. Each tag is supplied by the user who shared the image online. Enormous numbers of these tags are freely available online, and they give insight about the image categories important to users and to image classification. Our approach is complementary to the conventional approach of manual annotation, which is extremely costly. We analyze of the Flickr 100 Million Image dataset, making several useful observations about the statistics of these tags. We introduce a large-scale robust classification algorithm, in order to handle the inherent noise in these tags, and a calibration procedure to better predict objective annotations. We show that freely available, user-supplied tags can obtain similar or superior results to large databases of costly manual annotations.