Tarak Nath Dey, G. S. Agarwal
We investigate whether it is possible to store and retrieve an intense probe pulse from a $\Lambda$-type homogeneous medium of cold atoms. Through numerical simulations we show that it is possible to store and retrieve probe pulses that are not necessarily weak. As the intensity of the probe pulse increases, the retrieved pulse remains a replica of the original pulse; however, there is overall broadening and loss of intensity. These effects can be understood in terms of the dependence of absorption on the intensity of the probe. We include the dynamics of the control field, which becomes especially important as the intensity of the probe pulse increases. We use the theory of adiabatons [Grobe {\it et al.} Phys. Rev. Lett. {\bf 73}, 3183 (1994)] to understand the storage and retrieval of light pulses at moderate powers.
Katunobu Itou, Atsushi Fujii, Tetsuya Ishikawa
We report experimental results associated with speech-driven text retrieval, which facilitates retrieving information in multiple domains with spoken queries. Since users speak contents related to a target collection, we produce language models used for speech recognition based on the target collection, so as to improve both the recognition and retrieval accuracy. Experiments using existing test collections combined with dictated queries showed the effectiveness of our method.
Shigeto Higuchi, Masatoshi Fukui, Atsushi Fujii, Tetsuya Ishikawa
Given the growing number of patents filed in multiple countries, users are interested in retrieving patents across languages. We propose a multi-lingual patent retrieval system, which translates a user query into the target language, searches a multilingual database for patents relevant to the query, and improves the browsing efficiency by way of machine translation and clustering. Our system also extracts new translations from patent families consisting of comparable patents, to enhance the translation dictionary.
Atsushi Fujii, Katunobu Itou, Tetsuya Ishikawa
While recent retrieval techniques do not limit the number of index terms,
out-of-vocabulary (OOV) words are crucial in speech recognition. Aiming at
retrieving information with spoken queries, we fill the gap between speech
recognition and text retrieval in terms of the vocabulary size. Given a spoken
query, we generate a transcription and detect OOV words through speech
recognition. We then map detected OOV words to terms indexed in a target
collection to complete the transcription, and search the collection for
documents relevant to the completed transcription. We show the effectiveness of
our method by way of experiments.
Authors' comments: Proceedings of the 2002 Conference on Empirical Methods in Natural
Language Processing (To appear)
M. S. Mainieri, R. Erichsen Jr
We discuss, in this paper, the dynamical properties of extremely diluted,
non-monotonic neural networks. Assuming parallel updating and the Hebb
prescription for the synaptic connections, a flow equation for the macroscopic
overlap is derived. A rich dynamical phase diagram was obtained, showing a
stable retrieval phase, as well as cycle-two and chaotic behavior. Numerical
simulations were performed, showing good agreement with analytical results.
Furthermore, the simulations give an additional insight into the microscopic
dynamical behavior during the chaotic phase. It is shown that the freezing of
individual neuron states is related to the structure of chaotic attractors.
Authors' comments: 11 pages, 4 figures
R. Bakker, T. Birke, R. Mueller
The first two years of user service of the third generation light source BESSY II emphasized the importance of reliable, comprehensive and dense logging of a few thousand setpoints, readbacks, status and alarm values. Today, data from sources with various characteristics residing in different protected networks are centrally collected and retrievable from any desktop system on the site via a simple CGI program. Data post-processing tools cover Windows applications, IDL, SDDS and custom programs matching users' skills and preferences. In this paper, illustrative sample data explorations are described that underline the importance of the logging system for operations as well as for the understanding of singular events or long-term drifts. Serious shortcomings of the present installation and the focus of further development are described.
Silvia Scarpetta, Zhaoping Li, John Hertz
We introduce a model of generalized Hebbian learning and retrieval in
oscillatory neural networks modeling cortical areas such as hippocampus and
olfactory cortex. Recent experiments have shown that synaptic plasticity
depends on spike timing, especially on synapses from excitatory pyramidal
cells, in hippocampus and in sensory and cerebellar cortex. Here we study how
such plasticity can be used to form memories and input representations when the
neural dynamics are oscillatory, as is common in the brain (particularly in the
hippocampus and olfactory cortex). Learning is assumed to occur in a phase of
neural plasticity, in which the network is clamped to external teaching
signals. By suitable manipulation of the nonlinearity of the neurons or of the
oscillation frequencies during learning, the model can be made, in a retrieval
phase, either to categorize new inputs or to map them, in a continuous fashion,
onto the space spanned by the imprinted patterns. We identify the first of
these possibilities with the function of olfactory cortex and the second with
the observed response characteristics of place cells in hippocampus. We
investigate both kinds of networks analytically and by computer simulations,
and we link the models with experimental findings, exploring, in particular,
how the spike timing dependence of the synaptic plasticity constrains the
computational function of the network and vice versa.
Authors' comments: 24 pages, 4 figures
Carlo Fulvi Mari
A model of the columnar functional organization of neocortical association
areas is studied. The neuronal network is composed of many Hebbian
autoassociators, or modules, each of which interacts with a relatively small
number of the others. Every module encodes and stores a number of elementary
percepts, or features. Memory items, or patterns, are peculiar combinations of
features sparsely distributed over the multi-modular network. Any feature
stored in any module can be involved in several of the stored patterns;
feature-sharing is in fact a source of local ambiguities and, consequently, a
potential cause of erroneous memory retrieval activity spreading through the
model network.
The memory retrieval dynamics of the large multi-modular autoassociator is
investigated by means of quantitative analysis and numerical simulations. An
oscillatory retrieval process is found to be very efficient in overcoming
feature-sharing drawbacks; it requires a mechanism that modulates the
robustness of local attractors to noise, and neuronal activity sparseness such
that quiescent and active modules are about equally noisy. Correlated
activation of interconnected modules and extramodular neuronal contacts more
effective than the intramodular ones seem to be general requirements in order
to efficiently achieve satisfactory quality of memory retrieval. It is also
shown that, even in ideal conditions, some spots of the network cannot be
reached by retrieval activity spread. The locations of these activity isles
depend on the pattern to retrieve and on the cue, while their extent only
depends on the architecture of the graph and the statistics of the stored patterns. The
existence of these isles determines an upper-bound to retrieval quality that
does not depend on the specific retrieval dynamics adopted, nor on whether
feature-sharing is permitted. The oscillatory retrieval process nearly
saturates this bound.
Authors' comments: 23 pages, 8 figures (PDF), pdflatex. Pre-peer-review version (for
copyright reasons). The more readable post-peer-review author's version is
available on the author's personal webpage, which is accessible from
orcid.org and Google Scholar
David Elworthy
It might appear that natural language processing should improve the accuracy
of information retrieval systems, by making available a more detailed analysis
of queries and documents. Although past results appear to show that this is not
so, if the focus is shifted to short phrases rather than full documents, the
situation becomes somewhat different. The ANVIL system uses a natural language
technique to obtain high accuracy retrieval of images which have been annotated
with a descriptive textual caption. The natural language techniques also allow
additional contextual information to be derived from the relation between the
query and the caption, which can help users to understand the overall
collection of retrieval results. The techniques have been successfully used in
an information retrieval system which forms both a testbed for research and the
basis of a commercial system.
Authors' comments: Proceedings of CIKM 2000
C. Brunt, M. H. Heyer
We demonstrate the capability of Principal Component Analysis (PCA) as
applied by Heyer & Schloerb (1997) to extract the statistics of turbulent
interstellar velocity fields as measured by the energy spectrum, $E(k) = k^{-\beta}$.
Turbulent velocity and density fields with known statistics are generated from
fBm simulations. These fields are translated to the observational domain,
$T(x,y,v)$, considering the excitation of molecular rotational energy levels and
radiative transfer. Using PCA and the characterization of velocity and spatial
scales from the eigenvectors and eigenimages respectively, a relationship is
identified which describes the magnitude of line profile differences, $\delta v$,
and the scale, $L$, over which these differences occur, $\delta v = L^{\alpha}$. From a
series of models with varying values of $\beta$, we find $\alpha = 0.33\beta - 0.05$
for $1 < \beta < 3$. This provides the basic calibration between the intrinsic
velocity field statistics and observational measures and a diagnostic for
turbulent flows in the interstellar medium. We also investigate the effects of
noise, line opacity, and finite resolution on these results.
Authors' comments: Accepted by ApJ, 25 pages (includes 10 figures, 3 tables)
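The calibration quoted in the abstract above can be written as a small helper. This is only a sketch: the function names are illustrative, and the inverse mapping is a plain algebraic rearrangement of the quoted relation rather than something the abstract states.

```python
def alpha_from_beta(beta):
    """PCA exponent alpha for an energy-spectrum index beta,
    using the relation quoted in the abstract (1 < beta < 3)."""
    if not 1.0 < beta < 3.0:
        raise ValueError("relation is quoted only for 1 < beta < 3")
    return 0.33 * beta - 0.05


def beta_from_alpha(alpha):
    """Algebraic inverse of the quoted relation (an assumption of this
    sketch): recover beta from a measured alpha."""
    beta = (alpha + 0.05) / 0.33
    if not 1.0 < beta < 3.0:
        raise ValueError("alpha maps outside the calibrated range of beta")
    return beta


alpha = alpha_from_beta(2.0)
beta = beta_from_alpha(alpha)
print(alpha, beta)
```

Restricting both directions to the stated range avoids extrapolating the fit beyond the models it was derived from.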
Atsushi Fujii, Tetsuya Ishikawa
Cross-language information retrieval (CLIR), where queries and documents are
in different languages, needs a translation of queries and/or documents, so as
to standardize both of them into a common representation. For this purpose, the
use of machine translation is an effective approach. However, computational
cost is prohibitive in translating large-scale document collections. To resolve
this problem, we propose a two-stage CLIR method. First, we translate a given
query into the document language, and retrieve a limited number of foreign
documents. Second, we machine translate only those documents into the user
language, and re-rank them based on the translation result. We also show the
effectiveness of our method by way of experiments using Japanese queries and
English technical documents.
Authors' comments: 13 pages, 1 Postscript figure
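The two-stage procedure described in the abstract above can be sketched as follows. Everything concrete here is an assumption made for illustration: the term-overlap score stands in for a real retrieval model, the dictionary lookups stand in for machine translation, and the toy vocabulary and documents are invented.

```python
def overlap(a, b):
    """Number of distinct terms shared by two term lists (toy score)."""
    return len(set(a) & set(b))


def two_stage_clir(query, docs, translate_query, translate_doc, top_n):
    # Stage 1: translate the query into the document language and keep
    # only the top_n foreign documents.
    foreign_query = translate_query(query)
    candidates = sorted(docs, key=lambda d: overlap(foreign_query, docs[d]),
                        reverse=True)[:top_n]
    # Stage 2: translate only those candidates into the user language and
    # re-rank them against the original query.
    translated = {d: translate_doc(docs[d]) for d in candidates}
    return sorted(candidates, key=lambda d: overlap(query, translated[d]),
                  reverse=True)


# Hypothetical toy dictionaries: romanized Japanese <-> English.
JA_TO_EN = {"tokkyo": "patent", "kensaku": "search"}
EN_TO_JA = {"patent": "tokkyo", "retrieval": "kensaku", "search": "kensaku"}


def translate_query(terms):
    return [JA_TO_EN[t] for t in terms if t in JA_TO_EN]


def translate_doc(terms):
    return [EN_TO_JA[t] for t in terms if t in EN_TO_JA]


docs = {
    "d1": ["patent", "law", "office"],
    "d2": ["patent", "retrieval", "system"],
    "d3": ["weather", "report"],
}
ranking = two_stage_clir(["tokkyo", "kensaku"], docs,
                         translate_query, translate_doc, top_n=2)
print(ranking)
```

In this toy run the query translation misses the word "retrieval", so d1 and d2 tie in the first stage; translating the two candidates back into the user language breaks the tie in favour of d2. Only `top_n` documents are ever translated, which is the cost saving the abstract describes.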
Atsushi Fujii, Tetsuya Ishikawa
In information retrieval research, precision and recall have long been used
to evaluate IR systems. However, given that a number of retrieval systems
resembling one another are already available to the public, it is valuable to
retrieve novel relevant documents, i.e., documents that cannot be retrieved by
those existing systems. In view of this problem, we propose an evaluation
method that favors systems retrieving as many novel documents as possible. We
also used our method to evaluate systems that participated in the IREX
workshop.
Authors' comments: 5 pages
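One way to make the idea in the abstract above concrete is to count the relevant documents a system retrieves that no existing system retrieves. The abstract does not give the actual measure, so the definition below and all names in it are assumptions of this sketch.

```python
def novel_relevant(retrieved, relevant, baselines):
    """Relevant documents retrieved by a system but by none of the
    existing (baseline) systems. Hypothetical definition."""
    already_found = set().union(*baselines)
    return (set(retrieved) & set(relevant)) - already_found


relevant = {1, 2, 3, 4}
baselines = [{1, 2}, {2, 5}]       # output of existing systems
system_a = [1, 2, 6]               # finds only what is already known
system_b = [3, 4, 1]               # finds two relevant documents nobody else did

novel_a = novel_relevant(system_a, relevant, baselines)
novel_b = novel_relevant(system_b, relevant, baselines)
print(len(novel_a), len(novel_b))
```

Both systems retrieve two relevant documents, so they are indistinguishable by precision alone; a novelty-oriented count separates them.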
Masaki Murata, Qing Ma, Kiyotaka Uchimoto, Hiromi Ozaku, Masao Utiyama, Hitoshi Isahara
Robertson's 2-Poisson information retrieval model does not use location and
category information. We constructed a framework using location and category
information in a 2-Poisson model. We submitted two systems based on this
framework to the IREX contest, a Japanese-language information retrieval contest
held in Japan in 1999. For precision in the A-judgement measure they scored
0.4926 and 0.4827, the highest values among the 15 teams and 22 systems that
participated in the IREX contest. We describe our systems and the comparative
experiments done when various parameters were changed. These experiments
confirmed the effectiveness of using location and category information.
Authors' comments: 7,8 pages. Computation and Language. IRAL'2000, Hong Kong, September
30, 2000
Gianni Amati, Konstantinos Georgatos
The problem of Information Retrieval is, given a set of documents D and a
query q, providing an algorithm for retrieving all documents in D relevant to
q. However, retrieval should depend on, and be updated by, any preferred set of
relevant documents that the user is able to provide as input; this process is
known as relevance feedback. Recent work in IR has been paying great
attention to models which employ a logical approach; the advantage being that
one can have a simple computable characterization of retrieval on the basis of
a pure logical analysis of retrieval. Most of the logical models make use of
probabilities or similar belief functions in order to introduce the inductive
component whereby uncertainty is treated. Their general paradigm is the
following: find the nature of the conditional $d \rightarrow q$ and then define a
probability on top of it. We simply reverse this point of view: first use the
numerical information, frequencies or probabilities, then define your own
logical consequence. More generally, we claim that retrieval is a form of
deduction. We introduce a simple but powerful logical framework of relevance
feedback, derived from the well-founded area of nonmonotonic logic. This
description can help us evaluate, describe and compare from a theoretical point
of view previous approaches based on conditionals or probabilities.
Authors' comments: 6 pages, Abstract
Michael Hess
A system is described that uses a mixed-level representation of (part of) the
meaning of natural language documents (based on standard Horn Clause Logic) and
a variable-depth search strategy that distinguishes between the different
levels of abstraction in the knowledge representation to locate specific
passages in the documents. Mixed-level representations as well as
variable-depth search strategies are applicable in fields outside that of NLP.
Authors' comments: 8 pages, Proceedings of the Eighth International Conference on Tools
with Artificial Intelligence (TAI'96), Los Alamitos CA
Martin Wechsler, Peter Schauble
A theoretical framework for multimedia information retrieval is introduced
which guarantees optimal retrieval effectiveness. In particular, a Ranking
Principle for Distributed Multimedia-Documents (RPDM) is described together
with an algorithm that satisfies this principle. Finally, the RPDM is shown to
be a generalization of the Probability Ranking Principle (PRP) which guarantees
optimal retrieval effectiveness in the case of text document retrieval. The PRP
justifies theoretically the relevance ranking adopted by modern search engines.
In contrast to the classical PRP, the new RPDM takes into account transmission
and inspection time, and most importantly, aspectual recall rather than simple
recall.
Authors' comments: submission for DL'99. conference compliant format (two-column, etc.)
will be produced later
Sa-Kwang Song, Sung Hyon Myaeng
In any search-based digital library (DL) system dealing with a non-trivial
number of documents, users are often required to go through a long list of
short document descriptions in order to identify what they are looking for. To
tackle the problem, a variety of document organization algorithms and/or
visualization techniques have been used to guide users in selecting relevant
documents. Since these techniques require heavy computations, however, we
developed a presentation server designed to serve as an intermediary between
retrieval servers and clients equipped with a visualization interface. In
addition, we designed our own visual interface by which users can view a set of
documents from different perspectives through layers of document maps. We
finally ran experiments to show that the visual interface, in conjunction with
the presentation server, indeed helps users in selecting relevant documents
from the retrieval results.
Authors' comments: 13 pages
Toshio Aoyagi, Masaki Nomura
Little is known theoretically about the associative memory capabilities of
neural networks in which information is encoded not only in the mean firing
rate but also in the timing of firings. Particularly, in the case that the
fraction of active neurons involved in memorizing patterns becomes small, it is
biologically important to consider the timings of firings and to study how such
consideration influences storage capacities and quality of recalled patterns.
For this purpose, we propose a simple extended model of oscillator neural
networks to allow for expression of a non-firing state. Analyzing
both equilibrium states and dynamical properties in recalling processes, we
find that the system possesses good associative memory.
Authors' comments: 9 pages, 3 Postscript figures, uses epsbox.sty
Davood Rafiei, Alberto Mendelzon
We propose an improvement of the known DFT-based indexing technique for fast retrieval of similar time sequences. We use the last few Fourier coefficients in the distance computation without storing them in the index, since every coefficient at the end is the complex conjugate of a coefficient at the beginning and as strong as its counterpart. We show analytically that this observation can reduce the search time of the index by more than a factor of two. This result was confirmed by our experiments, which were carried out on real stock prices and synthetic data.
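The symmetry the abstract above relies on can be checked numerically. The sketch below assumes real-valued sequences and NumPy's orthonormal DFT (`norm="ortho"`); the helper name, the choice of `k`, and the random-walk test data are illustrative and not taken from the paper.

```python
import numpy as np


def dft_lower_bound(x, y, k, use_symmetry=True):
    """Lower bound on the Euclidean distance between two real sequences,
    computed from the first k DFT coefficients of their difference."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if len(y) != n or k < 1 or 2 * (k - 1) >= n:
        raise ValueError("need equal lengths and 2*(k-1) < n")
    # The orthonormal DFT preserves Euclidean distance (Parseval), so any
    # subset of coefficients yields a lower bound on the true distance.
    diff = np.fft.fft(x - y, norm="ortho")[:k]
    energy = np.abs(diff) ** 2
    if use_symmetry:
        # For real input, coefficient n-j is the complex conjugate of
        # coefficient j, so each stored coefficient j >= 1 also accounts
        # for its unstored mirror image. The DC term (j = 0) has no mirror.
        energy[1:] *= 2.0
    return float(np.sqrt(energy.sum()))


rng = np.random.default_rng(0)
x = rng.standard_normal(64).cumsum()  # random-walk stand-in for a price series
y = rng.standard_normal(64).cumsum()
true_dist = float(np.linalg.norm(x - y))
plain = dft_lower_bound(x, y, k=4, use_symmetry=False)
tight = dft_lower_bound(x, y, k=4, use_symmetry=True)
print(plain, tight, true_dist)
```

With the same stored coefficients, the symmetric bound is never smaller than the plain one and never exceeds the true distance, so an index using it can discard more candidates without introducing false dismissals.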
Julio Gonzalo, Felisa Verdejo, Irina Chugur, Juan Cigarran
The classical, vector space model for text retrieval is shown to give better
results (up to 29% better in our experiments) if WordNet synsets are chosen as
the indexing space, instead of word forms. This result is obtained for a
manually disambiguated test collection (of queries and documents) derived from
the Semcor semantic concordance. The sensitivity of retrieval performance to
(automatic) disambiguation errors when indexing documents is also measured.
Finally, it is observed that if queries are not disambiguated, indexing by
synsets performs (at best) only as well as standard word indexing.
Authors' comments: 7 pages, LaTeX2e, 3 eps figures, uses epsfig, colacl.sty