M. Glowacki, J. R. Allison, E. M. Sadler, V. A. Moss, T. H. Jarrett
We show that mid-infrared data from the all-sky WISE survey can be used as a
robust photometric redshift indicator for powerful radio AGN, in the absence of
other spectroscopic or multi-band photometric information. Our work is
motivated by a desire to extend the well-known K-z relation for radio galaxies
to the wavelength range covered by the all-sky WISE mid-infrared survey. Using
the LARGESS radio spectroscopic sample as a training set, and the mid-infrared
colour information to classify radio sources, we generate a set of redshift
probability distributions for the hosts of high-excitation and low-excitation
radio AGN. We test the method using spectroscopic data from several other radio
AGN studies, and find good agreement between our WISE-based redshift estimates
and published spectroscopic redshifts out to z ~ 1 for galaxies and z ~ 3-4 for
radio-loud QSOs. Our chosen method is also compared against other
classification methods and found to perform reliably. This technique is likely
to be particularly useful in the analysis of upcoming large-area radio surveys
with SKA pathfinder telescopes, and our code is publicly available. As a
consistency check, we show that our WISE-based redshift estimates for sources
in the 843 MHz SUMSS survey reproduce the redshift distribution seen in the
CENSORS study up to z ~ 2. We also discuss two specific applications of our
technique for current and upcoming radio surveys: an interpretation of
large-scale HI absorption surveys, and a determination of whether low-frequency
peaked spectrum sources lie at high redshift.
Authors' comments: 18 pages, 11 figures, 11 tables; submitted to MNRAS
Soufiane Belharbi, Clément Chatelain, Romain Hérault, Sébastien Adam
Training deep neural networks is known to require a large number of training
samples. However, in many applications only a few training samples are available.
In this work, we tackle the issue of training neural networks for
classification tasks when few training samples are available. We attempt to
solve this issue by proposing a new regularization term that constrains the
hidden layers of a network to learn class-wise invariant representations. In
our regularization framework, learning invariant representations is generalized
to the class membership where samples with the same class should have the same
representation. Numerical experiments on MNIST and its variants showed that
our proposal helps improve the generalization of neural networks, particularly
when trained with few samples. We provide the source code of our framework at
https://github.com/sbelharbi/learning-class-invariant-features .
Authors' comments: Submitted to ELSEVIER, 13 pages, 5 figures
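The idea of a class-wise invariance penalty can be illustrated with a minimal sketch. The choice of mean squared Euclidean distance between same-class pairs is an assumption made here for illustration, as is the function name; the abstract does not specify the exact form of the regularization term.

```python
from collections import defaultdict

def class_invariance_penalty(hidden, labels):
    """Mean squared distance between hidden vectors of same-class sample pairs.

    Assumption: squared Euclidean distance stands in for the paper's term.
    """
    by_class = defaultdict(list)
    for vector, label in zip(hidden, labels):
        by_class[label].append(vector)
    total, pairs = 0.0, 0
    for vectors in by_class.values():
        for i in range(len(vectors)):
            for j in range(i + 1, len(vectors)):
                total += sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
                pairs += 1
    # With no same-class pair in the batch there is nothing to penalize.
    return total / pairs if pairs else 0.0
```

In training, such a penalty would be added to the supervised loss with a weighting coefficient, so that hidden representations of samples sharing a class are pulled together.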
Nian Liu, Junwei Han, Ming-Hsuan Yang
Contexts play an important role in the saliency detection task. However, given a context region, not all contextual information is helpful for the final task. In this paper, we propose a novel pixel-wise contextual attention network, i.e., the PiCANet, to learn to selectively attend to informative context locations for each pixel. Specifically, for each pixel, it can generate an attention map in which each attention weight corresponds to the contextual relevance at each context location. An attended contextual feature can then be constructed by selectively aggregating the contextual information. We formulate the proposed PiCANet in both global and local forms to attend to global and local contexts, respectively. Both models are fully differentiable and can be embedded into CNNs for joint training. We also incorporate the proposed models with the U-Net architecture to detect salient objects. Extensive experiments show that the proposed PiCANets can consistently improve saliency detection performance. The global and local PiCANets facilitate learning global contrast and homogeneousness, respectively. As a result, our saliency model can detect salient objects more accurately and uniformly, thus performing favorably against the state-of-the-art methods.
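The attended contextual feature described above can be sketched for a single pixel as a weighted sum of context features. Softmax normalization of the attention scores and the function name `attend` are assumptions for this illustration; the abstract only states that each weight reflects the relevance of a context location.

```python
import math

def attend(context_features, scores):
    """Aggregate context features for one pixel using normalized attention weights.

    context_features: one feature vector per context location.
    scores: one unnormalized relevance score per context location.
    Assumption: weights are obtained with a softmax over the scores.
    """
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(context_features[0])
    attended = [
        sum(w * feat[d] for w, feat in zip(weights, context_features))
        for d in range(dim)
    ]
    return attended, weights
```

A global variant would pass every location of the feature map as context, while a local variant would pass only a neighbourhood around the pixel.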
Sangheum Hwang, Sunggyun Park
We introduce an accurate lung segmentation model for chest radiographs based
on deep convolutional neural networks. Our model is based on atrous
convolutional layers to increase the field-of-view of filters efficiently. To
improve segmentation performance further, we also propose a multi-stage
training strategy, network-wise training, in which the current-stage network is
fed with both the input images and the outputs of the previous-stage network. It
is shown that this strategy can reduce falsely predicted labels and
produce smooth boundaries of lung fields. We evaluate the proposed model on a
common benchmark dataset, JSRT, and achieve state-of-the-art segmentation
performance with far fewer model parameters.
Authors' comments: Accepted to the 3rd Workshop on Deep Learning in Medical Image
Analysis (DLMIA 2017), MICCAI 2017
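How atrous (dilated) convolution enlarges the field of view without adding parameters can be shown with a one-dimensional sketch. The function name and the "valid" boundary handling are choices made here for illustration, not details taken from the paper.

```python
def atrous_conv1d(signal, kernel, rate):
    """'Valid' 1-D dilated correlation: kernel taps are spaced `rate` samples apart."""
    span = (len(kernel) - 1) * rate + 1  # field of view of the dilated kernel
    output = []
    for start in range(len(signal) - span + 1):
        output.append(
            sum(w * signal[start + tap * rate] for tap, w in enumerate(kernel))
        )
    return output
```

With a three-tap kernel, `rate=1` covers three consecutive samples, whereas `rate=2` covers a span of five samples using the same three weights.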
C. A. P. Bengaly, C. P. Novaes, H. S. Xavier, M. Bilicki, A. Bernui, J. S. Alcaniz
We probe the isotropy of the Universe with the largest all-sky photometric
redshift dataset currently available, namely WISE~$\times$~SuperCOSMOS. We
search for dipole anisotropy of galaxy number counts in multiple redshift
shells within the $0.10 < z < 0.35$ range, for two subsamples drawn from the
same parent catalogue. Our results show that the dipole directions are in good
agreement with most of the previous analyses in the literature, and in most
redshift bins the dipole amplitudes are well consistent with $\Lambda$CDM-based
mocks in the cleanest sample of this catalogue. In the $z<0.15$ range, however,
we obtain a persistently large anisotropy in both subsamples of our dataset.
Overall, we report no significant evidence against the isotropy assumption in
this catalogue except for the lowest redshift ranges. The origin of the latter
discrepancy is unclear, and improved data may be needed to explain it.
Authors' comments: 5 pages, 4 figures, 2 tables. Published in MNRAS
Rolf Jagerman, Julia Kiseleva, Maarten de Rijke
List-wise learning to rank methods are considered to be the state-of-the-art. One of the major problems with these methods is that the ambiguous nature of relevance labels in learning to rank data is ignored. Ambiguity of relevance labels refers to the phenomenon that multiple documents may be assigned the same relevance label for a given query, so that no preference order should be learned for those documents. In this paper we propose a novel sampling technique for computing a list-wise loss that can take into account this ambiguity. We show the effectiveness of the proposed method by training a 3-layer deep neural network. We compare our new loss function to two strong baselines: ListNet and ListMLE. We show that our method generalizes better and significantly outperforms other methods on the validation and test sets.
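For context, the ListNet baseline named above is commonly written as a cross-entropy between top-one probability distributions. The sketch below shows that baseline loss only; it does not reproduce the sampling technique proposed in the paper, and the function names are illustrative.

```python
import math

def softmax(values):
    """Top-one probabilities for a list of scores or relevance labels."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    norm = sum(exps)
    return [e / norm for e in exps]

def listnet_loss(scores, labels):
    """Cross-entropy between label-induced and score-induced top-one distributions."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))
```

Documents that share a relevance label receive the same target probability, which is where the label ambiguity discussed in the abstract enters a list-wise loss.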
Byeonghee Yu, J. Colin Hill, Blake D. Sherwin
Delensing, the removal of the limiting lensing B-mode background, is crucial
for the success of future cosmic microwave background (CMB) surveys in
constraining inflationary gravitational waves (IGWs). In recent work, delensing
with large-scale structure tracers has emerged as a promising method both for
improving constraints on IGWs and for testing delensing methods for future use.
However, the delensing fractions (i.e., the fraction of the lensing B-mode
power removed) achieved by recent efforts have been only $20-30\%$. In this
work, we provide a detailed characterization of a full-sky, dust-cleaned cosmic
infrared background (CIB) map for delensing and construct a further-improved
delensing template by adding additional tracers to increase delensing
performance. In particular, we build a multitracer delensing template by
combining the dust-cleaned Planck CIB map with a reconstructed CMB lensing map
from Planck and a galaxy number density map from the Wide-field Infrared Survey
Explorer (WISE) satellite. For this combination, we calculate the relevant
weightings by fitting smooth templates to measurements of all the cross- and
auto-spectra of these maps. On a large fraction of the sky
($f_\mathrm{sky}=0.43$), we demonstrate that our maps are capable of providing
a delensing factor of $43 \pm 1\%$; using a more restrictive mask
($f_\mathrm{sky}=0.11$), the delensing factor reaches $48 \pm 1\%$. For
low-noise surveys, our delensing maps, which cover much of the sky, can thus
improve constraints on the tensor-to-scalar ratio ($r$) by nearly a factor of
2. The delensing tracer maps are made publicly available, and we encourage
their use in ongoing and upcoming B-mode surveys.
Authors' comments: 10 pages, 7 figures, data products available at
http://www.sns.ias.edu/~jch/delens/
Ruediger Ehlers
We present an approach for the verification of feed-forward neural networks in which all nodes have a piece-wise linear activation function. Such networks are often used in deep learning and have been shown to be hard to verify for modern satisfiability modulo theory (SMT) and integer linear programming (ILP) solvers. The starting point of our approach is the addition of a global linear approximation of the overall network behavior to the verification problem that helps with SMT-like reasoning over the network behavior. We present a specialized verification algorithm that employs this approximation in a search process in which it infers additional node phases for the non-linear nodes in the network from partial node phase assignments, similar to unit propagation in classical SAT solving. We also show how to infer additional conflict clauses and safe node fixtures from the results of the analysis steps performed during the search. The resulting approach is evaluated on collision avoidance and handwritten digit recognition case studies.
Jianqiao Wangni
The $L_1$-regularized models are widely used for sparse regression or
classification tasks. In this paper, we propose the orthant-wise passive
descent algorithm (OPDA) for optimizing $L_1$-regularized models, as an
improved substitute for proximal algorithms, which are currently the standard
tools for optimizing such models. OPDA uses a stochastic variance-reduced
gradient (SVRG) to initialize the descent direction, then applies a novel
alignment operator that encourages each element to keep the same sign after one
update iteration, so that the parameter remains in the same orthant as before. It
also explicitly suppresses the magnitude of each element to impose sparsity.
A quasi-Newton update can be used to incorporate curvature information
and accelerate convergence. We prove a linear convergence rate for OPDA on
general smooth and strongly-convex loss functions. By conducting experiments on
$L_1$-regularized logistic regression and convolutional neural networks, we
show that OPDA outperforms state-of-the-art stochastic proximal algorithms,
implying a wide range of applications in training sparse models.
Authors' comments: Accepted to The Thirty-Second AAAI Conference on Artificial
Intelligence (AAAI-18). Feb 2018, New Orleans
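Two building blocks mentioned above can be sketched briefly: the soft-thresholding step used by standard proximal methods for $L_1$ penalties, and a simplified sign-preserving projection in the spirit of orthant-wise methods. The projection shown is an assumption for illustration and is not the paper's alignment operator; in this simplified form, coordinates that start at zero stay at zero.

```python
def soft_threshold(weights, step, lam):
    """Proximal operator of step * lam * ||w||_1: shrink each coordinate toward zero."""
    shrink = step * lam
    return [max(abs(w) - shrink, 0.0) * (1.0 if w > 0 else -1.0) for w in weights]

def orthant_project(new_weights, old_weights):
    """Zero every coordinate whose sign differs from the previous iterate.

    Simplified illustration only, not the OPDA alignment operator.
    """
    return [n if n * o > 0 else 0.0 for n, o in zip(new_weights, old_weights)]
```

Both operations produce exact zeros, which is what makes the resulting models sparse.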
Hanno Rein, Daniel Tamayo
Hamiltonian systems such as the gravitational N-body problem have
time-reversal symmetry. However, all numerical N-body integration schemes,
including symplectic ones, respect this property only approximately. In this
paper, we present the new N-body integrator JANUS, for which we achieve exact
time-reversal symmetry by combining integer and floating point arithmetic.
JANUS is explicit, formally symplectic and satisfies Liouville's theorem
exactly. Its order is even and can be adjusted between two and ten. We discuss
the implementation of JANUS and present tests of its accuracy and speed by
performing and analyzing long-term integrations of the Solar System. We show
that JANUS is fast and accurate enough to tackle a broad class of dynamical
problems. We also discuss the practical and philosophical implications of
running exactly time-reversible simulations.
Authors' comments: Accepted for publication by MNRAS, 7 pages, 4 figures, source code
available at https://github.com/hannorein/rebound , iPython notebooks to
reproduce figures available at https://github.com/hannorein/JanusPaper
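The principle of combining integer and floating-point arithmetic can be illustrated with a toy second-order leapfrog for a harmonic oscillator. This is a minimal sketch under stated assumptions and not the JANUS implementation: the grid resolution `SCALE`, the force law and all function names are hypothetical. Positions and velocities are stored as integers, each floating-point increment is rounded to the grid before it is added, and integer addition can be undone exactly.

```python
SCALE = 10**12  # hypothetical grid resolution: integer units per unit length

def to_grid(value):
    """Round a floating-point quantity onto the integer grid."""
    return round(value * SCALE)

def acceleration(x_int):
    """Toy force law a = -x (harmonic oscillator), evaluated in floating point."""
    return -(x_int / SCALE)

def leapfrog_step(x_int, v_int, dt):
    """Drift-kick-drift step; every state update is an exact integer addition."""
    x_int += to_grid(0.5 * dt * (v_int / SCALE))
    v_int += to_grid(dt * acceleration(x_int))
    x_int += to_grid(0.5 * dt * (v_int / SCALE))
    return x_int, v_int

def integrate(x_int, v_int, dt, steps):
    """Apply `steps` leapfrog steps with time step `dt` (negative dt runs backwards)."""
    for _ in range(steps):
        x_int, v_int = leapfrog_step(x_int, v_int, dt)
    return x_int, v_int

start = (to_grid(1.0), to_grid(0.0))
forward = integrate(*start, dt=0.01, steps=1000)
backward = integrate(*forward, dt=-0.01, steps=1000)
```

Each sub-step adds an increment that depends only on the variable not being updated, and rounding is symmetric under a sign flip, so running with `-dt` subtracts exactly what was added and `backward` recovers `start` bit for bit.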
Mandar Kulkarni, Shirish Karande
Deep learning has shown promising results in many machine learning applications. The hierarchical feature representation built by deep networks enables compact and precise encoding of the data. A kernel analysis of trained deep networks has demonstrated that deeper layers yield simpler and more accurate data representations. In this paper, we propose an approach for layer-wise training of a deep network for the supervised classification task. The transformation matrix of each layer is obtained by solving an optimization problem aimed at a better representation, where each subsequent layer builds its representation on top of the features produced by the previous layer. We compare the performance of our approach with that of a DNN with the same architecture trained using back-propagation. Experimental results on real image datasets demonstrate the efficacy of our approach. We also perform a kernel analysis of the layer representations to validate the claim of better feature encoding.
G. Mountrichas, I. Georgantopoulos, N. J. Secrest, I. Ordovas-Pascual, A. Corral, A. Akylas, S. Mateos, F. J. Carrera et al.
Mid-IR colour selection techniques have proved to be very efficient in
finding AGN. This is because the AGN heats the surrounding dust producing warm
mid-IR colours. Using the WISE 3.6, 4.5 and 12 $\mu m$ colours, the largest
sample of IR selected AGN has already been produced containing 1.4 million AGN
over the whole sky. Here, we explore the X-ray properties of this AGN sample by
cross-correlating it with the subsample of the 3XMM X-ray catalogue that has
available X-ray spectra and at the same time optical spectroscopy from SDSS.
Our goal is to find rare luminous obscured AGN. Our final sample contains 65
QSOs with $\rm{log}\,\nu L_\nu \ge 46.2$\,erg\,s$^{-1}$. This IR luminosity cut
corresponds to $\rm{log}\,L_X \approx 45$\,erg\,s$^{-1}$, at the median
redshift of our sample ($z=2.3$), that lies at the bright end of the X-ray
luminosity function at $z>2$. The X-ray spectroscopic analysis reveals seven
obscured AGN having a column density $\rm N_H>10^{22} cm^{-2}$. Six of them
show evidence for broad [CIV] absorption lines and five are classified as
BALQSOs. We fit the optical spectra of our X-ray absorbed sources to estimate
the optical reddening. We find that none of these show any obscuration
according to the optical continuum. These sources add to the growing evidence
for populations of luminous QSOs with evidence for substantial absorption by
outflowing ionised material, similar to those expected to be emerging from
their absorbing cocoons in the framework of AGN/galaxy co-evolution.
Authors' comments: 10 pages, 5 figures, 3 Tables, MNRAS accepted
Gilles Blanchard, Pierre Neuvial, Etienne Roquain
We introduce a general methodology for post hoc inference in a large-scale multiple testing framework. The approach is called "user-agnostic" in the sense that the statistical guarantee on the number of correct rejections holds for any set of candidate items selected by the user (after having seen the data). This task is investigated by defining a suitable criterion, named the joint-family-wise-error rate (JER for short). We propose several procedures for controlling the JER, with a special focus on incorporating dependencies while adapting to the unknown quantity of signal (via a step-down approach). We show that our proposed setting incorporates as particular cases a version of the higher criticism as well as the closed testing based approach of Goeman and Solari (2011). Our theoretical statements are supported by numerical experiments.
Bo Yang, Hui Liu, He Zhong, Zhangxin Chen
This research investigates the implementation of a block-wise ILU(k)
preconditioner on GPUs. The block-wise ILU(k) algorithm requires both the level
k and the block size to be designed as variables. A decoupled ILU(k) algorithm
consists of a symbolic phase and a factorization phase. In the symbolic phase,
an ILU(k) nonzero pattern is established from the point-wise structure extracted
from a block-wise matrix. In the factorization phase, the block-wise matrix
with a variable block size is factorized into a block lower triangular matrix
and a block upper triangular matrix. A further diagonal factorization must then
be performed on the block upper triangular matrix so that a parallel
triangular solver can be used on the GPU. We also present numerical experiments
that study the preconditioner's behavior for different levels k and block sizes.
Authors' comments: 14 pages
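The symbolic phase can be illustrated with the textbook level-of-fill rule for point-wise ILU(k). The sketch below is a plain sequential version on a dense level table, with illustrative names; it is not the paper's block-wise GPU implementation.

```python
def iluk_pattern(nonzeros, n, level):
    """Return the set of (row, col) positions kept by ILU(level).

    nonzeros: set of (row, col) positions of the original matrix entries.
    Original entries and the diagonal start at level 0; fill-in created through
    pivot p gets level lev[i][p] + lev[p][j] + 1, and entries above `level` are dropped.
    """
    inf = float("inf")
    lev = [
        [0 if (i, j) in nonzeros or i == j else inf for j in range(n)]
        for i in range(n)
    ]
    for i in range(1, n):
        for p in range(i):
            if lev[i][p] > level:
                continue  # entry (i, p) is not in the pattern, so it creates no fill
            for j in range(p + 1, n):
                lev[i][j] = min(lev[i][j], lev[i][p] + lev[p][j] + 1)
    return {(i, j) for i in range(n) for j in range(n) if lev[i][j] <= level}
```

For an arrow-shaped matrix (dense first row and column), level 0 keeps the original pattern while level 1 already fills the whole matrix; a tridiagonal matrix produces no fill at any level.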
Anatol Odzijewicz, Grzegorz Jakimowicz, Aneta Sliżewska
In this paper we investigate fiber-wise linear complex Banach sub-Poisson
structures defined canonically by the structure of a W*-algebra M. In
particular we show that these structures are arranged in the short exact
sequence of complex Banach sub-Poisson VB-groupoids with the groupoid of
partially invertible elements of M as the side groupoid.
Authors' comments: 52 pages
Vitaly Feldman, Badih Ghazi
Several well-studied models of access to data samples, including statistical
queries, local differential privacy and low-communication algorithms rely on
queries that provide information about a function of a single sample. (For
example, a statistical query (SQ) gives an estimate of $\mathbf{E}_{x \sim D}[q(x)]$
for any choice of the query function $q$ mapping $X$ to the reals, where $D$ is
an unknown data distribution over $X$.) Yet some data analysis algorithms rely
on properties of functions that depend on multiple samples. Such algorithms
would be naturally implemented using $k$-wise queries each of which is
specified by a function $q$ mapping $X^k$ to the reals. Hence it is natural to
ask whether algorithms using $k$-wise queries can solve learning problems more
efficiently and by how much.
Blum, Kalai and Wasserman (2003) showed that for any weak PAC learning
problem over a fixed distribution, the complexity of learning with $k$-wise SQs
is smaller than the (unary) SQ complexity by a factor of at most $2^k$. We show
that for more general problems over distributions the picture is substantially
richer. For every $k$, the complexity of distribution-independent PAC learning
with $k$-wise queries can be exponentially larger than learning with
$(k+1)$-wise queries. We then give two approaches for simulating a $k$-wise
query using unary queries. The first approach exploits the structure of the
problem that needs to be solved. It generalizes and strengthens (exponentially)
the results of Blum et al. It allows us to derive strong lower bounds for
learning DNF formulas and stochastic constraint satisfaction problems that hold
against algorithms using $k$-wise queries. The second approach exploits the
$k$-party communication complexity of the $k$-wise query function.
Authors' comments: 32 pages, Appeared in Innovations in Theoretical Computer Science
(ITCS) 2017
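The difference between a unary and a $k$-wise query can be sketched by simulating both with empirical averages over a sample. This is an illustration only: a real SQ oracle may return any value within a tolerance of the true expectation, and the function names and the use of disjoint $k$-tuples are assumptions made here.

```python
def unary_sq(samples, query):
    """Empirical estimate of E[q(x)] for a query on single samples."""
    return sum(query(x) for x in samples) / len(samples)

def k_wise_sq(samples, query, k):
    """Empirical estimate of E[q(x_1, ..., x_k)] using disjoint k-tuples of samples."""
    groups = [samples[i:i + k] for i in range(0, len(samples) - k + 1, k)]
    return sum(query(*group) for group in groups) / len(groups)
```

For example, a 2-wise query such as "do the two samples differ?" depends jointly on a pair of samples and cannot be written as a function of one sample alone.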
Jiwoong Kim
Applying the minimum distance method to the linear regression model to estimate regression parameters is difficult and computationally expensive because of the complexity of its distance function. To reduce the computational cost, this paper proposes a fast algorithm that mainly uses coordinate-wise minimization to estimate the regression parameters. An R package based on the proposed algorithm and written in Rcpp is available online.
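Coordinate-wise minimization itself can be sketched on a simpler objective. The example below uses ordinary least squares as a stand-in, which is an assumption made purely for illustration: the minimum distance objective of the paper is different and is not reproduced here.

```python
def coordinate_descent(X, y, sweeps=100):
    """Minimize the squared error ||y - X beta||^2 one coordinate at a time.

    Stand-in objective: least squares, not the paper's minimum distance function.
    """
    n_features = len(X[0])
    beta = [0.0] * n_features
    residual = list(y)  # residual for beta = 0
    for _ in range(sweeps):
        for j in range(n_features):
            column = [row[j] for row in X]
            norm = sum(c * c for c in column)
            if norm == 0.0:
                continue  # an all-zero column cannot change the fit
            # Exact one-dimensional minimizer along coordinate j.
            delta = sum(c * r for c, r in zip(column, residual)) / norm
            beta[j] += delta
            residual = [r - delta * c for r, c in zip(residual, column)]
    return beta
```

Each inner update solves a one-dimensional problem in closed form, which is what makes coordinate-wise schemes cheap per iteration.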
Žiga Emeršič, Luka Lan Gabriel, Vitomir Štruc, Peter Peer
Object detection and segmentation form the basis for many tasks in
computer and machine vision. In biometric recognition systems the detection of
the region-of-interest (ROI) is one of the most crucial steps in the overall
processing pipeline, significantly impacting the performance of the entire
recognition system. Existing approaches to ear detection, for example, are
commonly susceptible to the presence of severe occlusions, ear accessories or
variable illumination conditions and often deteriorate in their performance if
applied on ear images captured in unconstrained settings. To address these
shortcomings, we present in this paper a novel ear detection technique based on
convolutional encoder-decoder networks (CEDs). For our technique, we formulate
the problem of ear detection as a two-class segmentation problem and train a
convolutional encoder-decoder network based on the SegNet architecture to
distinguish between image-pixels belonging to either the ear or the non-ear
class. The output of the network is then post-processed to further refine the
segmentation result and return the final locations of the ears in the input
image. Different from competing techniques from the literature, our approach
does not simply return a bounding box around the detected ear, but provides
detailed, pixel-wise information about the location of the ears in the image.
Our experiments on a dataset gathered from the web (a.k.a. in the wild) show
that the proposed technique ensures good detection results in the presence of
various covariate factors and significantly outperforms the existing
state-of-the-art.
Authors' comments: 12 pages
Vijaya Krishna Yalavarthi, Xiangyu Ke, Arijit Khan
Crowdsourcing is becoming increasingly important in entity resolution tasks,
such as clustering of images and natural language processing, because of their
inherent complexity. Humans can provide more insightful information for these
difficult problems than machine-based automatic techniques.
Nevertheless, human workers can make mistakes due to lack of domain expertise
or seriousness, ambiguity, or even due to malicious intents. The
state-of-the-art literature usually deals with human errors via majority voting
or by assigning a universal error rate over crowd workers. However, such
approaches are incomplete, and often inconsistent, because the expertise of
crowd workers is diverse with possible biases, thereby making it largely
inappropriate to assume a universal error rate for all workers over all
crowdsourcing tasks.
To this end, we mitigate the above challenges by considering an uncertain
graph model, where the edge probability between two records A and B denotes the
ratio of crowd workers who voted Yes on the question of whether A and B are the
same entity. In order to reflect independence across different crowdsourcing tasks,
we apply the well-established notion of possible worlds, and develop
parameter-free algorithms both for next crowdsourcing and for entity
resolution problems. In particular, using our framework, the problem of entity
resolution becomes equivalent to finding the maximum-likelihood clustering;
whereas for the next crowdsourcing, we identify the record pair that maximally
increases the reliability of the maximum-likelihood clustering. Based on
detailed empirical analysis over real-world datasets, we find that our proposed
solution, PERC (probabilistic entity resolution with imperfect crowd) improves
the quality by 15% and reduces the overall cost by 50% for the
crowdsourcing-based entity resolution problem.
Authors' comments: 10 Pages, 11 Figures
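The edge probability defined above, and one possible way to score a clustering with it, can be sketched as follows. The likelihood shown assumes independent edges and treats unasked pairs as probability 0.5; both are assumptions made for illustration and may differ from the PERC formulation.

```python
from itertools import combinations

def edge_probability(votes):
    """Fraction of crowd workers who voted Yes (True) for a record pair."""
    return sum(votes) / len(votes)

def clustering_likelihood(clusters, edge_prob):
    """Likelihood of a clustering under independent edges (illustrative assumption).

    clusters: list of lists of record names.
    edge_prob: dict mapping a sorted (a, b) pair to its edge probability.
    """
    label = {record: idx for idx, members in enumerate(clusters) for record in members}
    likelihood = 1.0
    for a, b in combinations(sorted(label), 2):
        p = edge_prob.get((a, b), 0.5)  # assumption: unasked pairs are uninformative
        likelihood *= p if label[a] == label[b] else 1.0 - p
    return likelihood
```

Under these assumptions, a maximum-likelihood clustering is the partition of the records that maximizes this product.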
Aditya A. Shastri, Deepti Tamrakar, Kapil Ahuja
Breast cancer is becoming pervasive with each passing day. Hence, its early
detection is a big step in saving the life of any patient. Mammography is a
common tool in breast cancer diagnosis. The most important step here is
classification of mammogram patches as normal-abnormal and benign-malignant.
Texture of a breast in a mammogram patch plays a significant role in these
classifications. We propose a variation of Histogram of Gradients (HOG) and
Gabor filter combination called Histogram of Oriented Texture (HOT) that
exploits this fact. We also revisit the Pass Band - Discrete Cosine Transform
(PB-DCT) descriptor that captures texture information well. All features of a
mammogram patch may not be useful. Hence, we apply a feature selection
technique called Discrimination Potentiality (DP). Our resulting descriptors,
DP-HOT and DP-PB-DCT, are compared with the standard descriptors.
Density of a mammogram patch is important for classification, and has not
been studied exhaustively. The Image Retrieval in Medical Application (IRMA)
database from RWTH Aachen, Germany is a standard database that provides
mammogram patches, and most researchers have tested their frameworks only on a
subset of patches from this database. We apply our two new descriptors on all
images of the IRMA database for density-wise classification, and compare with
the standard descriptors. We achieve higher accuracy than all of the existing
standard descriptors (more than 92%).
Authors' comments: 28 Pages, 8 Figures, and 7 Tables
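The gradient-orientation histogram underlying HOG-style descriptors can be sketched for a single patch. This is a generic illustration with an assumed bin count of nine and central-difference gradients; it does not include the Gabor filtering, the HOT variation or the DP feature selection proposed in the paper.

```python
import math

def orientation_histogram(patch, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0 to 180 degrees).

    patch: 2-D list of intensities; border pixels are skipped.
    """
    hist = [0.0] * bins
    height, width = len(patch), len(patch[0])
    bin_width = 180.0 / bins
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = patch[r][c + 1] - patch[r][c - 1]  # horizontal central difference
            gy = patch[r + 1][c] - patch[r - 1][c]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle / bin_width), bins - 1)] += magnitude
    return hist
```

A patch with a purely horizontal intensity ramp puts all of its weight into the first bin, while a vertical ramp concentrates it in the bin containing 90 degrees.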