Talha Bin Masood, Ingrid Hotz
The analysis of contours of scalar fields plays an important role in visualization. For example, the contour tree and contour statistics can be used as a means for interaction and filtering or as signatures. In the context of tensor field analysis, such methods are also interesting for the analysis of derived scalar invariants. While there are standard algorithms to compute and analyze contours, they are not directly applicable to tensor invariants when using component-wise tensor interpolation. In this chapter we present an accurate derivation of the contour spectrum for invariants with quadratic behavior computed from two-dimensional piece-wise linear tensor fields. For this work, we are mostly motivated by a consistent treatment of the anisotropy field, which plays an important role as a stability measure for tensor field topology. We show that it is possible to derive an analytical expression for the distribution of the invariant values in this setting, which we work out in full detail for the anisotropy as an example. Our derivation is based on a topological subdivision of the mesh into triangles that exhibit monotonic behavior. This triangulation can also be used directly to compute the accurate contour tree with standard algorithms. We compare the results to a naive approach based on linear interpolation on the original mesh or the subdivision.
Weizhe Liu, Mathieu Salzmann, Pascal Fua
State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density. While effective, deep learning approaches are vulnerable to adversarial attacks, which, in a crowd-counting context, can lead to serious security issues. However, attack and defense mechanisms have been virtually unexplored in regression tasks, let alone for crowd density estimation. In this paper, we investigate the effectiveness of existing attack strategies on crowd-counting networks, and introduce a simple yet effective pixel-wise detection mechanism. It builds on the intuition that, when attacking a multitask network, in our case estimating crowd density and scene depth, both outputs will be perturbed, and thus the second one can be used for detection purposes. We will demonstrate that this significantly outperforms heuristic and uncertainty-based strategies.
Fu-Zhao Ou, Yuan-Gen Wang, Jin Li, Guopu Zhu, Sam Kwong
No-reference image quality assessment (NR-IQA) has received increasing attention in the IQA community since a reference image is not always available. Real-world images generally suffer from various types of distortion. Unfortunately, existing NR-IQA methods do not work with all types of distortion. It is a challenging task to develop a universal NR-IQA method capable of evaluating all types of distorted images. In this paper, we propose a universal NR-IQA method based on controllable list-wise ranking (CLRIQA). First, to extend the authentically distorted image dataset, we present an imaging-heuristic approach, in which over- and underexposure are formulated as an inverse of the Weber-Fechner law, and a fusion strategy and probabilistic compression are adopted, to generate the degraded real-world images. These degraded images are label-free yet associated with quality ranking information. We then design a controllable list-wise ranking function by limiting the rank range and introducing an adaptive margin to tune the rank interval. Finally, the extended dataset and the controllable list-wise ranking function are used to pre-train a CNN. Moreover, in order to obtain an accurate prediction model, we take advantage of the original dataset to further fine-tune the pre-trained network. Experiments evaluated on four benchmark datasets (i.e. LIVE, CSIQ, TID2013, and LIVE-C) show that the proposed CLRIQA improves the state of the art by over 9% in terms of overall performance. The code and model are publicly available at https://github.com/GZHU-Image-Lab/CLRIQA.
Hyunsung D. Jun, Roberto J. Assef, Franz E. Bauer, Andrew W. Blain, Tanio Diaz-Santos, Peter R. Eisenhardt, Daniel Stern, Chao-Wei Tsai et al.
We present VLT/XSHOOTER rest-frame UV-optical spectra of 10 Hot Dust-Obscured
Galaxies (Hot DOGs) at $z\sim2$ to investigate AGN diagnostics and to assess
the presence and effect of ionized gas outflows. Most Hot DOGs in this sample
are narrow-line dominated AGN (type 1.8 or higher), and have higher Balmer
decrements than typical type 2 quasars. Almost all (8/9) sources show evidence
for ionized gas outflows in the form of broad and blueshifted [O III] profiles,
and some sources have such profiles in H$\alpha$ (5/7) or [O II] (3/6).
Combined with the literature, these results support additional sources of
obscuration beyond the simple torus invoked by AGN unification models. Outflow
rates derived from the broad [O III] line ($\rm
\gtrsim10^{3}\,M_{\odot}\,yr^{-1}$) are greater than the black hole accretion
and star formation rates, with feedback efficiencies ($\sim0.1-1\%$) consistent
with negative feedback to the host galaxy's star formation in merger-driven
quasar activity scenarios. We find the broad emission lines in luminous,
obscured quasars are often better explained by outflows within the narrow line
region, and caution that black hole mass estimates for such sources in the
literature may have substantial uncertainty. Regardless, we find lower bounds
on the Eddington ratio for Hot DOGs near unity.
Authors' comments: 20 pages, accepted to ApJ, minor corrections (typos and references)
in section 4.4
Shaohuai Shi, Zhenheng Tang, Qiang Wang, Kaiyong Zhao, Xiaowen Chu
To reduce the long training time of large deep neural network (DNN) models,
distributed synchronous stochastic gradient descent (S-SGD) is commonly used on
a cluster of workers. However, the speedup brought by multiple workers is
limited by the communication overhead. Two approaches, namely pipelining and
gradient sparsification, have been separately proposed to alleviate the impact
of communication overheads. Yet, the gradient sparsification methods can only
initiate the communication after the backpropagation, and hence miss the
pipelining opportunity. In this paper, we propose a new distributed
optimization method named LAGS-SGD, which combines S-SGD with a novel
layer-wise adaptive gradient sparsification (LAGS) scheme. In LAGS-SGD, every
worker selects a small set of "significant" gradients from each layer
independently whose size can be adaptive to the communication-to-computation
ratio of that layer. The layer-wise nature of LAGS-SGD opens the opportunity of
overlapping communications with computations, while the adaptive nature of
LAGS-SGD makes it flexible to control the communication time. We prove that
LAGS-SGD has convergence guarantees and it has the same order of convergence
rate as vanilla S-SGD under a weak analytical assumption. Extensive experiments
are conducted to verify the analytical assumption and the convergence
performance of LAGS-SGD. Experimental results on a 16-GPU cluster show that
LAGS-SGD outperforms the original S-SGD and existing sparsified S-SGD without
losing obvious model accuracy.
Authors' comments: 8 pages. To appear at ECAI 2020
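The layer-wise selection described above can be sketched as follows. This is a minimal illustration, not the authors' LAGS-SGD implementation: it assumes that "significant" means largest in magnitude (top-k), and the function names and the per-layer `densities` argument are hypothetical. Residual accumulation and the actual communication step are omitted.

```python
def sparsify_layer(grad, k):
    """Keep the k largest-magnitude entries of one layer's gradient; zero the rest."""
    if k <= 0:
        return [0.0] * len(grad)
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def sparsify_layerwise(layer_grads, densities):
    """Apply top-k selection to each layer independently.

    layer_grads: list of flat gradient lists, one per layer.
    densities:   assumed per-layer fraction of entries to keep, in (0, 1].
    """
    sparse = []
    for grad, density in zip(layer_grads, densities):
        k = max(1, int(len(grad) * density))  # always send at least one entry
        sparse.append(sparsify_layer(grad, k))
    return sparse


grads = [[0.1, -2.0, 0.3, 0.05], [1.5, -0.2]]
sparse = sparsify_layerwise(grads, densities=[0.25, 0.5])
print(sparse)
```

Because each layer is handled on its own, a layer's sparse gradient could in principle be sent as soon as backpropagation has produced it, which is the overlap opportunity the abstract refers to.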
Xiang Gao, Wei Hu, Guo-Jun Qi
Recent advances in Graph Convolutional Neural Networks (GCNNs) have shown their efficiency for non-Euclidean data on graphs, which often require a large amount of labeled data with high cost. It is thus critical to learn graph feature representations in an unsupervised manner in practice. To this end, we propose a novel unsupervised learning of Graph Transformation Equivariant Representations (GraphTER), aiming to capture intrinsic patterns of graph structure under both global and local transformations. Specifically, we sample different groups of nodes from a graph and then transform them node-wise isotropically or anisotropically. Then, we self-train a representation encoder to capture the graph structures by reconstructing these node-wise transformations from the feature representations of the original and transformed graphs. In experiments, we apply the learned GraphTER to graphs of 3D point cloud data, and results on point cloud segmentation/classification show that GraphTER significantly outperforms state-of-the-art unsupervised approaches and comes much closer to the upper bound set by the fully supervised counterparts. The code is available at: https://github.com/gyshgx868/graph-ter.
Rafael Díaz Hernández Rojas, Giorgio Parisi, Federico Ricci-Tersenghi
Jamming is a phenomenon shared by a wide variety of systems, such as granular
materials, foams, and glasses in their high density regime. This has motivated
the development of a theoretical framework capable of explaining many of their
static critical properties with a unified approach. However, the dynamics
occurring in the vicinity of the jamming point has received little attention
and the problem of finding a connection with the local structure of the
configuration remains unexplored. Here we address this issue by constructing
physically well defined structural variables using the information contained in
the network of contacts of jammed configurations, and then showing that such
variables yield a resilient statistical description of the particle-wise
dynamics near this critical point. Our results are based on extensive numerical
simulations of systems of spherical particles that allow us to statistically
characterize the trajectories of individual particles in terms of their first
two moments. We first demonstrate that, besides displaying a broad distribution
of mobilities, particles may also have preferential directions of motion. Next,
we associate each of these features with a structural variable computed
uniquely in terms of the contact vectors at jamming, obtaining considerably
high statistical correlations. The robustness of our approach is confirmed by
testing two types of dynamical protocols, namely Molecular Dynamics and Monte
Carlo, with different types of interaction. We also provide evidence that the
dynamical regime we study here is dominated by anharmonic effects and therefore
it cannot be described properly in terms of vibrational modes. Finally, we show
that correlations decay slowly and in an interaction-independent fashion,
suggesting a universal rate of information loss.
Authors' comments: Same as published version; better figures placement
Mengzhuo Guo, Zhongzhi Xu, Qingpeng Zhang, Xiuwu Liao, Jiapeng Liu
Ordinal regression predicts the labels of objects that exhibit a natural ordering, which is important to many managerial problems such as credit scoring and clinical diagnosis. In these problems, the ability to explain how the attributes affect the prediction is critical to users. However, most, if not all, existing ordinal regression models simplify such explanation to constant coefficients for the main and interaction effects of individual attributes. Such an explanation cannot characterize the contributions of attributes at different value scales. To address this challenge, we propose a new explainable ordinal regression model, namely, the Explainable Ordinal Factorization Model (XOFM). XOFM uses piece-wise linear functions to approximate the actual contributions of individual attributes and their interactions. Moreover, XOFM introduces a novel ordinal transformation process to assign each object the probabilities of belonging to multiple relevant classes, instead of fixing boundaries to differentiate classes. XOFM is based on Factorization Machines to handle the potential sparsity problem that results from discretizing the attribute scales. Comprehensive experiments with benchmark datasets and baseline models demonstrate that the proposed XOFM exhibits superior explainability and leads to state-of-the-art prediction accuracy.
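To make the idea of a piece-wise linear attribute contribution concrete, here is a minimal sketch. It is not the XOFM model: the breakpoints, the contribution values, and the `age` attribute are made-up illustrations, and clamping outside the breakpoint range is an assumption.

```python
from bisect import bisect_right


def piecewise_linear(x, breakpoints, values):
    """Evaluate a piece-wise linear function given (breakpoint, value) pairs.

    breakpoints must be strictly increasing; x outside the range is clamped
    to the first or last value (an assumption made for this sketch).
    """
    if x <= breakpoints[0]:
        return values[0]
    if x >= breakpoints[-1]:
        return values[-1]
    j = bisect_right(breakpoints, x)  # breakpoints[j - 1] <= x < breakpoints[j]
    x0, x1 = breakpoints[j - 1], breakpoints[j]
    y0, y1 = values[j - 1], values[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical contribution of an "age" attribute at three breakpoints.
age_breaks = [20.0, 40.0, 60.0]
age_contrib = [0.0, 1.0, 0.5]

for age in (10, 30, 50, 70):
    print(age, piecewise_linear(age, age_breaks, age_contrib))
```

Unlike a single constant coefficient, the slope here differs between segments, so the same attribute can raise the score in one value range and lower it in another.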
Yoshiki Toba, Satoshi Yamada, Yoshihiro Ueda, Claudio Ricci, Yuichi Terashima, Tohru Nagao, Wei-Hao Wang, Atsushi Tanimoto et al.
We report the discovery of a Compton-thick (CT) dust-obscured galaxy (DOG) at
$z$ = 0.89, WISE J082501.48+300257.2 (WISE0825+3002), observed by Nuclear
Spectroscopic Telescope Array (NuSTAR). X-ray analysis with the XCLUMPY model
revealed that hard X-ray luminosity in the rest-frame 2-10 keV band of
WISE0825+3002 is $L_{\rm X}$ (2-10 keV) = $4.2^{+2.8}_{-1.6} \times 10^{44}$
erg s$^{-1}$ while its hydrogen column density is $N_{\rm H}$ =
$1.0^{+0.8}_{-0.4} \times 10^{24}$ cm$^{-2}$, indicating that WISE0825+3002 is
a mildly CT active galactic nucleus (AGN). We performed the spectral energy
distribution (SED) fitting with CIGALE to derive its stellar mass, star
formation rate, and infrared luminosity. The estimated Eddington ratio based on
stellar mass and integration of the best-fit SED of AGN component is
$\lambda_{\rm Edd}$ = 0.70, which suggests that WISE0825+3002 harbors an
actively growing black hole behind a large amount of gas and dust. We found
that the relationship between luminosity ratio of X-ray and 6 $\mu$m, and
Eddington ratio follows an empirical relation for AGNs reported by Toba et al.
(2019a).
Authors' comments: 10 pages, 7 figures, and 2 tables, accepted for publication in ApJ
Yisheng He, Wei Sun, Haibin Huang, Jianran Liu, Haoqiang Fan, Jian Sun
In this work, we present a novel data-driven method for robust 6DoF object
pose estimation from a single RGBD image. Unlike previous methods that directly
regress pose parameters, we tackle this challenging task with a
keypoint-based approach. Specifically, we propose a deep Hough voting network
to detect 3D keypoints of objects and then estimate the 6D pose parameters
in a least-squares fitting manner. Our method is a natural extension of
2D-keypoint approaches that successfully work on RGB based 6DoF estimation. It
allows us to fully utilize the geometric constraint of rigid objects with the
extra depth information and is easy for a network to learn and optimize.
Extensive experiments were conducted to demonstrate the effectiveness of
3D-keypoint detection in the 6D pose estimation task. Experimental results also
show our method outperforms the state-of-the-art methods by large margins on
several benchmarks. Code and video are available at
https://github.com/ethnhe/PVN3D.git.
Authors' comments: Accepted to Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, 2020. (CVPR 2020)
Anis Elgabli, Jihong Park, Sabbir Ahmed, Mehdi Bennis
This article proposes a communication-efficient decentralized deep learning
algorithm, coined layer-wise federated group ADMM (L-FGADMM). To minimize an
empirical risk, every worker in L-FGADMM periodically communicates with two
neighbors, in which the periods are separately adjusted for different layers of
its deep neural network. A constrained optimization problem for this setting is
formulated and solved using the stochastic version of GADMM proposed in our
prior work. Numerical evaluations show that by less frequently exchanging the
largest layer, L-FGADMM can significantly reduce the communication cost,
without compromising the convergence speed. Surprisingly, despite less
exchanged information and decentralized operations, intermittently skipping the
largest layer consensus in L-FGADMM creates a regularizing effect, thereby
achieving the test accuracy as high as federated learning (FL), a baseline
method with the entire layer consensus by the aid of a central entity.
Authors' comments: 6 pages; 4 figures; presented at IEEE WCNC'2020
Haoming Jiang, Chen Liang, Chong Wang, Tuo Zhao
Many multi-domain neural machine translation (NMT) models achieve knowledge transfer by enforcing one encoder to learn shared embedding across domains. However, this design lacks adaptation to individual domains. To overcome this limitation, we propose a novel multi-domain NMT model using individual modules for each domain, on which we apply word-level, adaptive and layer-wise domain mixing. We first observe that words in a sentence are often related to multiple domains. Hence, we assume each word has a domain proportion, which indicates its domain preference. Then word representations are obtained by mixing their embedding in individual domains based on their domain proportions. We show this can be achieved by carefully designing multi-head dot-product attention modules for different domains, and eventually taking weighted averages of their parameters by word-level layer-wise domain proportions. Through this, we can achieve effective domain knowledge sharing, and capture fine-grained domain-specific knowledge as well. Our experiments show that our proposed model outperforms existing ones in several NMT tasks.
Ahmed Ben Saad, Youssef Tamaazousti, Josselin Kherroubi, Alexis He
We tackle the problem of texture inpainting where the input images are textures with missing values along with masks that indicate the zones that should be generated. Much work has been done in image inpainting with the aim of achieving global and local consistency, but these methods still suffer from limitations when dealing with textures. In fact, the local information in the image to be completed needs to be used in order to achieve local continuity and visually realistic texture inpainting. For this, we propose a new segmentor discriminator that performs a patch-wise real/fake classification and is supervised by input masks. During training, it aims to locate the fake regions and thus backpropagates a consistent signal to the generator. We tested our approach on the publicly available DTD dataset and showed that it achieves state-of-the-art performance and deals better with local consistency than existing methods.
Dino Ienco, Roberto Interdonato, Raffaele Gaetano
Recurrent Neural Networks (RNNs) can be seriously impacted by the initial parameters assignment, which may result in poor generalization performances on new unseen data. With the objective to tackle this crucial issue, in the context of RNN based classification, we propose a new supervised layer-wise pretraining strategy to initialize network parameters. The proposed approach leverages a data-aware strategy that sets up a taxonomy of classification problems automatically derived by the model behavior. To the best of our knowledge, despite the great interest in RNN-based classification, this is the first data-aware strategy dealing with the initialization of such models. The proposed strategy has been tested on four benchmarks coming from two different domains, i.e., Speech Recognition and Remote Sensing. Results underline the significance of our approach and point out that data-aware strategies positively support the initialization of Recurrent Neural Network based classification models.
Zhirui Chen, Jianheng Li, Wei-Shi Zheng
The scalability problem caused by the difficulty in annotating Person Re-identification (Re-ID) datasets has become a crucial bottleneck in the development of Re-ID. To address this problem, many unsupervised Re-ID methods have recently been proposed. Nevertheless, most of these models require transfer from another auxiliary fully supervised dataset, which is still expensive to obtain. In this work, we propose a Re-ID model based on Weakly Supervised Tracklets (WST) data from various camera views, which can be inexpensively acquired by combining the fragmented tracklets of the same person in the same camera view over a period of time. We formulate our weakly supervised tracklets Re-ID model by a novel method, named deep feature-wise mutual learning (DFML), which consists of Mutual Learning on Feature Extractors (MLFE) and Mutual Learning on Feature Classifiers (MLFC). We propose MLFE by leveraging two feature extractors to learn from each other to extract more robust and discriminative features. On the other hand, we propose MLFC by adapting discriminative features from various camera views to each classifier. Extensive experiments demonstrate the superiority of our proposed DFML over the state-of-the-art unsupervised models and even some supervised models on three Re-ID benchmark datasets.
Adam K. Leroy, Karin M. Sandstrom, Dustin Lang, Alexia Lewis, Samir Salim, Erica A. Behrens, Jérémy Chastenet, I-Da Chiang et al.
We present an atlas of ultraviolet and infrared images of ~15,750 local (d <
50 Mpc) galaxies, as observed by NASA's WISE and GALEX missions. These maps
have matched resolution (FWHM 7.5'' and 15''), matched astrometry, and a common
procedure for background removal. We demonstrate that they agree well with
resolved intensity measurements and integrated photometry from previous
surveys. This atlas represents the first part of a program (the z=0
Multi-wavelength Galaxy Synthesis) to create a large, uniform database of
resolved measurements of gas and dust in nearby galaxies. The images and
associated catalogs are publicly available at the NASA/IPAC Infrared Science
Archive. This atlas allows us to estimate local and integrated star formation
rates (SFRs) and stellar masses (M$_\star$) across the local galaxy population
in a uniform way. In the appendix, we use the population synthesis fits of
Salim et al. (2016, 2018) to calibrate integrated M$_\star$ and SFR estimators
based on GALEX and WISE. Because they leverage an SDSS-based training set of
>100,000 galaxies, these calibrations have high precision and allow us to
rigorously compare local galaxies to Sloan Digital Sky Survey results. We
provide these SFR and M$_\star$ estimates for all galaxies in our sample and
show that our results yield a "main sequence" of star forming galaxies
comparable to previous work. We also show the distribution of intensities from
resolved galaxies in NUV-to-WISE1 vs. WISE1-to-WISE3 space, which captures much
of the key physics accessed by these bands.
Authors' comments: 46 pages, 27 figures, published in ApJS
(https://ui.adsabs.harvard.edu/abs/2019ApJS..244...24L/abstract ). See that
version for full resolution figures and machine readable tables. Go download
data for your favorite nearby galaxy here:
https://irsa.ipac.caltech.edu/data/WISE/z0MGS/overview.html . The appendix
presents detailed analysis of translations to physical quantities
J. Chae, S. -N. Hong
We propose a novel greedy algorithm for the support recovery of a sparse signal from a small number of noisy measurements. In the proposed method, a new support index is identified in each iteration based on bit-wise maximum a posteriori (B-MAP) detection. This is optimal in the sense of detecting one of the remaining support indices, provided that all the indices detected in the previous iterations are correct. Despite its optimality, computing the maximization metric (i.e., the a posteriori probability of each remaining support index) is expensive due to the marginalization over a high-dimensional sparse vector. We address this problem by presenting a good proxy (named the B-MAP proxy) for the maximization metric, which is accurate enough to find the maximum index even though it is not an exact probability. Moreover, it is easily evaluated using only vector correlations, as in orthogonal matching pursuit (OMP), but uses completely different proxy matrices for the maximization. We demonstrate that the proposed B-MAP detection provides a significant gain over existing methods such as OMP and MAP-OMP at the same complexity. Subsequently, we construct advanced greedy algorithms based on the B-MAP proxy by leveraging the ideas of compressive sampling matching pursuit (CoSaMP) and subspace pursuit (SP). Via simulations, we show that the proposed method also outperforms OMP and MAP-OMP under the frameworks of the advanced greedy algorithms.
Mina Basirat, Peter M. Roth
Deep neural networks have paved the way for significant improvements in image
visual categorization in recent years. However, even though the tasks are
highly varying, differing in complexity and difficulty, existing solutions
mostly build on the same architectural decisions. This also applies to the
selection of activation functions (AFs), where most approaches build on
Rectified Linear Units (ReLUs). In this paper, however, we show that the choice
of a proper AF has a significant impact on the classification accuracy, in
particular, if fine, subtle details are of relevance. Therefore, we propose to
model the degree of absence and the presence of features via the AF by using
piece-wise linear functions, which we refer to as L*ReLU. In this way, we can
ensure the required properties, while still inheriting the benefits in terms of
computational efficiency from ReLUs. We demonstrate our approach for the task
of Fine-grained Visual Categorization (FGVC), running experiments on seven
different benchmark datasets. The results demonstrate not only superior
performance but also that different AFs are selected for different tasks with
different characteristics.
Authors' comments: Accepted: Winter Conference on Applications of Computer Vision (WACV)
2020
T. H. Jarrett, M. E. Cluver, M. J. I. Brown, D. A. Dale, C. W. Tsai, F. Masci
We present mid-infrared photometry and measured global properties of the 100
largest galaxies in the sky, including the Magellanic Clouds, Local Group
galaxies M31 and M33, the Fornax and Virgo Galaxy Cluster giants, and many of
the most spectacular Messier objects (e.g., M51 and M83). This is the first
release of a larger catalog of extended sources as imaged in the mid-infrared,
called the WISE Extended Source Catalogue (WXSC). In this study we measure
their global attributes, including integrated flux, surface brightness and
radial distribution. The largest of the large are the LMC, SMC and the
Andromeda Galaxy, which are also the brightest mid-infrared galaxies in the
sky. We interrogate the large galaxies using WISE colors, which serve as
proxies for four general types of galaxies: bulge-dominated spheroidals,
intermediate semi-quiescent disks, star-forming spirals, and AGN-dominated. The
colors reveal a tight "sequence" that spans 5 magnitudes in W2-W3 color,
ranging from early to late-types, and low to high star-forming activity; we fit
the functional form given by: ${\rm (W1-W2)} = [0.015 \times {\rm e}^{
\frac{{\rm (W2-W3)}}{1.38} }] - 0.08$. Departures from this sequence may reveal
nuclear, starburst, and merging events. Physical properties and luminosity
attributes are computed, notably the diameter, aggregate stellar mass and the
dust-obscured star formation activity. We introduce the 'pinwheel' diagram
which depicts physical properties with respect to the median value observed for
WISE galaxies in the local universe. Utilized with the WXSC, this diagram will
delineate between different kinds of galaxies, identifying those with similar
star formation and structural properties. Finally, we present the mid-infrared
photometry of the 25 brightest globular clusters in the sky, including Omega
Centauri, 47 Tucanae and a number of famed night-sky targets (e.g. M 13).
(Abridged)
Authors' comments: 45 pages, 25 figures, 6 tables. Accepted for publication in ApJS.
High quality graphics, tables and ancillary material are available at the
following URL: https://vislab.idia.ac.za/research
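The color sequence quoted in the abstract, (W1-W2) = 0.015 e^{(W2-W3)/1.38} - 0.08, can be evaluated directly. The sketch below only transcribes that fitted form; the function names and the idea of reporting an offset from the sequence are illustrative additions, and the sample colors are made-up values, not catalog data.

```python
import math


def wise_color_sequence(w2_w3):
    """Expected W1-W2 color (mag) for a given W2-W3 color (mag),
    using the functional form quoted in the abstract."""
    return 0.015 * math.exp(w2_w3 / 1.38) - 0.08


def sequence_offset(w1_w2, w2_w3):
    """Observed W1-W2 minus the sequence value (hypothetical helper)."""
    return w1_w2 - wise_color_sequence(w2_w3)


# Made-up example colors in magnitudes.
for w2_w3 in (0.0, 2.0, 4.0):
    print(w2_w3, round(wise_color_sequence(w2_w3), 3))
```

A large positive or negative offset from the curve is what the abstract describes as a departure from the sequence.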
Yihui He, Jianing Qian, Jianren Wang, Cindy X. Le, Congrui Hetang, Qi Lyu, Wenping Wang, Tianwei Yue
Very deep convolutional neural networks (CNNs) have been firmly established as the primary methods for many computer vision tasks. However, most state-of-the-art CNNs are large, which results in high inference latency. Recently, depth-wise separable convolution has been proposed for image recognition tasks on computationally limited platforms such as robotics and self-driving cars. Though it is much faster than its counterpart, regular convolution, accuracy is sacrificed. In this paper, we propose a novel decomposition approach based on SVD, namely depth-wise decomposition, for expanding regular convolutions into depth-wise separable convolutions while maintaining high accuracy. We show our approach can be further generalized to the multi-channel and multi-layer cases, based on Generalized Singular Value Decomposition (GSVD) [59]. We conduct thorough experiments with the latest ShuffleNet V2 model [47] on both a randomly synthesized dataset and a large-scale image recognition dataset: ImageNet [10]. Our approach outperforms channel decomposition [73] on all datasets. More importantly, our approach improves the Top-1 accuracy of ShuffleNet V2 by ~2%.
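The speed gap between regular and depth-wise separable convolution mentioned above comes largely from the difference in weight counts, which a short calculation makes visible. This sketch uses the standard parameter formulas (biases ignored); the layer sizes are arbitrary examples, and it does not implement the SVD-based decomposition proposed in the paper.

```python
def conv_params(k, c_in, c_out):
    """Weights in a regular k x k convolution (no bias)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depth-wise convolution followed by a 1 x 1
    point-wise convolution (no bias)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Example layer sizes chosen for illustration only.
regular = conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
print(regular, separable, round(regular / separable, 2))
```

For this example layer the separable form needs roughly eight times fewer weights, which gives a sense of why it is attractive on constrained hardware and why some accuracy can be lost.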