Kenneth W. Shum, Hanxu Hou
A novel implementation of a special class of Galois rings, in which
multiplication can be realized by a cyclic convolution, is applied to the
construction of network codes. The primitive operations involved are byte-wise
shifts and integer additions modulo a power of 2, both of which can be executed
efficiently on microprocessors. An illustration of how to apply this idea to
array codes is given at the end of the paper.
Authors' comments: Accepted for presentation at ISIT 2020
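To make the "multiplication as cyclic convolution" idea concrete, here is a minimal sketch of polynomial arithmetic in Z_m[x]/(x^n - 1) with m = 2^8. It is not the authors' Galois ring construction, which the abstract does not specify. The function names and the choice of modulus 256 are assumptions made for illustration only.

```python
def cyclic_convolution_mod(a, b, modulus=256):
    """Cyclic convolution of two equal-length integer sequences, reduced mod `modulus`.

    Treats a and b as coefficient lists of polynomials in Z_modulus[x]/(x^n - 1).
    The default modulus 256 (one byte) is an illustrative assumption.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            # x^i * x^j wraps around to x^((i + j) mod n)
            k = (i + j) % n
            out[k] = (out[k] + ai * bj) % modulus
    return out


def cyclic_shift(a, s):
    """Multiply by x^s: rotate the coefficient list by s positions."""
    n = len(a)
    return [a[(i - s) % n] for i in range(n)]


# Multiplying by the monomial x is just a cyclic shift of the coefficients.
print(cyclic_convolution_mod([1, 2, 3], [0, 1, 0]))
print(cyclic_shift([1, 2, 3], 1))
```

When one factor is a monomial, the product reduces to a shift, and sums of such products need only shifts and additions modulo 256, which matches the primitive operations named in the abstract.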
C. A. Theissen, D. C. Bardalez Gagliuffi, J. K. Faherty, J. Gagne, A. J. Burgasser
We present a parallax solution for WISE J135501.90-825838.9, a spectral
binary with spectral types L7+T7.5 and a candidate AB Doradus member. Using
$WISE$ astrometry, we obtain a distance of $d = 16.7\pm5.3$ pc. This
preliminary parallax solution provides further evidence that WISE
J135501.90-825838.9 is a member of AB Doradus (130-200 Myr) and, when combined
with evolutionary models, predicts masses of 11 $M_\mathrm{Jup}$ and 9
$M_\mathrm{Jup}$ for the two components.
Authors' comments: Submitted to RNAAS
Yin Tang, Qi Teng, Lei Zhang, Fuhong Min, Jun He
Recently, convolutional neural networks (CNNs) have set the latest
state of the art on various human activity recognition (HAR) datasets. However,
deep CNNs often require substantial computing resources, which limits their
application in embedded HAR. Although many successful methods have been
proposed to reduce the memory and FLOPs of CNNs, they often involve special
network architectures designed for visual tasks, which are not suitable for deep
HAR tasks with time series sensor signals because of the remarkable discrepancy
between the two settings. Therefore, it is necessary to develop lightweight deep
models to perform HAR. As the filter is the basic unit in constructing CNNs, it
is worth investigating whether re-designing smaller filters is applicable to
deep HAR. In this paper, inspired by this idea, we propose a lightweight CNN
using Lego filters for HAR. A set of lower-dimensional filters is used as Lego
bricks to be stacked to form conventional filters, which does not rely on any
special network structure. A local loss function is used to train the model. To
our knowledge, this is the first paper that proposes a lightweight CNN for HAR
in the ubiquitous and wearable computing arena. Experimental results on five
public HAR datasets (UCI-HAR, OPPORTUNITY, UNIMIB-SHAR, PAMAP2, and WISDM),
collected from either smartphones or multiple sensor nodes, indicate that our
Lego CNN with local loss can greatly reduce memory and computation cost over
conventional CNNs, while achieving higher accuracy. That is to say, the
proposed model is smaller, faster and more accurate. Finally, we evaluate the
actual performance on an Android smartphone.
Authors' comments: 11 pages, 11 figures
Alessandro Ilic Mezza, Emanuël A. P. Habets, Meinard Müller, Augusto Sarti
The performance of machine learning algorithms is known to be negatively
affected by possible mismatches between training (source) and test (target)
data distributions. In fact, this problem emerges whenever an acoustic scene
classification system which has been trained on data recorded by a given device
is applied to samples acquired under different acoustic conditions or captured
by mismatched recording devices. To address this issue, we propose an
unsupervised domain adaptation method that consists of aligning the first- and
second-order sample statistics of each frequency band of target-domain acoustic
scenes to those of the source-domain training dataset. This model-agnostic
approach is devised to adapt audio samples from unseen devices before they are
fed to a pre-trained classifier, thus avoiding any further learning phase.
Using the DCASE 2018 Task 1-B development dataset, we show that the proposed
method outperforms the state-of-the-art unsupervised methods found in the
literature in terms of both source- and target-domain classification accuracy.
Authors' comments: 5 pages, 1 figure, 3 tables, submitted to EUSIPCO 2020
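The band-wise alignment described above can be sketched as follows. This is one possible reading, assuming that "first- and second-order sample statistics" means the per-band mean and standard deviation; the paper's exact procedure may differ. The function name, the array layout (samples, bands, frames) and the `eps` stabilizer are illustrative assumptions.

```python
import numpy as np


def align_band_statistics(target, source_mean, source_std, eps=1e-8):
    """Match per-band mean and standard deviation of target data to source statistics.

    target:      array of shape (n_samples, n_bands, n_frames), e.g. log-mel spectrograms
    source_mean: per-band mean of the source-domain training set, shape (n_bands,)
    source_std:  per-band standard deviation of the source-domain set, shape (n_bands,)
    """
    target = np.asarray(target, dtype=float)
    # Per-band statistics of the target domain, pooled over samples and time frames.
    t_mean = target.mean(axis=(0, 2), keepdims=True)
    t_std = target.std(axis=(0, 2), keepdims=True)
    s_mean = np.asarray(source_mean, dtype=float).reshape(1, -1, 1)
    s_std = np.asarray(source_std, dtype=float).reshape(1, -1, 1)
    # Standardize each band, then rescale and shift to the source statistics.
    return (target - t_mean) / (t_std + eps) * s_std + s_mean
```

Because the transformation acts only on the input features, the aligned samples can be passed to an already trained classifier without any retraining, which is the model-agnostic property the abstract emphasizes.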
Tengteng Zhang, Yiqin Yu, Jing Mei, Zefang Tang, Xiang Zhang, Shaochun Li
The PICO framework (Population, Intervention, Comparison, and Outcome) is
usually used to formulate evidence in the medical domain. The major task of
PICO extraction is to extract sentences from medical literature and classify
them into each class. However, in most cases, an extracted sentence contains
more than one piece of evidence even after it has been assigned to a certain
class. To address this problem, we propose a step-wise disease Named Entity
Recognition (DNER) extraction and PICO identification method. With our method,
sentences in the paper title and abstract are first classified into different
PICO classes, and medical entities are then identified and classified into P
and O. Different kinds of deep learning frameworks are used, and experimental
results show that our method achieves high performance and fine-grained
extraction results compared with conventional PICO extraction work.
Authors' comments: 9 pages, 3 figures
Haoxuan Jiang, Jianghui Ji, Liangliang Yu
In this work, we investigate the size, thermal inertia, surface roughness and
geometric albedo of 10 Vesta family asteroids by using the Advanced
Thermophysical Model (ATPM), based on thermal infrared data acquired mainly by
NASA's Wide-field Infrared Survey Explorer (WISE). Here we show that the
average thermal inertia and geometric albedo of the investigated Vesta family
members are 42 $\rm J m^{-2} s^{-1/2} K^{-1}$ and 0.314, respectively, where
the derived effective diameters are less than 10 km. Moreover, the family
members have a relatively low roughness fraction on their surfaces. The
similarity in thermal inertia and geometric albedo among the V-type Vesta
family members may reveal their close connection in origin and evolution. As
fragments of the cratering event on Vesta, the family members may have
undergone a similar evolutionary process, thereby leading to very similar
thermal properties. Finally, we estimate their regolith grain sizes with different
volume filling factors.
Authors' comments: 29 pages, 40 figures, accepted for publication in AJ
Yuning You, Tianlong Chen, Zhangyang Wang, Yang Shen
Graph convolution networks (GCN) are increasingly popular in many
applications, yet remain notoriously hard to train over large graph datasets.
They need to compute node representations recursively from their neighbors.
Current GCN training algorithms suffer from either high computational costs
that grow exponentially with the number of layers, or high memory usage for
loading the entire graph and node embeddings. In this paper, we propose a novel
efficient layer-wise training framework for GCN (L-GCN), that disentangles
feature aggregation and feature transformation during training, hence greatly
reducing time and memory complexities. We present a theoretical analysis of
L-GCN under the graph isomorphism framework, showing that, under mild
conditions, L-GCN leads to GCNs as powerful as those obtained with the more
costly conventional training algorithm. We further propose L$^2$-GCN, which
learns a controller for each layer that can automatically adjust the training
epochs per layer in L-GCN. Experiments show that L-GCN is faster than
state-of-the-art methods by at least an order of magnitude, with consistent
memory usage that does not depend on dataset size, while maintaining comparable
prediction performance. With the learned controller, L$^2$-GCN can further cut
the training time in half. Our code is available at
https://github.com/Shen-Lab/L2-GCN.
Authors' comments: Supplementary materials are available at
https://yyou1996.github.io/files/cvpr2020_l2gcn_supplement.pdf. CVPR 2020
Alireza M. Javid, Arun Venkitaraman, Mikael Skoglund, Saikat Chatterjee
We design a ReLU-based multilayer neural network by mapping the feature
vectors to a higher dimensional space in every layer. We design the weight
matrices in every layer to ensure a reduction of the training cost as the
number of layers increases. Linear projection to the target in the higher
dimensional space leads to a lower training cost if a convex cost is minimized.
An $\ell_2$-norm convex constraint is used in the minimization to reduce the
generalization error and avoid overfitting. The regularization hyperparameters
of the network are derived analytically to guarantee a monotonic decrease of
the training cost, which eliminates the need for cross-validation
to find the regularization hyperparameter in each layer. We show that the
proposed architecture is norm-preserving and provides an invertible feature
vector, and therefore, can be used to reduce the training cost of any other
learning method which employs linear projection to estimate the target.
Authors' comments: 2020 EURASIP Journal on Advances in Signal Processing
Jie Chen, Zhiheng Li, Jiebo Luo, Chenliang Xu
We address weakly-supervised video actor-action segmentation (VAAS), which
extends general video object segmentation (VOS) to additionally consider action
labels of the actors. The most successful methods on VOS synthesize a pool of
pseudo-annotations (PAs) and then refine them iteratively. However, they face
challenges as to how to select high-quality PAs from a massive pool,
how to set an appropriate stopping condition for weakly-supervised training, and
how to initialize PAs pertaining to VAAS. To overcome these challenges, we
propose a general Weakly-Supervised framework with a Wise Selection of training
samples and model evaluation criterion (WS^2). Instead of blindly trusting
quality-inconsistent PAs, WS^2 employs a learning-based selection to select
effective PAs and a novel region integrity criterion as a stopping condition
for weakly-supervised training. In addition, a 3D-Conv GCAM is devised to adapt
to the VAAS task. Extensive experiments show that WS^2 achieves
state-of-the-art performance on both weakly-supervised VOS and VAAS tasks and
is on par with the best fully-supervised method on VAAS.
Authors' comments: 11 pages, 8 figures, CVPR 2020. Supplementary video:
https://youtu.be/CX1hEOV9tlo
Mohammad Etemad, Zahra Etemad, Amilcar Soares, Vania Bogorny, Stan Matwin, Luis Torgo
Large amounts of mobility data are being generated from many different sources, and several data mining methods have been proposed for this data. One of the most critical steps in trajectory data mining is segmentation. This task can be seen as a pre-processing step in which a trajectory is divided into several meaningful consecutive sub-sequences. This process is necessary because trajectory patterns may hold not over the entire trajectory but only over parts of it. In this work, we propose a supervised trajectory segmentation algorithm called Wise Sliding Window Segmentation (WS-II). It processes the trajectory coordinates to find behavioral changes in space and time, generating an error signal that is then used to train a binary classifier for segmenting trajectory data. This algorithm is flexible and can be used in different domains. We evaluate our method on three real datasets from different domains (meteorology, fishing, and individuals' movements) and compare it with four other trajectory segmentation algorithms: OWS, GRASP-UTS, CB-SMoT, and SPD. We observe that the proposed algorithm achieves the highest performance on all datasets, with statistically significant differences in terms of the harmonic mean of purity and coverage.
Giulia Bassignana, Jennifer Fransson, Vincent Henry, Olivier Colliot, Violetta Zujovic, Fabrizio De Vico Fallani
Identifying the nodes that have the potential to influence the state of a network is a relevant question for many complex systems. In many applications it is essential to test the ability of an individual node to control a specific target subset of the network. In biological networks, this might provide valuable information on how single genes regulate the expression of specific groups of molecules in the cell. Taking these constraints into account, we propose an optimized heuristic based on the Kalman rank condition to quantify the centrality of a node as the number of target nodes it can control. By introducing a hierarchy among the nodes in the target set and performing a step-wise search, we ensure, for sparse and directed networks, the identification of a controllable driver-target configuration with significantly reduced space and time complexity. We show how the method works for simple network configurations, then we use it to characterize the inflammatory pathways in molecular gene networks associated with macrophage dysfunction in patients with multiple sclerosis. Results indicate that the targeted secreted molecules can in general be controlled by a large number of driver nodes (51%) involved in different cell functions, i.e. sensing, signaling and transcription. However, during the inflammatory response only a moderate fraction of all the possible driver-target pairs are significantly coactivated, as measured by gene expression data obtained from human blood samples. Notably, they differ between multiple sclerosis patients and healthy controls, and we find that this is related to the presence of dysregulated genes along the controllable walks. Our method, which we name step-wise target controllability, represents a practical solution to identify controllable driver-target configurations in directed complex networks and test their relevance from a functional perspective.
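As a point of reference for the Kalman rank condition mentioned above, the following minimal sketch computes the rank of the controllability matrix restricted to a target set, for a single driver node in linear dynamics x(t+1) = A x(t) + B u(t). It is the plain rank test, not the authors' step-wise heuristic. The function names and the convention that A[i, j] is nonzero when node j influences node i are assumptions.

```python
import numpy as np


def kalman_matrix(A, driver):
    """Controllability matrix [B, AB, ..., A^(n-1) B] for a single driver node."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    b = np.zeros((n, 1))
    b[driver, 0] = 1.0  # the input signal acts on the driver node only
    blocks = [b]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)


def target_rank(A, driver, targets):
    """Rank of the controllability matrix restricted to the rows of the target nodes.

    The target set is controllable from the driver when this rank equals len(targets).
    """
    C = kalman_matrix(A, driver)
    return int(np.linalg.matrix_rank(C[list(targets), :]))


# Directed chain 0 -> 1 -> 2 (A[i, j] = 1 means node j influences node i).
chain = np.array([[0, 0, 0],
                  [1, 0, 0],
                  [0, 1, 0]])
print(target_rank(chain, driver=0, targets=[1, 2]))  # head of the chain as driver
print(target_rank(chain, driver=2, targets=[0, 1]))  # tail of the chain as driver
```

Building the full matrix costs memory and time that grow quickly with network size, which is the kind of overhead a step-wise search over a hierarchy of targets is meant to reduce.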
Shuxin Wang, Shilei Cao, Dong Wei, Renzhen Wang, Kai Ma, Liansheng Wang, Deyu Meng, Yefeng Zheng
We introduce a one-shot segmentation method to alleviate the burden of manual
annotation for medical images. The main idea is to treat one-shot segmentation
as a classical atlas-based segmentation problem, where voxel-wise
correspondence from the atlas to the unlabelled data is learned. Subsequently,
the segmentation label of the atlas can be transferred to the unlabelled data
with the learned correspondence. However, since ground truth correspondence between
images is usually unavailable, the learning system must be well-supervised to
avoid mode collapse and convergence failure. To overcome this difficulty, we
resort to the forward-backward consistency, which is widely used in
correspondence problems, and additionally learn the backward correspondences
from the warped atlases back to the original atlas. This cycle-correspondence
learning design enables a variety of extra, cycle-consistency-based supervision
signals that make the training process stable while also boosting performance.
We demonstrate the superiority of our method over both deep learning-based
one-shot segmentation methods and a classical multi-atlas segmentation method
via thorough experiments.
Authors' comments: Accepted to Proc. IEEE Conf. Computer Vision and Pattern Recognition
2020
Krzysztof Debicki, Lanpeng Ji, Tomasz Rolski
We derive the exact asymptotics of \[ P\left( \sup_{t\ge 0} \Bigl( X_1(t) - \mu_1 t\Bigr)> u, \ \sup_{s\ge 0} \Bigl( X_2(s) - \mu_2 s\Bigr)> u \right), \ \ u\to\infty, \] where $(X_1(t),X_2(s))_{t,s\ge0}$ is a correlated two-dimensional Brownian motion with correlation $\rho\in[-1,1]$ and $\mu_1,\mu_2>0$. It appears that the interplay between $\rho$ and $\mu_1,\mu_2$ leads to several types of asymptotics. Although the exponent in the asymptotics is continuous as a function of $\rho$, one can observe different types of prefactor functions depending on the range of $\rho$, which constitutes a phase-transition-type phenomenon.
Sana Tonekaboni, Shalmali Joshi, Kieran Campbell, David Duvenaud, Anna Goldenberg
Explanations of time series models are useful for high-stakes applications like healthcare but have received little attention in the machine learning literature. We propose FIT, a framework that evaluates the importance of observations for a multivariate time-series black-box model by quantifying the shift in the predictive distribution over time. FIT defines the importance of an observation based on its contribution to the distributional shift under a KL-divergence that contrasts the predictive distribution against a counterfactual where the rest of the features are unobserved. We also demonstrate the need to control for time-dependent distribution shifts. We compare with state-of-the-art baselines on simulated and real-world clinical data and demonstrate that our approach is superior in identifying important time points and observations throughout the time series.
Marcus C. Christiansen, Christian Furrer
In the presence of monotone information, the stochastic Thiele equation describing the dynamics of state-wise prospective reserves is closely related to the classic martingale representation theorem. When the information utilized by the insurer is non-monotone, the classic martingale theory does not apply. By taking an infinitesimal approach, we derive a generalized stochastic Thiele equation that allows for information discarding. En passant, we solve some open problems for the classic case of monotone information. The results and their practical implications are illustrated via examples where information is discarded upon and after stochastic retirement.
C. Lazzoni, R. Gratton, J. M. Alcalà, S. Desidera, A. Frasca, C. F. Manara, D. Mesa, E. Rigliaco et al.
Very recently, a second companion on a wider orbit has been discovered around
GQ Lup. This is a low-mass accreting star partially obscured by a disk seen at
high inclination. If detected, this disk may be compared to the known disk
around the primary. We detected this disk in archival HST and WISE data. The
extended spectral energy distribution provided by these data confirms the
presence of accretion from Halpha emission and UV excess, and shows an IR
excess attributable to a warm disk. In addition, we resolved the disk on the
HST images. This is found to be roughly aligned with the disk of the primary.
Both of them are roughly aligned with the Lupus I dust filament containing GQ
Lup.
Authors' comments: 5 pages, 4 figures
Hang Zhang, Jinwei Zhang, Qihao Zhang, Jeremy Kim, Shun Zhang, Susan A. Gauthier, Pascal Spincemaille, Thanh D. Nguyen et al.
Brain lesion volume measured on T2-weighted MRI images is a clinically
important disease marker in multiple sclerosis (MS). Manual delineation of MS
lesions is a time-consuming and highly operator-dependent task, which is
influenced by lesion size, shape and conspicuity. Recently, automated lesion
segmentation algorithms based on deep neural networks have been developed with
promising results. In this paper, we propose a novel recurrent slice-wise
attention network (RSANet), which models 3D MRI images as sequences of slices
and captures long-range dependencies in a recurrent manner to utilize
contextual information of MS lesions. Experiments on a dataset with 43 patients
show that the proposed method outperforms the state-of-the-art approaches. Our
implementation is available online at https://github.com/tinymilky/RSANet.
Authors' comments: Accepted for publication in MICCAI 2019
Lei Huang, Jie Qin, Li Liu, Fan Zhu, Ling Shao
Conditioning analysis uncovers the landscape of an optimization objective by
exploring the spectrum of its curvature matrix. This has been well explored
theoretically for linear models. We extend this analysis to deep neural
networks (DNNs) in order to investigate their learning dynamics. To this end,
we propose layer-wise conditioning analysis, which explores the optimization
landscape with respect to each layer independently. Such an analysis is
theoretically supported under mild assumptions that approximately hold in
practice. Based on our analysis, we show that batch normalization (BN) can
stabilize training, but sometimes results in the false impression of a local
minimum, which has detrimental effects on learning. Besides, we
experimentally observe that BN can improve the layer-wise conditioning of the
optimization problem. Finally, we find that the last linear layer of a very
deep residual network displays ill-conditioned behavior. We solve this problem
by only adding one BN layer before the last linear layer, which achieves
improved performance over the original and pre-activation residual networks.
Authors' comments: Accepted to ECCV 2020. The code is available at:
https://github.com/huangleiBuaa/LayerwiseCA
Mathilde Guillemot, Catherine Heusele, Rodolphe Korichi, Sylvianne Schnebert, Liming Chen
The lack of transparency of neural networks remains a major obstacle to their use. The Layer-wise Relevance Propagation technique builds heatmaps representing the relevance of each input to the model's decision. The relevance is propagated backward from the last to the first layer of the deep neural network. Layer-wise Relevance Propagation does not handle normalization layers; in this work we suggest a method to include them. Specifically, we build an equivalent network by fusing normalization layers with convolutional or fully connected layers. Heatmaps obtained with our method on the MNIST and CIFAR-10 datasets are more accurate for convolutional layers. Our study also cautions against using Layer-wise Relevance Propagation with networks that combine connected layers and normalization layers.
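The fusion step can be illustrated with the standard folding of a batch normalization layer into a preceding fully connected layer. This sketch assumes inference mode with fixed running statistics and a normalization layer that directly follows the linear layer; the paper's own construction may differ in detail. The function and parameter names are illustrative.

```python
import numpy as np


def fuse_linear_batchnorm(W, b, gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold y = gamma * (W x + b - mean) / sqrt(var + eps) + beta into one linear layer.

    W has shape (out_features, in_features); all other arguments have shape (out_features,).
    Returns (W_fused, b_fused) such that W_fused @ x + b_fused equals the original output.
    """
    scale = gamma / np.sqrt(running_var + eps)   # per-output-unit scaling factor
    W_fused = W * scale[:, None]                 # scale each row of the weight matrix
    b_fused = (b - running_mean) * scale + beta  # shift the bias accordingly
    return W_fused, b_fused
```

After fusion the network contains only linear (or convolutional) layers, so relevance propagation rules defined for those layers apply directly without a separate rule for normalization.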
L. R. Bedin, C. Fontanive
In the second paper of this series we perfected our method of linking high
precision Hubble Space Telescope astrometry to the high-accuracy Gaia DR2
absolute reference system to overcome the limitations of relative astrometry
with narrow-field cameras. Our test case here is the Y brown dwarf WISE
J163940.83-684738.6, observed at different epochs spread over a 6-yr time
baseline with the Infra-Red channel of the Wide Field Camera 3. We derived
significantly improved astrometric parameters compared to previous
determinations, finding: $(\mu_{\alpha}\cos\delta, \mu_{\delta}, \varpi) =
(577.21\pm0.24\ \mathrm{mas\,yr^{-1}}, -3108.39\pm0.27\ \mathrm{mas\,yr^{-1}}, 210.4\pm1.8\ \mathrm{mas})$.
In particular, our derived absolute parallax corresponds to a distance of
$4.75\pm0.05$ pc for the faint ultracool dwarf.
Authors' comments: 10 pages, 3 tables, 3 figures (fig.1 at low resolution). Accepted for
publication in MNRAS on 2020 February 21
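The distance quoted above follows from the usual parallax relation d [pc] = 1 / parallax [arcsec]. A minimal sketch of that conversion is given below, with a hypothetical helper name; it covers only the central value and does not reproduce the authors' uncertainty estimate.

```python
def parallax_mas_to_distance_pc(parallax_mas):
    """Convert an absolute parallax in milliarcseconds to a distance in parsecs."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive for a simple inversion")
    # 1 arcsec of parallax corresponds to 1 pc, and 1 arcsec = 1000 mas.
    return 1000.0 / parallax_mas


# Parallax from the abstract: 210.4 mas, which gives roughly 4.75 pc.
print(round(parallax_mas_to_distance_pc(210.4), 2))
```

Simple inversion is adequate here because the fractional parallax error is below one percent; for noisy parallaxes a direct inverse becomes a biased distance estimator.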