Wenyu Zhang, Qing Ding, Jian Hu, Yi Ma, Mingzhe Lu
Graph convolutional networks (GCNs) are widely used to handle irregular data because they update node features using the structural information of the graph. By iterating GCN layers, high-order information can be obtained to further enhance node representations. However, how to apply GCNs to structured data (such as images) has not been studied in depth. In this paper, we explore the application of graph attention networks (GAT) to image feature extraction. First, we propose a novel graph generation algorithm that converts images into graphs through matrix transformation. It is one order of magnitude faster than the algorithm based on K Nearest Neighbors (KNN). Then, GAT is applied to the generated graph to update the node features, yielding a more robust representation. These two steps are combined into a module called the pixel-wise graph attention module (PGA). Since the graph obtained by our graph generation algorithm can still be transformed back into an image after processing, PGA combines well with CNNs. Based on these two modules, and drawing on ResNet, we design a pixel-wise graph attention network (PGANet). PGANet is applied to person re-identification on the Market1501, DukeMTMC-reID and Occluded-DukeMTMC datasets, where it outperforms the state of the art by 0.8%, 1.1% and 11% in mAP, respectively. Code is available at https://github.com/wenyu1009/PGANet.
Christian Herglotz, Sion Grosche, Akarsh Bharadwaj, André Kaup
This paper presents a novel method to estimate the power consumption of
distinct active components on an electronic carrier board by using thermal
imaging. The components and the board can be made of heterogeneous material
such as plastic, coated microchips, and metal bonds or wires, where a special
coating for high emissivity is not required. The thermal images are recorded
when the components on the board are dissipating power. In order to enable
reliable estimates, a segmentation of the thermal image must be available that
can be obtained by manual labeling, object detection methods, or exploiting
layout information. Evaluations show that with low-resolution consumer infrared
cameras and dissipated powers larger than 300 mW, mean estimation errors of 10%
can be achieved.
Authors' comments: 10 pages, 8 figures
Christian Bender, Steffen Meyer
We introduce and analyze a family of linear least-squares Monte Carlo schemes
for backward SDEs, which interpolate between the one-step dynamic programming
scheme of Lemor, Warin, and Gobet (Bernoulli, 2006) and the multi-step dynamic
programming scheme of Gobet and Turkedjiev (Mathematics of Computation, 2016).
Our algorithm approximates conditional expectations over segments of the time
grid. We discuss the optimal choice of the segment length depending on the
'smoothness' of the problem and show that, in typical situations, the
complexity can be reduced compared to the state-of-the-art multi-step dynamic
programming scheme.
Authors' comments: 35 pages
Javier Salazar Cavazos, Jeffrey A. Fessler, Laura Balzano
Principal component analysis (PCA) is a key tool in the field of data
dimensionality reduction that is useful for various data science problems.
However, many applications involve heterogeneous data that varies in quality
due to noise characteristics associated with different sources of the data.
Methods that handle such mixed-quality data are known as heteroscedastic methods.
Current methods like HePPCAT make Gaussian assumptions about the basis
coefficients that may not hold in practice. Other methods such as Weighted PCA
(WPCA) assume the noise variances are known, which is often not the case in
practice. This paper develops a PCA method that can estimate the sample-wise
noise variances and use this information in the model to improve the estimate
of the subspace basis associated with the low-rank structure of the data. This
is done without distributional assumptions of the low-rank component and
without assuming the noise variances are known. Simulations show the
effectiveness of accounting for such heteroscedasticity in the data and the
benefits of using such a method with all of the data versus retaining only good
data, and compare the method against other PCA methods established in the
literature, such as PCA, Robust PCA (RPCA), and HePPCAT. Code is available at
https://github.com/javiersc1/ALPCAH
Authors' comments: This article has been accepted for publication in the Fourteenth
International Conference on Sampling Theory and Applications, accessible via
IEEE XPlore. See DOI section
Wenting Tang, Xingxing Wei, Bo Li
Structured network pruning is a practical approach to reduce computation cost directly while retaining the CNNs' generalization performance in real applications. However, identifying redundant filters is a core problem in structured network pruning, and current redundancy criteria only focus on individual filters' attributes. When pruning sparsity increases, these redundancy criteria are not effective or efficient enough. Since the filter-wise interaction also contributes to the CNN's prediction accuracy, we integrate the filter-wise interaction into the redundancy criterion. In our criterion, we introduce the filter importance and filter utilization strength to reflect the decision ability of individual and multiple filters. Utilizing this new redundancy criterion, we propose a structured network pruning approach, SNPFI (Structured Network Pruning by measuring Filter-wise Interaction). During pruning, SNPFI can automatically assign the proper sparsity based on the filter utilization strength and eliminate the useless filters by filter importance. After pruning, SNPFI can recover the pruned model's performance effectively without iterative training by minimizing the interaction difference. We empirically demonstrate the effectiveness of SNPFI with several commonly used CNN models, including AlexNet, MobileNetv1, and ResNet-50, on various image classification datasets, including MNIST, CIFAR-10, and ImageNet. For all experimental CNN models, nearly 60% of the computation is removed by network compression while the classification accuracy is maintained.
Runshi Tang, Ming Yuan, Anru R. Zhang
This paper introduces a novel framework called Mode-wise Principal Subspace
Pursuit (MOP-UP) to extract hidden variations in both the row and column
dimensions for matrix data. To enhance the understanding of the framework, we
introduce a class of matrix-variate spiked covariance models that serve as
inspiration for the development of the MOP-UP algorithm. The MOP-UP algorithm
consists of two steps: Average Subspace Capture (ASC) and Alternating
Projection (AP). These steps are specifically designed to capture the row-wise
and column-wise dimension-reduced subspaces which contain the most informative
features of the data. ASC utilizes a novel average projection operator as
initialization and achieves exact recovery in the noiseless setting. We analyze
the convergence and non-asymptotic error bounds of MOP-UP, introducing a
blockwise matrix eigenvalue perturbation bound that proves the desired bound,
where classic perturbation bounds fail. The effectiveness and practical merits
of the proposed framework are demonstrated through experiments on both
simulated and real datasets. Lastly, we discuss generalizations of our approach
to higher-order data.
Authors' comments: Journal of the Royal Statistical Society, Series B, to appear
Xingxing Wei, Shiji Zhao
Adversarial examples have attracted widespread attention in security-critical applications because of their transferability across different models. Although many methods have been proposed to boost adversarial transferability, a gap still exists between capabilities and practical demand. In this paper, we argue that the model-specific discriminative regions are a key factor causing overfitting to the source model, and thus reducing the transferability to the target model. To address this, a patch-wise mask is utilized to prune the model-specific regions when calculating adversarial perturbations. To accurately localize these regions, we present a learnable approach to automatically optimize the mask. Specifically, we simulate the target models in our framework, and adjust the patch-wise mask according to the feedback of the simulated models. To improve the efficiency, the differential evolution (DE) algorithm is utilized to search for patch-wise masks for a specific image. During iterative attacks, the learned masks are applied to the image to drop out the patches related to model-specific regions, thus making the gradients more generic and improving the adversarial transferability. The proposed approach is a preprocessing method and can be integrated with existing methods to further boost the transferability. Extensive experiments on the ImageNet dataset demonstrate the effectiveness of our method. We incorporate the proposed approach with existing methods to perform ensemble attacks and achieve an average success rate of 93.01% against seven advanced defense methods, which can effectively enhance the state-of-the-art transfer-based attack performance.
Kovi Rose, Joshua Pritchard, Tara Murphy, Manisha Caleb, Dougal Dobie, Laura Driessen, Stefan W. Duchesne, David L. Kaplan et al.
We present the detection of rotationally modulated, circularly polarized
radio emission from the T8 brown dwarf WISE J062309.94-045624.6 between 0.9 and
2.0 GHz. We detected this high proper motion ultracool dwarf with the
Australian SKA Pathfinder in $1.36$ GHz imaging data from the Rapid ASKAP
Continuum Survey. We observed WISE J062309.94-045624.6 to have a time and
frequency averaged Stokes I flux density of $4.17\pm0.41$ mJy beam$^{-1}$, with
an absolute circular polarization fraction of $66.3\pm9.0\%$, and calculated a
specific radio luminosity of $L_{\nu}\sim10^{14.8}$ erg s$^{-1}$ Hz$^{-1}$. In
follow-up observations with the Australia Telescope Compact Array and MeerKAT
we identified a multi-peaked pulse structure, used dynamic spectra to place a
lower limit of $B>0.71$ kG on the dwarf's magnetic field, and measured a
$P=1.912\pm0.005$ h periodicity which we concluded to be due to rotational
modulation. The luminosity and period we measured are comparable to those of
other ultracool dwarfs observed at radio wavelengths. This implies that future
megahertz to gigahertz surveys, with increased cadence and improved
sensitivity, are likely to detect similar or later-type dwarfs. Our detection
of WISE J062309.94-045624.6 makes this dwarf the coolest and latest-type star
observed to produce radio emission.
Authors' comments: Accepted for publication in ApJ Letters; 11 pages, 3 figures and 2
tables
Lucile Ter-Minassian, Oscar Clivio, Karla Diaz-Ordaz, Robin J. Evans, Chris Holmes
Predictive black-box models can exhibit high accuracy but their opaque nature hinders their uptake in safety-critical deployment environments. Explanation methods (XAI) can provide confidence for decision-making through increased transparency. However, existing XAI methods are not tailored towards models in sensitive domains where one predictor is of special interest, such as a treatment effect in a clinical model, or ethnicity in policy models. We introduce Path-Wise Shapley effects (PWSHAP), a framework for assessing the targeted effect of a binary (e.g. treatment) variable from a complex outcome model. Our approach augments the predictive model with a user-defined directed acyclic graph (DAG). The method then uses the graph alongside on-manifold Shapley values to identify effects along causal pathways whilst maintaining robustness to adversarial attacks. We establish error bounds for the identified path-wise Shapley effects and for Shapley values. We show PWSHAP can perform local bias and mediation analyses with faithfulness to the model. Further, if the targeted variable is randomised we can quantify local effect modification. We demonstrate the resolution, interpretability, and true locality of our approach on examples and a real-world experiment.
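For readers unfamiliar with the Shapley values that the abstract builds on, the following is a minimal sketch of the classical exact computation for a small cooperative game. It is background only, not the PWSHAP method: it uses no DAG and no on-manifold value function, and the names `shapley_values` and `value` are illustrative assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_players):
    """Exact Shapley values for a set function `value` over players 0..n_players-1.

    `value` maps a frozenset of player indices to a number. The cost is
    exponential in n_players, so this is only practical for small games.
    """
    phi = [0.0] * n_players
    for i in range(n_players):
        others = [j for j in range(n_players) if j != i]
        for size in range(len(others) + 1):
            # Probability weight of a coalition of this size preceding player i.
            weight = (factorial(size) * factorial(n_players - size - 1)
                      / factorial(n_players))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of player i to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Toy game (hypothetical): a payoff of 1 is earned only when players 0 and 1
# cooperate, so they split the credit and player 2 receives none.
print(shapley_values(lambda s: 1.0 if {0, 1} <= s else 0.0, 3))
```

For an additive game the Shapley value of each player equals its own weight, which makes a convenient sanity check.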
Michael Loibl, Leonardo Leonetti, Alessandro Reali, Josef Kiendl
This work presents an efficient quadrature rule for shell analysis fully integrated in CAD by means of Isogeometric Analysis (IGA). General CAD models may consist of trimmed parts such as holes, intersections, cut-offs, etc. Therefore, IGA should be able to deal with these models in order to fulfil its promise of closing the gap between design and analysis. Trimming operations violate the tensor-product structure of the Non-Uniform Rational B-spline (NURBS) basis functions used and of typical quadrature rules. Existing efficient patch-wise quadrature rules consider actual knot vectors and are determined in 1D. They are extended to further dimensions by means of a tensor product. Therefore, they are not directly applicable to trimmed structures. The method proposed herein extends patch-wise quadrature rules to trimmed surfaces. As a result, the number of quadrature points can be significantly reduced. Geometrically linear and non-linear benchmarks of plane, plate and shell structures are investigated. The results are compared to a standard trimming procedure and a good performance is observed.
Seungjin Jung, Seungmo Seo, Yonghyun Jeong, Jongwon Choi
The class-wise training losses often diverge as a result of the various
levels of intra-class and inter-class appearance variation, and we find that
these diverging class-wise training losses lead to poorly calibrated
prediction reliability. To resolve the issue, we propose a new calibration method to
synchronize the class-wise training losses. We design a new training loss to
alleviate the variance of class-wise training losses by using multiple
class-wise scaling factors. Since our framework can compensate the training
losses of overfitted classes with those of under-fitted classes, the integrated
training loss is preserved, preventing the performance drop even after the
model calibration. Furthermore, our method can be easily employed in the
post-hoc calibration methods, allowing us to use the pre-trained model as an
initial model and reduce the additional computation for model calibration. We
validate the proposed framework by employing it in the various post-hoc
calibration methods, which generally improves calibration performance while
preserving accuracy, and discover through the investigation that our approach
performs well with unbalanced datasets and untuned hyperparameters.
Authors' comments: Published at ICML 2023. Camera ready version
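The abstract refers to calibration performance without naming a metric. As a hedged illustration, the sketch below computes the expected calibration error (ECE), a commonly used measure of the gap between confidence and accuracy. The choice of ECE, the equal-width binning, and the function name are assumptions for illustration and are not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean gap between accuracy and mean confidence.

    confidences: predicted top-class probabilities in [0, 1].
    correct: 1 if the corresponding prediction was right, else 0.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Equal-width bins on [0, 1]; a confidence of exactly 1.0 goes in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece


# Two under-confident predictions (0.9, both right) and two over-confident
# ones (0.6, half right); a perfectly calibrated model would score 0.
print(expected_calibration_error([0.9, 0.9, 0.6, 0.6], [1, 1, 1, 0]))
```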
Lin Li, Jianing Qiu, Michael Spratling
Deep neural networks are vulnerable to adversarial examples. Adversarial
training (AT) is an effective defense against adversarial examples. However, AT
is prone to overfitting which degrades robustness substantially. Recently, data
augmentation (DA) was shown to be effective in mitigating robust overfitting if
appropriately designed and optimized for AT. This work proposes a new method to
automatically learn online, instance-wise, DA policies to improve robust
generalization for AT. This is the first automated DA method specific for
robustness. A novel policy learning objective, consisting of Vulnerability,
Affinity and Diversity, is proposed and shown to be sufficiently effective and
efficient to be practical for automatic DA generation during AT. Importantly,
our method dramatically reduces the cost of policy search from the 5000 hours
of AutoAugment and the 412 hours of IDBH to 9 hours, making automated DA more
practical to use for adversarial robustness. This allows our method to
efficiently explore a large search space for a more effective DA policy and
evolve the policy as training progresses. Empirically, our method is shown to
outperform all competitive DA methods across various model architectures and
datasets. Our DA policy reinforced vanilla AT to surpass several
state-of-the-art AT methods regarding both accuracy and robustness. It can also
be combined with those advanced AT methods to further boost robustness. Code
and pre-trained models are available at https://github.com/TreeLLi/AROID.
Authors' comments: Published in IJCV (in press)
Asim Naveed, Syed S. Naqvi, Tariq M. Khan, Imran Razzak
Skin cancer holds the highest incidence rate among all cancers globally. The importance of early detection cannot be overstated, as late-stage cases can be lethal. Classifying skin lesions, however, presents several challenges due to the many variations they can exhibit, such as differences in colour, shape, and size, significant variation within the same class, and notable similarities between different classes. This paper introduces a novel class-wise attention technique that treats each class equally while uncovering more specific details about skin lesions. This attention mechanism is used progressively to combine discriminative feature details from multiple scales. The introduced technique surpasses more than 15 cutting-edge methods, including the winners of the HAM10000 and ISIC 2019 leaderboards. It achieves an accuracy of 97.40% on the HAM10000 dataset and 94.9% on the ISIC 2019 dataset.
Zachary Robertson, Oluwasanmi Koyejo
In the quest to enhance the efficiency and bio-plausibility of training deep
neural networks, Feedback Alignment (FA), which replaces the backward pass
weights with random matrices in the training process, has emerged as an
alternative to traditional backpropagation. While the appeal of FA lies in its
circumvention of computational challenges and its plausible biological
alignment, the theoretical understanding of this learning rule remains partial.
This paper uncovers a set of conservation laws underpinning the learning
dynamics of FA, revealing intriguing parallels between FA and Gradient Descent
(GD). Our analysis reveals that FA harbors implicit biases akin to those
exhibited by GD, challenging the prevailing narrative that these learning
algorithms are fundamentally different. Moreover, we demonstrate that these
conservation laws elucidate sufficient conditions for layer-wise alignment with
feedback matrices in ReLU networks. We further show that this implies
over-parameterized two-layer linear networks trained with FA converge to
minimum-norm solutions. The implications of our findings offer avenues for
developing more efficient and biologically plausible alternatives to
backpropagation through an understanding of the principles governing learning
dynamics in deep networks.
Authors' comments: 8 pages, 2 figures
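To make the Feedback Alignment rule described above concrete, here is a minimal sketch for a two-layer linear network with squared loss, written in plain Python. The only change relative to backpropagation is that a fixed matrix `B` replaces the transpose of `W2` in the backward pass. The function and helper names are illustrative assumptions, and the sketch does not reproduce the paper's analysis or experiments.

```python
def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[a * b for b in v] for a in u]


def transpose(M):
    return [list(col) for col in zip(*M)]


def fa_gradients(W1, W2, B, x, y):
    """Update directions for y_hat = W2 (W1 x) with loss 0.5 * ||y_hat - y||^2.

    B is the fixed feedback matrix (hidden_dim x output_dim). Passing
    B = transpose(W2) recovers the ordinary backpropagation gradients.
    """
    h = matvec(W1, x)                          # forward pass, hidden layer
    y_hat = matvec(W2, h)                      # forward pass, output layer
    e = [p - t for p, t in zip(y_hat, y)]      # output error
    grad_W2 = outer(e, h)                      # identical to backpropagation
    delta_h = matvec(B, e)                     # backprop would use transpose(W2) here
    grad_W1 = outer(delta_h, x)
    return grad_W1, grad_W2


# Scalar toy case: W1 = 1, W2 = 2, x = 3, y = 0, so the output error is 6.
W1, W2, x, y = [[1.0]], [[2.0]], [3.0], [0.0]
print(fa_gradients(W1, W2, [[0.5]], x, y))         # FA with feedback weight 0.5
print(fa_gradients(W1, W2, transpose(W2), x, y))   # backprop as a special case
```

In practice `B` would be drawn at random once and kept fixed during training; a hand-picked value is used here so the numbers can be checked by hand.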
Tobias Cord-Landwehr, Christoph Boeddeker, Cătălin Zorilă, Rama Doddipatla, Reinhold Haeb-Umbach
Using a Teacher-Student training approach we developed a speaker embedding
extraction system that outputs embeddings at frame rate. Given this high
temporal resolution and the fact that the student produces sensible speaker
embeddings even for segments with speech overlap, the frame-wise embeddings
serve as an appropriate representation of the input speech signal for an
end-to-end neural meeting diarization (EEND) system. We show in experiments
that this representation helps mitigate a well-known problem of EEND systems:
the drop in diarization performance as the number of speakers increases is
significantly reduced. We also introduce block-wise processing to be able to
diarize arbitrarily long meetings.
Authors' comments: ICASSP 2023
Karim Lounici, Grégoire Pacreau
Large datasets are often affected by cell-wise outliers in the form of missing or erroneous data. However, discarding any samples containing outliers may result in a dataset that is too small to accurately estimate the covariance matrix. Moreover, the robust procedures designed to address this problem require the invertibility of the covariance operator and thus are not effective on high-dimensional data. In this paper, we propose an unbiased estimator for the covariance in the presence of missing values that does not require any imputation step and still achieves near minimax statistical accuracy in the operator norm. We also advocate for its use in combination with cell-wise outlier detection methods to tackle cell-wise contamination in a high-dimensional and low-rank setting, where state-of-the-art methods may suffer from numerical instability and long computation times. To complement our theoretical findings, we conducted an experimental study which demonstrates the superiority of our approach over the state of the art in both low- and high-dimensional settings.
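As a rough illustration of how a covariance estimate can be made unbiased under missing values without imputation, the sketch below applies a classical correction. It assumes zero-mean data, entries observed independently with a known probability `delta`, and missing entries stored as 0. These assumptions and the function name are illustrative; the estimator proposed in the paper may differ.

```python
def debiased_covariance(rows, delta):
    """Correct the second-moment matrix of zero-filled data for missingness.

    rows: list of observations, with missing entries replaced by 0.0.
    delta: probability that any given entry is observed (assumed known).

    Under the stated assumptions, a diagonal entry of the zero-filled second
    moment is shrunk by delta in expectation and an off-diagonal entry by
    delta**2, so dividing by those factors removes the bias.
    """
    n = len(rows)
    d = len(rows[0])
    sigma = [[sum(r[j] * r[k] for r in rows) / n for k in range(d)]
             for j in range(d)]
    return [[sigma[j][k] / (delta if j == k else delta ** 2) for k in range(d)]
            for j in range(d)]


# Four zero-filled samples of a 2-dimensional variable, half the cells observed.
samples = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [0.0, 0.0]]
print(debiased_covariance(samples, 0.5))
```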
Edmund Dervakos, Konstantinos Thomas, Giorgos Filandrianos, Giorgos Stamou
Counterfactual explanations have been argued to be one of the most intuitive
forms of explanation. They are typically defined as a minimal set of edits on a
given data sample that, when applied, changes the output of a model on that
sample. However, a minimal set of edits is not always clear and understandable
to an end-user, as it could, for instance, constitute an adversarial example
(which is indistinguishable from the original data sample to an end-user).
Instead, there are recent ideas that the notion of minimality in the context of
counterfactuals should refer to the semantics of the data sample, and not to
the feature space. In this work, we build on these ideas, and propose a
framework that provides counterfactual explanations in terms of knowledge
graphs. We provide an algorithm for computing such explanations (given some
assumptions about the underlying knowledge), and quantitatively evaluate the
framework with a user study.
Authors' comments: To appear at IJCAI 2023
Shuai Wang, Zipei Yan, Daoan Zhang, Zhongsen Li, Sirui Wu, Wenxuan Chen, Rui Li
Deep neural networks (DNNs) achieve promising performance in visual
recognition under the independent and identically distributed (IID) hypothesis.
In contrast, the IID hypothesis is not universally guaranteed in numerous
real-world applications, especially in medical image analysis. Medical image
segmentation is typically formulated as a pixel-wise classification task in
which each pixel is classified into a category. However, this formulation
ignores the hard-to-classify pixels, e.g., pixels near boundary areas,
which usually confuse DNNs. In this paper, we first show that
hard-to-classify pixels are associated with high uncertainty. Based on this,
we propose a novel framework that utilizes uncertainty estimation to highlight
hard-to-classify pixels for DNNs, thereby improving their generalization. We
evaluate our method on two popular benchmarks: prostate and fundus datasets.
The results of the experiment demonstrate that our method outperforms
state-of-the-art methods.
Authors' comments: 10 pages, 3 figures
Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu
As emerging hardware begins to support mixed bit-width arithmetic computation, mixed-precision quantization is widely used to reduce the complexity of neural networks. However, Vision Transformers (ViTs) require complex self-attention computation to guarantee the learning of powerful feature representations, which keeps mixed-precision quantization of ViTs challenging. In this paper, we propose a novel patch-wise mixed-precision quantization (PMQ) for efficient inference of ViTs. Specifically, we design a lightweight global metric, which is faster than existing methods, to measure the sensitivity of each component in ViTs to quantization errors. Moreover, we introduce a Pareto frontier approach to automatically allocate the optimal bit-precision according to the sensitivity. To further reduce the computational complexity of self-attention in the inference stage, we propose a patch-wise module to reallocate the bit-width of patches in each layer. Extensive experiments on the ImageNet dataset show that our method greatly reduces the search cost and facilitates the application of mixed-precision quantization to ViTs.
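The idea of giving different patches different bit-widths can be illustrated with a minimal sketch of symmetric uniform quantization. This is generic fake-quantization, not the PMQ sensitivity metric or its Pareto-frontier search. The function names, the per-patch bit assignment, and the sample values are hypothetical.

```python
def quantize(values, bits):
    """Symmetric uniform quantization to a signed `bits`-bit grid (bits >= 2).

    Returns the dequantized values, i.e. the inputs snapped to the grid.
    """
    q_max = 2 ** (bits - 1) - 1                # largest representable integer level
    scale = max(abs(v) for v in values) / q_max
    if scale == 0:
        return list(values)                    # all-zero input needs no quantization
    return [max(-q_max, min(q_max, round(v / scale))) * scale for v in values]


def max_error(values, bits):
    """Largest absolute quantization error at the given bit-width."""
    return max(abs(v - q) for v, q in zip(values, quantize(values, bits)))


# Hypothetical per-patch allocation: one patch kept at 8 bits, one cut to 2 bits.
patch = [-1.0, 0.2, 0.6, 1.0]
for bits in (8, 2):
    print(bits, quantize(patch, bits), max_error(patch, bits))
```

Lower bit-widths shrink the grid and raise the error, which is why a sensitivity measure is needed to decide where precision can be reduced.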
Raúl Vargas, Lenny A. Romero, Song Zhang, Andres G. Marrugo
This Letter presents a novel structured light system model that effectively
considers local lens distortion by pixel-wise rational functions. We leverage
the stereo method for initial calibration and then estimate the rational model
for each pixel. Our proposed model can achieve high measurement accuracy within
and outside the calibration volume, demonstrating its robustness and accuracy.
Authors' comments: 4 pages, 5 figures