Haonan Wang, Peng Cao, Jiaqi Wang, Osmar R. Zaiane
Most recent semantic segmentation methods adopt a U-Net framework with an
encoder-decoder architecture. It is still challenging for U-Net with a simple
skip connection scheme to model the global multi-scale context: 1) Not every
skip connection setting is effective, owing to incompatible feature sets at
the encoder and decoder stages, and some skip connections even negatively
influence the segmentation performance; 2) The original U-Net is worse than the
one without any skip connection on some datasets. Based on our findings, we
propose a new segmentation framework, named UCTransNet (with a proposed CTrans
module in U-Net), from the channel perspective with attention mechanism.
Specifically, the CTrans module is an alternative to the U-Net skip connections,
which consists of a sub-module to conduct the multi-scale Channel Cross fusion
with Transformer (named CCT) and a sub-module Channel-wise Cross-Attention
(named CCA) to guide the fused multi-scale channel-wise information to
effectively connect to the decoder features for eliminating the ambiguity.
Hence, the proposed connection consisting of the CCT and CCA is able to replace
the original skip connection and close the semantic gaps for accurate
automatic medical image segmentation. The experimental results suggest that our
UCTransNet produces more precise segmentation performance and achieves
consistent improvements over the state-of-the-art for semantic segmentation
across different datasets and conventional architectures involving transformer
or U-shaped framework. Code: https://github.com/McGregorWwww/UCTransNet.
Authors' comments: Accepted by AAAI 2022. Code is available at
https://github.com/McGregorWwww/UCTransNet
Ramiz Aktar, Li Xue, Tong Liu
We examine the properties of spiral shocks from a steady, adiabatic,
non-axisymmetric accretion disk around a compact star in a binary. For the
first time, we incorporate all the possible influences from the binary by
adopting the Roche potential and Coriolis forces in the basic conservation
equations. In this paper, we assume the spiral shocks to be point-wise
self-similar and the flow to be in vertical hydrostatic equilibrium to simplify
the study. We also investigate the mass outflow due to the shock compression
and apply it to an accreting white dwarf in a binary. We find that our model
can help overcome the ad hoc assumption of an optically thick wind generally
used in studies of the progenitors of Type Ia supernovae.
Authors' comments: 17 pages, 7 figures, 1 appendix. Accepted for publication in ApJ
Gushu Li, Anbang Wu, Yunong Shi, Ali Javadi-Abhari, Yufei Ding, Yuan Xie
The quantum simulation kernel is an important subroutine appearing as a very long gate sequence in many quantum programs. In this paper, we propose Paulihedral, a block-wise compiler framework that can deeply optimize this subroutine by exploiting high-level program structure and optimization opportunities. Paulihedral first employs a new Pauli intermediate representation that can maintain the high-level semantics and constraints in quantum simulation kernels. This naturally enables new large-scale optimizations that are hard to implement at the low gate level. In particular, we propose two technology-independent instruction scheduling passes, and two technology-dependent code optimization passes which reconcile the circuit synthesis, gate cancellation, and qubit mapping stages of the compiler. Experimental results show that Paulihedral can outperform state-of-the-art compiler infrastructures in a wide range of applications on both near-term superconducting quantum processors and future fault-tolerant quantum computers.
Janine Witte, Ronja Foraita, Vanessa Didelez
Causal discovery algorithms estimate causal graphs from observational data.
This can provide a valuable complement to analyses focussing on the causal
relation between individual treatment-outcome pairs. Constraint-based causal
discovery algorithms rely on conditional independence testing when building the
graph. Until recently, these algorithms have been unable to handle missing
values. In this paper, we investigate two alternative solutions: Test-wise
deletion and multiple imputation. We establish necessary and sufficient
conditions for the recoverability of causal structures under test-wise
deletion, and argue that multiple imputation is more challenging in the context
of causal discovery than for estimation. We conduct an extensive comparison by
simulating from benchmark causal graphs: As one might expect, we find that
test-wise deletion and multiple imputation both clearly outperform list-wise
deletion and single imputation. Crucially, our results further suggest that
multiple imputation is especially useful in settings with a small number of
either Gaussian or discrete variables, but when the dataset contains a mix of
both neither method is uniformly best. The methods we compare include random
forest imputation and a hybrid procedure combining test-wise deletion and
multiple imputation. An application to data from the IDEFICS cohort study on
diet- and lifestyle-related diseases in European children serves as an
illustrative example.
Authors' comments: 38 pages, 11 figures
Ke Wang, Jonathan I Tamir, Alfredo De Goyeneche, Uri Wollner, Rafi Brada, Stella Yu, Michael Lustig
Purpose: To improve reconstruction fidelity of fine structures and textures
in deep learning (DL) based reconstructions.
Methods: A novel patch-based Unsupervised Feature Loss (UFLoss) is proposed
and incorporated into the training of DL-based reconstruction frameworks in
order to preserve perceptual similarity and high-order statistics. The UFLoss
provides instance-level discrimination by mapping similar instances to similar
low-dimensional feature vectors and is trained without any human annotation. By
adding a loss function on the low-dimensional feature space during
training, the reconstruction frameworks from under-sampled or corrupted data
can reproduce more realistic images that are closer to the original with finer
textures, sharper edges, and improved overall image quality. The performance of
the proposed UFLoss is demonstrated on unrolled networks for accelerated 2D and
3D knee MRI reconstruction with retrospective under-sampling. Quantitative
metrics including NRMSE, SSIM, and our proposed UFLoss were used to evaluate
the performance of the proposed method and compare it with others.
Results: In-vivo experiments indicate that adding the UFLoss encourages
sharper edges and more faithful contrasts compared to traditional and
learning-based methods with pure l2 loss. More detailed textures can be seen in
both 2D and 3D knee MR images. Quantitative results indicate that
reconstruction with UFLoss can provide comparable NRMSE and a higher SSIM while
achieving a much lower UFLoss value.
Conclusion: We present UFLoss, a patch-based unsupervised learned feature
loss, which allows the training of DL-based reconstruction to obtain more
detailed texture, finer features, and sharper edges with higher overall image
quality under DL-based reconstruction frameworks.
Authors' comments: 35 pages, 13 figures
Jakob Filser, Karsten Reuter, Harald Oberhofer
The multipole-expansion (MPE) model is an implicit solvation model used to
efficiently incorporate solvent effects in quantum chemistry. Even within the
recent direct approach, the multipole basis used in MPE to express the
dielectric response still solves the electrostatic problem inefficiently or not
at all for solutes larger than $\approx 10$ non-hydrogen atoms. In existing MPE
parameterizations, the resulting systematic underestimation of the
electrostatic solute-solvent interaction is presently compensated for by a
systematic overestimation of non-electrostatic attractive interactions. Even
though the MPE model can thus reproduce experimental free energies of solvation
of small molecules remarkably well, the inherent error cancellation makes it
hard to assign physical meaning to the individual free energy terms in the
model, raising concerns about transferability. Here, we resolve this issue by
solving the electrostatic problem piece-wise in 3D regions centered around all
non-hydrogen nuclei of the solute, ensuring reliable convergence of the
multipole series. The resulting method, which we call MPE-$n$c, thus allows for
a much improved reproduction of the dielectric response of a medium to a
solute. Employing a reduced non-electrostatic model with a single free
parameter, in addition to the density isovalue defining the solvation cavity,
MPE-$n$c yields free energies of solvation of neutral, anionic and cationic
solutes in water in good agreement with experiment.
Authors' comments: Journal of Chemical Theory and Computation, Accepted for Publication
Tejas Dastane, Varun Rao, Kartik Shenoy, Devendra Vyavaharkar
This paper presents a novel technique for skin colour segmentation that
overcomes the limitations faced by existing techniques such as Colour Range
Thresholding. Skin colour segmentation is affected by the varied skin colours
and surrounding lighting conditions, leading to poor skin segmentation for many
techniques. We propose a new two-stage Pixel Neighbourhood technique that
classifies any pixel as skin or non-skin based on its neighbourhood pixels. The
first step calculates the probability of each pixel being skin by passing HSV
values of the pixel to a Deep Neural Network model. In the next step, it
calculates the likelihood of the pixel being skin using these probabilities of
neighbouring pixels. This technique performs skin colour segmentation better
than the existing techniques.
Authors' comments: 5 pages
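The second stage described in the abstract above can be sketched as follows. The abstract does not state how the neighbouring probabilities are combined, so this sketch assumes a plain mean over a 3x3 window followed by a 0.5 threshold; the probability map stands in for the output of the Deep Neural Network and is made up for illustration.

```python
# Illustrative sketch only: the 3x3 mean and the 0.5 threshold are assumptions,
# and `prob` is a hypothetical stand-in for the per-pixel network output.

def neighbourhood_score(prob, r, c):
    """Mean skin probability over the 3x3 window around (r, c), clipped at borders."""
    rows, cols = len(prob), len(prob[0])
    values = [
        prob[i][j]
        for i in range(max(0, r - 1), min(rows, r + 2))
        for j in range(max(0, c - 1), min(cols, c + 2))
    ]
    return sum(values) / len(values)

def segment(prob, threshold=0.5):
    """Label a pixel as skin when its neighbourhood score reaches the threshold."""
    return [
        [neighbourhood_score(prob, r, c) >= threshold for c in range(len(prob[0]))]
        for r in range(len(prob))
    ]

prob = [
    [0.9, 0.9, 0.1],
    [0.9, 0.2, 0.1],
    [0.1, 0.1, 0.1],
]
mask = segment(prob)
for row in mask:
    print(row)
```

Because each decision depends on the surrounding pixels, an isolated high or low probability has less influence than it would under per-pixel thresholding.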
Keyang Wang, Lei Zhang, Wenli Song, Qinghai Lang, Lingyun Qin
The anchor-based detectors handle the problem of scale variation by building
the feature pyramid and directly setting different scales of anchors on each
cell in different layers. However, it is difficult for box-wise anchors to
guide the adaptive learning of scale-specific features in each layer because
there is no one-to-one correspondence between box-wise anchors and pixel-level
features. To alleviate this problem, we propose a
scale-customized weak segmentation (SCWS) block at the pixel level for
scale-customized object feature learning in each layer. By integrating the SCWS
blocks into the single-shot detector, a scale-aware object detector (SCOD) is
constructed to detect objects of different sizes naturally and accurately.
Furthermore, the standard location loss neglects the fact that the hard and
easy samples may be seriously imbalanced. A resulting problem is that it
cannot produce more accurate bounding boxes because of this imbalance. To address
this problem, an adaptive IoU (AIoU) loss via a simple yet effective squeeze
operation is specified in our SCOD. Extensive experiments on PASCAL VOC and MS
COCO demonstrate the superiority of our SCOD.
Authors' comments: To appear in IEEE International Conference on Image Processing 2021
Naser Damer, Noemie Spiller, Meiling Fang, Fadi Boutros, Florian Kirchbuchner, Arjan Kuijper
A face morphing attack image can be verified to multiple identities, making
this attack a major vulnerability to processes based on identity verification,
such as border checks. Various methods have been proposed to detect face
morphing attacks, but they generalize poorly to unexpected
post-morphing processes. A major post-morphing process is the print and scan
operation performed in many countries when issuing a passport or identity
document. In this work, we address this generalization problem by adapting a
pixel-wise supervision approach where we train a network to classify each pixel
of the image into an attack or not, rather than only having one label for the
whole image. Our pixel-wise morphing attack detection (PW-MAD) solution proved
to perform more accurately than a set of established baselines. More
importantly, PW-MAD shows high generalizability in comparison to related works,
when evaluated on unknown re-digitized attacks. In addition to our PW-MAD
approach, we create a new face morphing attack dataset with digital and
re-digitized samples, namely the LMA-DRD dataset that is publicly available for
research purposes upon request.
Authors' comments: Accepted at the 16th International Symposium on Visual Computing
(ISVC 2021)
Md Amirul Islam, Matthew Kowal, Sen Jia, Konstantinos G. Derpanis, Neil D. B. Bruce
In this paper, we challenge the common assumption that collapsing the spatial
dimensions of a 3D (spatial-channel) tensor in a convolutional neural network
(CNN) into a vector via global pooling removes all spatial information.
Specifically, we demonstrate that positional information is encoded based on
the ordering of the channel dimensions, while semantic information is largely
not. Following this demonstration, we show the real world impact of these
findings by applying them to two applications. First, we propose a simple yet
effective data augmentation strategy and loss function that improve the
translation invariance of a CNN's output. Second, we propose a method to
efficiently determine which channels in the latent representation are
responsible for (i) encoding overall position information or (ii)
region-specific positions. We first show that semantic segmentation has a
significant reliance on the overall position channels to make predictions. We
then show for the first time that it is possible to perform a `region-specific'
attack, and degrade a network's performance in a particular part of the input.
We believe our findings and demonstrated applications will benefit research
areas concerned with understanding the characteristics of CNNs.
Authors' comments: ICCV 2021
Shubham Maheshwari, Khushbu Pahwa, Tavpritesh Sethi
Structure learning offers an expressive, versatile and explainable approach to causal and mechanistic modeling of complex biological data. We present wiseR, an open source application for learning, evaluating and deploying robust causal graphical models using graph neural networks and Bayesian networks. We demonstrate the utility of this application by applying it to biomarker discovery in a COVID-19 clinical dataset.
Mingcheng Chen, Zhenghui Wang, Zhiyun Zhao, Weinan Zhang, Xiawei Guo, Jian Shen, Yanru Qu, Jieli Lu et al.
Diabetes prediction is an important data science application in the social
healthcare domain. There are two main challenges in the diabetes prediction
task: data heterogeneity, since demographic and metabolic data are of different
types, and data insufficiency, since the number of diabetes cases in a single
medical center is usually limited. To tackle the above challenges, we employ
gradient boosting decision trees (GBDT) to handle data heterogeneity and
introduce multi-task learning (MTL) to solve data insufficiency. To this end,
Task-wise Split Gradient Boosting Trees (TSGB) is proposed for the multi-center
diabetes prediction task. Specifically, we first introduce task gain to
evaluate each task separately during tree construction, with a theoretical
analysis of GBDT's learning objective. Second, we reveal a problem when
directly applying GBDT in MTL, i.e., the negative task gain problem. Finally,
we propose a novel split method for GBDT in MTL based on the task gain
statistics, named task-wise split, as an alternative to standard feature-wise
split to overcome the mentioned negative task gain problem. Extensive
experiments on a large-scale real-world diabetes dataset and a commonly used
benchmark dataset demonstrate TSGB achieves superior performance against
several state-of-the-art methods. Detailed case studies further support our
analysis of negative task gain problems and provide insightful findings. The
proposed TSGB method has been deployed as online diabetes risk assessment
software for early diagnosis.
Authors' comments: 11 pages (2 pages of supplementary), 10 figures, 7 tables. Accepted
by ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2021)
Aditya Saini, Ranjitha Prasad
Despite the tremendous performance improvements in designing complex
artificial intelligence (AI) systems in data-intensive domains, the black-box
nature of these systems leads to a lack of trustworthiness. Post-hoc
interpretability methods explain the prediction of a black-box ML model for a
single instance, and such explanations are being leveraged by domain experts to
diagnose the underlying biases of these models. Despite their efficacy in
providing valuable insights, existing approaches fail to deliver consistent and
reliable explanations. In this paper, we propose an active learning-based
technique called UnRAvEL (Uncertainty driven Robust Active Learning Based
Locally Faithful Explanations), which consists of a novel acquisition function
that is locally faithful and uses uncertainty-driven sampling based on the
posterior distribution on the probabilistic locality using Gaussian process
regression (GPR). We present a theoretical analysis of UnRAvEL by treating it as
a local optimizer and analyzing its regret in terms of instantaneous regrets
over a global optimizer. We demonstrate the efficacy of the local samples
generated by UnRAvEL by incorporating different kernels such as the Matern and
linear kernels in GPR. Through a series of experiments, we show that UnRAvEL
outperforms the baselines with respect to stability and local fidelity on
several real-world models and datasets. We show that UnRAvEL is an efficient
surrogate dataset generator by deriving importance scores on this surrogate
dataset using sparse linear models. We also showcase the sample efficiency and
flexibility of the developed framework on the ImageNet dataset using a
pre-trained ResNet model.
Authors' comments: To be published in the main track of AIES'22
Slavche Pejoski, Zoran Hadzi-Velkov, Robert Schober
We propose a novel transmission protocol for harvest-then-transmit wireless powered communication networks, which takes into account the non-linearity of the energy harvesting (EH) process at the EH users and maximizes the sum rate in the uplink. We assume a piece-wise linear energy harvesting model and provide expressions for the optimal transmit power of the base station (BS), the duration of the EH phase, and the duration of the uplink information transmission phases of the users. The obtained solution provides insight regarding the significance of the non-linear EH model on the optimal resource allocation. Simulations unveil the growing impact of the saturation effect, which occurs for high received radio frequency powers, as the average and the maximum instantaneous transmit powers of the BS increase.
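A piece-wise linear energy harvesting model with saturation, as assumed in the abstract above, can be sketched as a simple interpolation between breakpoints. The breakpoint values and the function name `harvested_power` are hypothetical and chosen only to show the shape of such a model; they are not the parameters used by the authors.

```python
# Illustrative sketch only: breakpoints are made-up (input power, harvested power)
# pairs in arbitrary units, not values from the paper.

def harvested_power(p_in, breakpoints):
    """Piece-wise linear map from received RF power to harvested power.

    Below the first breakpoint the first output value is returned; beyond the
    last breakpoint the output stays constant, which models saturation.
    """
    if p_in <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (x0, y0), (x1, y1) in zip(breakpoints, breakpoints[1:]):
        if p_in <= x1:
            return y0 + (y1 - y0) * (p_in - x0) / (x1 - x0)
    return breakpoints[-1][1]

BREAKPOINTS = [(0.0, 0.0), (1.0, 0.6), (2.0, 0.9), (3.0, 1.0)]

for p in (0.5, 1.5, 10.0):
    print(p, harvested_power(p, BREAKPOINTS))
```

The decreasing slopes of successive segments mimic the diminishing conversion efficiency at high received power, and the flat region beyond the last breakpoint corresponds to the saturation effect the abstract refers to.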
Chenyu You, Yuan Zhou, Ruihan Zhao, Lawrence Staib, James S. Duncan
Automated segmentation in medical image analysis is a challenging task that
requires a large amount of manually labeled data. However, most existing
learning-based approaches usually suffer from limited manually annotated
medical data, which poses a major practical problem for accurate and robust
medical image segmentation. In addition, most existing semi-supervised
approaches are usually not robust compared with the supervised counterparts,
and also lack explicit modeling of geometric structure and semantic
information, both of which limit the segmentation accuracy. In this work, we
present SimCVD, a simple contrastive distillation framework that significantly
advances state-of-the-art voxel-wise representation learning. We first describe
an unsupervised training strategy, which takes two views of an input volume and
predicts their signed distance maps of object boundaries in a contrastive
objective, using only two independent dropout masks. This simple approach
works surprisingly well, performing on the same level as previous fully
supervised methods with much less labeled data. We hypothesize that dropout can
be viewed as a minimal form of data augmentation and makes the network robust
to representation collapse. Then, we propose to perform structural distillation
by distilling pair-wise similarities. We evaluate SimCVD on two popular
datasets: the Left Atrial Segmentation Challenge (LA) and the NIH pancreas CT
dataset. The results on the LA dataset demonstrate that, in two types of
labeled ratios (i.e., 20% and 10%), SimCVD achieves an average Dice score of
90.85% and 89.03% respectively, a 0.91% and 2.22% improvement compared to
previous best results. Our method can be trained in an end-to-end fashion,
showing the promise of utilizing SimCVD as a general framework for downstream
tasks, such as medical image synthesis, enhancement, and registration.
Authors' comments: IEEE Transactions on Medical Imaging (IEEE-TMI) 2022
Sebastián Donoso, Lei Jin, Alejandro Maass, Yixiao Qiao
We study directional mean dimension of $\mathbb{Z}^k$-actions (where $k$ is a
positive integer). On the one hand, we show that there is a
$\mathbb{Z}^2$-action whose directional mean dimension (considered as a
$[0,+\infty]$-valued function on the torus) is not continuous. On the other
hand, we prove that if a $\mathbb{Z}^k$-action is continuum-wise expansive,
then the values of its $(k-1)$-dimensional directional mean dimension are
bounded. This is a generalization (with a view towards Meyerovitch and
Tsukamoto's theorem on mean dimension and expansive multiparameter actions) of
a classical result due to Ma\~n\'e: Any compact metrizable space admitting an
expansive homeomorphism (with respect to a compatible metric) is
finite-dimensional.
Authors' comments: Comments welcome!
Weilun Wang, Wengang Zhou, Jianmin Bao, Dong Chen, Houqiang Li
Contrastive learning shows great potential in unpaired image-to-image
translation, but sometimes the translated results are of poor quality and the
contents are not preserved consistently. In this paper, we uncover that the
negative examples play a critical role in the performance of contrastive
learning for image translation. The negative examples in previous methods are
randomly sampled from the patches of different positions in the source image,
which are not effective at pushing the positive examples close to the query
examples. To address this issue, we present instance-wise hard Negative Example
Generation for Contrastive learning in Unpaired image-to-image Translation
(NEGCUT). Specifically, we train a generator to produce negative examples
online. The generator is novel from two perspectives: 1) it is instance-wise,
which means that the generated examples are based on the input image, and 2) it
can generate hard negative examples since it is trained with an adversarial
loss. With the generator, the performance of unpaired image-to-image
translation is significantly improved. Experiments on three benchmark datasets
demonstrate that the proposed NEGCUT framework achieves state-of-the-art
performance compared to previous methods.
Authors' comments: Accepted by ICCV 2021
Haozhe Jia, Haoteng Tang, Guixiang Ma, Weidong Cai, Heng Huang, Liang Zhan, Yong Xia
Automated and accurate segmentation of the infected regions in computed tomography (CT) images is critical for the prediction of the pathological stage and treatment response of COVID-19. Several deep convolutional neural networks (DCNNs) have been designed for this task, whose performance, however, tends to be suppressed by their limited local receptive fields and insufficient global reasoning ability. In this paper, we propose a pixel-wise sparse graph reasoning (PSGR) module and insert it into a segmentation network to enhance the modeling of long-range dependencies for COVID-19 infected region segmentation in CT images. In the PSGR module, a graph is first constructed by projecting each pixel on a node based on the features produced by the segmentation backbone, and then converted into a sparsely-connected graph by keeping only the K strongest connections to each uncertain pixel. The long-range information reasoning is performed on the sparsely-connected graph to generate enhanced features. The advantages of this module are two-fold: (1) the pixel-wise mapping strategy not only avoids imprecise pixel-to-node projections but also preserves the inherent information of each pixel for global reasoning; and (2) the sparsely-connected graph construction results in effective information retrieval and reduction of the noise propagation. The proposed solution has been evaluated against four widely-used segmentation models on three public datasets. The results show that the segmentation model equipped with our PSGR module can effectively segment COVID-19 infected regions in CT images, outperforming all other competing models.
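The sparsification step described in the abstract above, keeping only the K strongest connections, can be sketched on a small affinity matrix. This is a simplified illustration: it applies the top-K rule to every node rather than only to uncertain pixels, and the affinity values and the helper name `sparsify_top_k` are hypothetical.

```python
# Illustrative sketch only: the affinity matrix is made up, and the top-K rule
# is applied to every node, which simplifies the selection described above.

def sparsify_top_k(affinity, k):
    """Keep the k largest entries in each row of a dense affinity matrix."""
    sparse = []
    for row in affinity:
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        keep = set(order[:k])
        sparse.append([w if j in keep else 0.0 for j, w in enumerate(row)])
    return sparse

affinity = [
    [0.0, 0.8, 0.1, 0.5],
    [0.8, 0.0, 0.3, 0.2],
    [0.1, 0.3, 0.0, 0.9],
    [0.5, 0.2, 0.9, 0.0],
]
sparse = sparsify_top_k(affinity, k=2)
for row in sparse:
    print(row)
```

Dropping the weak edges reduces the number of connections that message passing has to process, which is one way to limit the noise propagation mentioned in the abstract.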
Qin Wang, Jun Wei, Boyuan Wang, Zhen Li, Sheng Wang, Shuguang Cu
Protein secondary structure prediction (PSSP) is essential for protein
function analysis. However, for low homologous proteins, the PSSP suffers from
insufficient input features. In this paper, we explicitly import external
self-supervised knowledge for low homologous PSSP under the guidance of
residue-wise profile fusion. In practice, we first demonstrate the
superiority of profile over Position-Specific Scoring Matrix (PSSM) for low
homologous PSSP. Based on this observation, we introduce the novel
self-supervised BERT features as the pseudo profile, which implicitly involves
the residue distribution in all native discovered sequences as the
complementary features. Furthermore, a novel residue-wise attention is
specially designed to adaptively fuse different features (i.e., original
low-quality profile, BERT based pseudo profile), which not only takes full
advantage of each feature but also avoids noise disturbance. Besides, the
feature consistency loss is proposed to accelerate the model learning from
multiple semantic levels. Extensive experiments confirm that our method
outperforms state-of-the-art methods (i.e., by 4.7% for extremely low homologous
cases on the BC40 dataset).
Authors' comments: Accepted in IJCAI-21
Cong Liu, Chuang Zhang, Zhuoyi Yin, Xiaopeng Liu, Zhihong Xu
In fringe projection profilometry, the high-order harmonics of non-sinusoidal fringes lead to errors in the phase estimation. To solve this problem, a point-wise posterior phase estimation (PWPPE) method based on deep learning is proposed in this paper. The complex nonlinear mapping between the multiple gray values and the sine and cosine values of the phase is constructed using a feedforward neural network model. After training, the model can estimate the phase value at each pixel location, with higher accuracy than the point-wise least-square (PWLS) method. To further verify the effectiveness of this method, a face mask is measured with both the traditional PWLS method and the proposed PWPPE method. The comparison results show that the traditional method exhibits periodic phase errors, while the proposed PWPPE method can effectively eliminate such phase errors caused by non-sinusoidal fringes.
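For context, the classical point-wise least-squares phase estimate for ideal sinusoidal fringes can be sketched as below. It assumes N equally spaced phase shifts and the intensity model I_k = A + B cos(phi + 2*pi*k/N); the exact PWLS formulation used as the baseline in the paper may differ, and the sample values are made up.

```python
import math

# Illustrative sketch only: standard N-step least-squares phase retrieval under
# an ideal sinusoidal fringe model; not the authors' PWPPE network.

def ls_phase(intensities):
    """Estimate the phase at one pixel from N equally phase-shifted gray values."""
    n = len(intensities)
    num = -sum(i * math.sin(2 * math.pi * k / n) for k, i in enumerate(intensities))
    den = sum(i * math.cos(2 * math.pi * k / n) for k, i in enumerate(intensities))
    return math.atan2(num, den)

true_phase = 1.0  # radians, hypothetical ground truth
samples = [0.5 + 0.4 * math.cos(true_phase + 2 * math.pi * k / 4) for k in range(4)]
estimate = ls_phase(samples)
print(estimate)
```

With purely sinusoidal samples the estimate matches the true phase up to floating-point error; once higher-order harmonics are present in the fringes this closed-form estimate is no longer exact, which is the error source the abstract targets.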