Ali Alshehri, Jonathan P. Rothstein, H. Pirouz Kavehpour
Drop-wise condensation (DWC) has been the focus of scientific research in
vapor condensation technologies since the 20th century. Improvement of
condensation rate in DWC is limited by the maximum droplet size a condensation
surface can sustain. Furthermore, the presence of non-condensable gases (NCG)
reduces the condensation rate significantly. Here, we present continuous
drop-wise condensation (CDC) to overcome the need for hydrophobic surfaces while
still maintaining micron-sized droplets. By shifting focus from surface treatment
to the force required to sweep off a droplet, we were able to utilize
stagnation pressure of jet impingement to tune the shed droplet size. The
results show that the size of the shed droplets can be tuned effectively by
tuning the jet parameters. Our experimental observations showed that the effect
of NCG is greatly alleviated by utilizing our technique. An improvement of at
least sixfold in the mass transfer compactness factor compared to
state-of-the-art dehumidification technology was possible.
Authors' comments: Videos are available upon request
Changlin Li, Tao Tang, Guangrun Wang, Jiefeng Peng, Bing Wang, Xiaodan Liang, Xiaojun Chang
A myriad of recent breakthroughs in hand-crafted neural architectures for
visual recognition have highlighted the urgent need to explore hybrid
architectures consisting of diversified building blocks. Meanwhile, neural
architecture search methods are surging with an expectation to reduce human
efforts. However, whether NAS methods can efficiently and effectively handle
diversified search spaces with disparate candidates (e.g. CNNs and
transformers) is still an open question. In this work, we present Block-wisely
Self-supervised Neural Architecture Search (BossNAS), an unsupervised NAS
method that addresses the problem of inaccurate architecture rating caused by
large weight-sharing space and biased supervision in previous methods. More
specifically, we factorize the search space into blocks and utilize a novel
self-supervised training scheme, named ensemble bootstrapping, to train each
block separately before searching them as a whole towards the population
center. Additionally, we present HyTra search space, a fabric-like hybrid
CNN-transformer search space with searchable down-sampling positions. On this
challenging search space, our searched model, BossNet-T, achieves up to 82.5%
accuracy on ImageNet, surpassing EfficientNet by 2.4% with comparable compute
time. Moreover, our method achieves superior architecture rating accuracy with
0.78 and 0.76 Spearman correlation on the canonical MBConv search space with
ImageNet and on NATS-Bench size search space with CIFAR-100, respectively,
surpassing state-of-the-art NAS methods. Code:
https://github.com/changlin31/BossNAS
Authors' comments: Accepted to ICCV 2021
Jun Zeng, Bike Zhang, Zhongyu Li, Koushil Sreenath
Safety is one of the fundamental problems in robotics. Recently, a quadratic program-based control barrier function (CBF) method has emerged as a way to enforce safety-critical constraints. Together with a control Lyapunov function (CLF), it forms a safety-critical control strategy, named CLF-CBF-QP, which can mediate between achieving the control objective and ensuring safety, while being executable in real time. However, once additional constraints such as input constraints are introduced, the CLF-CBF-QP may become infeasible. To address this infeasibility, we propose an optimal-decay form for safety-critical control wherein the decay rate of the CBF is optimized point-wise in time so as to guarantee point-wise feasibility when the state lies inside the safe set. The proposed control design is numerically validated using an adaptive cruise control example.
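One way to write down the optimal-decay idea described above is sketched below. The symbols are assumptions introduced for illustration and are not taken from the abstract: $V$ is the CLF, $h$ the CBF, $\alpha$ a class-$\mathcal{K}$ function, $\delta$ a CLF relaxation variable, $\omega$ the optimized decay-rate variable with nominal value $\omega_0$, $H$, $p$, $p_{\omega}$ positive weights, and $\mathcal{U}$ the set of admissible inputs.

```latex
\begin{aligned}
\min_{u,\,\delta,\,\omega}\quad
  & \tfrac{1}{2}\,u^{\top} H u + p\,\delta^{2} + p_{\omega}\,(\omega-\omega_{0})^{2} \\
\text{s.t.}\quad
  & L_f V(x) + L_g V(x)\,u + \gamma\,V(x) \le \delta, \\
  & L_f h(x) + L_g h(x)\,u + \omega\,\alpha\big(h(x)\big) \ge 0, \\
  & u \in \mathcal{U}.
\end{aligned}
```

At a fixed state $x$ the value $\alpha(h(x))$ is a constant, so the problem stays a quadratic program in $(u, \delta, \omega)$. When $h(x) > 0$, we have $\alpha(h(x)) > 0$, so for any admissible $u$ a sufficiently large $\omega$ satisfies the CBF constraint, which is the intuition behind point-wise feasibility inside the safe set.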
Amirabbas Davari, Christoph Baller, Thorsten Seehaus, Matthias Braun, Andreas Maier, Vincent Christlein
Glacier calving front position (CFP) is an important glaciological variable. Traditionally, delineating the CFPs has been carried out manually, which is subjective, tedious and expensive. Automating this process is crucial for continuously monitoring the evolution and status of glaciers. Recently, deep learning approaches have been investigated for this application. However, the current methods are challenged by a severe class-imbalance problem. In this work, we propose to mitigate the class imbalance between the calving front class and the non-calving front class by reformulating the segmentation problem as a pixel-wise regression task. A convolutional neural network is optimized to predict the distance to the glacier front for each pixel in the image. The resulting distance map localizes the CFP and is further post-processed to extract the calving front line. We propose three post-processing methods: one based on statistical thresholding, a second based on conditional random fields (CRF), and a third that uses a second U-Net. The experimental results confirm that our approach significantly outperforms the state-of-the-art methods and produces accurate delineation. The second U-Net performs best, yielding an average improvement of about 21% in Dice coefficient.
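The regression target described above can be illustrated with a small sketch that turns a binary front mask into a per-pixel distance map and then recovers the front by thresholding. The function names, the brute-force distance computation, and the fixed threshold are assumptions made for illustration; they are not the authors' pipeline or their statistical thresholding rule.

```python
import math

def distance_map(front_mask):
    """Euclidean distance from each pixel to the nearest front pixel (brute force)."""
    front = [(r, c) for r, row in enumerate(front_mask)
             for c, v in enumerate(row) if v]
    if not front:
        raise ValueError("mask contains no front pixels")
    return [[min(math.hypot(r - fr, c - fc) for fr, fc in front)
             for c in range(len(row))]
            for r, row in enumerate(front_mask)]

def threshold_front(dist, tau):
    """Keep pixels whose distance to the front is at most tau (hypothetical rule)."""
    return [[1 if d <= tau else 0 for d in row] for row in dist]

# Toy 3x4 mask: 1 marks a calving-front pixel.
mask = [
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
dist = distance_map(mask)
recovered = threshold_front(dist, 0.5)
```

Every pixel receives a real-valued target in the distance map, rather than only the few front pixels carrying a positive label, which is how the regression formulation sidesteps the class imbalance.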
Chen Chen, Kezhi Kong, Peihong Yu, Juan Luque, Tom Goldstein, Furong Huang
Randomized smoothing (RS) is an effective and scalable technique for
constructing neural network classifiers that are certifiably robust to
adversarial perturbations. Most RS works focus on training a good base model
that boosts the certified robustness of the smoothed model. However, existing
RS techniques treat every data point the same, i.e., the variance of the
Gaussian noise used to form the smoothed model is preset and universal for all
training and test data. This preset and universal Gaussian noise variance is
suboptimal since different data points have different margins and the local
properties of the base model vary across the input examples. In this paper, we
examine the impact of customized handling of examples and propose Instance-wise
Randomized Smoothing (Insta-RS) -- a multiple-start search algorithm that
assigns customized Gaussian variances to test examples. We also design Insta-RS
Train -- a novel two-stage training algorithm that adaptively adjusts and
customizes the noise level of each training example for training a base model
that boosts the certified robustness of the instance-wise Gaussian smoothed
model. Through extensive experiments on CIFAR-10 and ImageNet, we show that our
method significantly enhances the average certified radius (ACR) as well as the
clean data accuracy compared to existing state-of-the-art provably robust
classifiers.
Authors' comments: We plan to make major modifications to this paper including rewriting
the entire text, rewriting the proofs and adding experiments. Given that the
paper will be completely different, we decided to take this paper down
temporarily
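To see why a single preset noise level can be suboptimal, consider the commonly used certified $\ell_2$ radius for Gaussian smoothing, $R = \sigma\,\Phi^{-1}(p_A)$, where $p_A$ is a lower bound on the top-class probability under noise. The sketch below picks, for one test example, the noise level with the largest radius. It uses a plain grid search instead of the multiple-start search named in the abstract, and `top_class_prob` is a hypothetical callback standing in for a Monte Carlo estimate.

```python
from statistics import NormalDist

def certified_radius(sigma, p_a):
    """Radius sigma * Phi^{-1}(p_a); returns 0.0 (abstain) when p_a <= 0.5.

    p_a must be strictly below 1.0, since the inverse CDF is undefined at 1.
    """
    if p_a <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_a)

def best_sigma(candidates, top_class_prob):
    """Grid search over noise levels for a single example.

    top_class_prob(sigma) is a hypothetical callback returning the estimated
    lower bound on the top-class probability under noise level sigma.
    """
    return max(candidates, key=lambda s: certified_radius(s, top_class_prob(s)))

# Toy example: larger noise lowers the top-class probability.
toy_probs = {0.25: 0.99, 0.5: 0.9, 1.0: 0.6}
chosen = best_sigma(list(toy_probs), toy_probs.get)
```

In this toy case the middle noise level gives the largest radius: small noise yields a high probability but a small multiplier, while large noise erodes the probability.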
Ahmed Rasheed, Muhammad Shahzad Younis, Farooq Ahmad, Junaid Qadir, Muhammad Kashif
Wheat is the main agricultural crop of Pakistan and a staple food
requirement of almost every Pakistani household, making it the main strategic
commodity of the country, whose availability and affordability are the
government's main priority. Wheat availability can be strongly affected by
multiple factors, including but not limited to production, consumption,
financial crises, inflation, and market volatility. The government ensures food
security through particular policy and monetary arrangements, which maintain
purchase parity for the poor. Such arrangements can be made more effective if a
dynamic analysis is carried out to estimate the future yield based on certain
current factors. Future planning of commodity pricing is achievable by
forecasting the future price anticipated under current circumstances. This
paper presents a wheat price forecasting methodology, which uses the price,
weather, production, and consumption trends for wheat taken over the
past few years and analyzes them with an advanced neural network
architecture, Long Short-Term Memory (LSTM) networks. The proposed methodology
yielded significantly improved results versus other conventional machine
learning and statistical time series analysis methods.
Authors' comments: 9 pages, submitted to IEEE Access
Antonio Joia Neto, Andre G C Pacheco, Diogo C Luvizon
In this paper, we propose a new Sound Event Classification (SEC) method
inspired by recent work on out-of-distribution detection. In our method,
we analyse all the activations of a generic CNN in order to produce feature
representations using Gram matrices. The similarity metrics are evaluated
considering all possible classes, and the final prediction is defined as the
class that minimizes the deviation with respect to the features seen during
training. The proposed approach can be applied to any CNN, and our experimental
evaluation of four different architectures on two datasets demonstrated that
our method consistently improves on the baseline models.
Authors' comments: To appear at ICASSP 2021
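The prediction rule described above can be sketched as follows: compute the Gram matrix of a layer's activations, measure how far it falls outside per-class ranges recorded during training, and pick the class with the smallest deviation. The min/max range form of the deviation and all function names are assumptions borrowed from Gram-matrix out-of-distribution detection in general; the abstract does not specify the exact metric.

```python
def gram_matrix(features):
    """Gram matrix of a layer's activations.

    features: list of C channels, each a flattened list of N activations.
    Returns a C x C list of channel-wise inner products.
    """
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]

def deviation(gram, lo, hi):
    """Total amount by which Gram entries fall outside the [lo, hi] ranges."""
    total = 0.0
    for g_row, lo_row, hi_row in zip(gram, lo, hi):
        for g, l, h in zip(g_row, lo_row, hi_row):
            total += max(l - g, 0.0) + max(g - h, 0.0)
    return total

def predict(gram, class_ranges):
    """Return the class whose training-time ranges are violated the least."""
    return min(class_ranges, key=lambda c: deviation(gram, *class_ranges[c]))

# Toy layer with two channels and two activations per channel.
features = [[1.0, 2.0], [0.0, 1.0]]
gram = gram_matrix(features)

# Hypothetical per-class (lower, upper) ranges recorded on training data.
class_ranges = {
    "dog_bark": ([[4.0, 1.0], [1.0, 0.0]], [[6.0, 3.0], [3.0, 2.0]]),
    "siren": ([[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
label = predict(gram, class_ranges)
```

In a full system the deviation would presumably be accumulated over all layers of the CNN, as the abstract states that all activations are analysed.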
Ken'ichiro Tanaka
We propose a method for generating nodes for kernel quadrature by a
point-wise gradient descent method. For kernel quadrature, most methods for
generating nodes are based on the worst case error of a quadrature formula in a
reproducing kernel Hilbert space corresponding to the kernel. In typical ones
among those methods, a new node is chosen among a candidate set of points in
each step by an optimization problem with respect to a new node. Although such
sequential methods are appropriate for adaptive quadrature, it is difficult to
apply standard routines for mathematical optimization to the problem. In this
paper, we propose a method that updates a set of points one by one with a
simple gradient descent method. To this end, we provide an upper bound of the
worst case error by using the fundamental solution of the Laplacian on
$\mathbf{R}^{d}$. We observe the good performance of the proposed method by
numerical experiments.
Authors' comments: 21 pages, 12 figures
Seyedehsara Nayer, Namrata Vaswani
We study the following lesser-known low rank (LR) recovery problem: recover
an $n \times q$ rank-$r$ matrix, $X^* =[x^*_1 , x^*_2, ..., x^*_q]$, with $r
\ll \min(n,q)$, from $m$ independent linear projections of each of its $q$
columns, i.e., from $y_k := A_k x^*_k , k \in [q]$, when $y_k$ is an $m$-length
vector with $m < n$. The matrices $A_k$ are known and mutually independent for
different $k$. We introduce a novel gradient descent (GD) based solution called
AltGD-Min. We show that, if the $A_k$s are i.i.d. with i.i.d. Gaussian entries,
and if the right singular vectors of $X^*$ satisfy the incoherence assumption,
then $\epsilon$-accurate recovery of $X^*$ is possible with order $(n+q) r^2
\log(1/\epsilon)$ total samples and order $ mq nr \log (1/\epsilon)$ time.
Compared with existing work, this is the fastest solution. For $\epsilon <
r^{1/4}$, it also has the best sample complexity. A simple extension of
AltGD-Min also provably solves LR Phase Retrieval, which is a magnitude-only
generalization of the above problem. AltGD-Min factorizes the unknown $X$ as $X
= UB$ where $U$ and $B$ are matrices with $r$ columns and rows respectively. It
alternates between a (projected) GD step for updating $U$, and a minimization
step for updating $B$. Each of its iterations is as fast as that of regular
projected GD because the minimization over $B$ decouples column-wise. At the
same time, we can prove exponential error decay for it, which we are unable to
do for projected GD. Finally, it can also be efficiently federated with a
communication cost of only $nr$ per node, instead of $nq$ for projected GD.
Authors' comments: To appear in IEEE Transactions on Information Theory (T-IT)
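The alternation described above can be written out as a short sketch. The squared-loss cost $f$, the step size $\eta$, and the use of a QR factorization as the projection onto matrices with orthonormal columns are assumptions made here for illustration; the abstract only states that a (projected) GD step on $U$ alternates with a minimization over $B$.

```latex
\begin{aligned}
f(U,B) &= \sum_{k=1}^{q} \lVert y_k - A_k U b_k \rVert_2^{2},
  \qquad B = [b_1, b_2, \ldots, b_q], \\
b_k &\leftarrow \arg\min_{b \in \mathbf{R}^{r}} \lVert y_k - A_k U b \rVert_2^{2}
  = (A_k U)^{\dagger} y_k, \qquad k \in [q], \\
\nabla_U f(U,B) &= 2 \sum_{k=1}^{q} A_k^{\top} \big(A_k U b_k - y_k\big)\, b_k^{\top}, \\
U &\leftarrow \mathrm{QR}\big(U - \eta\, \nabla_U f(U,B)\big).
\end{aligned}
```

Each $b_k$ depends only on $y_k$ and $A_k$, so the minimization over $B$ splits into $q$ independent least-squares problems in $r$ unknowns, which is the column-wise decoupling referred to above. In a federated setting each node can form its own terms of $\nabla_U f$, an $n \times r$ matrix, which is consistent with the stated communication cost of $nr$ per node.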
Hanshu Yan, Jingfeng Zhang, Gang Niu, Jiashi Feng, Vincent Y. F. Tan, Masashi Sugiyama
We investigate the adversarial robustness of CNNs from the perspective of channel-wise activations. By comparing \textit{non-robust} (normally trained) and \textit{robustified} (adversarially trained) models, we observe that adversarial training (AT) robustifies CNNs by aligning the channel-wise activations of adversarial data with those of their natural counterparts. However, the channels that are \textit{negatively-relevant} (NR) to predictions are still over-activated when processing adversarial data. We also observe that AT does not result in similar robustness for all classes. For the robust classes, channels with larger activation magnitudes are usually more \textit{positively-relevant} (PR) to predictions, but this alignment does not hold for the non-robust classes. Given these observations, we hypothesize that suppressing NR channels and aligning PR ones with their relevances further enhances the robustness of CNNs under AT. To examine this hypothesis, we introduce a novel mechanism, i.e., \underline{C}hannel-wise \underline{I}mportance-based \underline{F}eature \underline{S}election (CIFS). CIFS manipulates the activations of channels in certain layers by generating non-negative multipliers for these channels based on their relevances to predictions. Extensive experiments on benchmark datasets including CIFAR10 and SVHN clearly verify the hypothesis and the effectiveness of CIFS in robustifying CNNs. \url{https://github.com/HanshuYAN/CIFS}
Zhiqiang Wang, Qingyun She, Junlin Zhang
Click-Through Rate (CTR) estimation has become one of the most fundamental
tasks in many real-world applications, and it is important for ranking models to
effectively capture complex high-order features. A shallow feed-forward network
is widely used in many state-of-the-art DNN models such as FNN, DeepFM and
xDeepFM to implicitly capture high-order feature interactions. However, some
research has proved that additive feature interaction, particularly in
feed-forward neural networks, is inefficient at capturing common feature
interactions. To resolve this problem, we introduce a specific multiplicative
operation into the DNN ranking system by proposing an instance-guided mask,
which performs an element-wise product on both the feature embedding and
feed-forward layers, guided by the input instance. We also turn the
feed-forward layer in the DNN model into a mixture of additive and
multiplicative feature interactions by proposing MaskBlock in
this paper. MaskBlock combines layer normalization, the instance-guided mask,
and a feed-forward layer, and it is a basic building block for designing
new ranking models under various configurations. The model consisting of
MaskBlock is called MaskNet in this paper, and two new MaskNet models are
proposed to show the effectiveness of MaskBlock as a basic building block for
composing high-performance ranking systems. The experimental results on three
real-world datasets demonstrate that our proposed MaskNet models outperform
state-of-the-art models such as DeepFM and xDeepFM significantly, which implies
MaskBlock is an effective basic building unit for composing new high
performance ranking systems.
Authors' comments: In Proceedings of DLP-KDD 2021. ACM, Singapore. arXiv admin note: text
overlap with arXiv:2006.12753
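The multiplicative interaction described above can be sketched as a mask computed from the input instance and applied element-wise to the layer-normalized embedding. The two-layer form of the mask network (a linear layer, a ReLU, and a second linear layer), the omission of learnable layer-normalization parameters, and all names and weights below are assumptions for illustration, not the published configuration.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def linear(w, b, x):
    """Dense layer: w is a list of rows, b a bias vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def instance_guided_mask(x, w1, b1, w2, b2):
    """Mask generated from the input instance itself (assumed two-layer form)."""
    hidden = [max(h, 0.0) for h in linear(w1, b1, x)]  # ReLU
    return linear(w2, b2, hidden)

def masked_embedding(x, w1, b1, w2, b2):
    """Element-wise product of the mask with the layer-normalized embedding."""
    mask = instance_guided_mask(x, w1, b1, w2, b2)
    return [m * v for m, v in zip(mask, layer_norm(x))]

# Toy embedding of size 2 with hand-picked (hypothetical) weights.
x = [1.0, 3.0]
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 2 -> 3
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.5]]     # 3 -> 2
b2 = [0.0, 0.0]
mask = instance_guided_mask(x, w1, b1, w2, b2)
out = masked_embedding(x, w1, b1, w2, b2)
```

Because the mask is itself a function of the input, the product introduces multiplicative terms between input features, which a purely additive feed-forward layer has to approximate.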
Huu-Thiet Nguyen, Chien Chern Cheah, Kar-Ann Toh
Deep learning (DL) has achieved great success in many applications, but it
has been less well analyzed from the theoretical perspective. The unexplainable
success of black-box DL models has raised questions among scientists and
promoted the emergence of the field of explainable artificial intelligence
(XAI). In robotics, it is particularly important to deploy DL algorithms in a
predictable and stable manner as robots are active agents that need to interact
safely with the physical world. This paper presents an analytic deep learning
framework for fully connected neural networks, which can be applied for both
regression problems and classification problems. Examples for regression and
classification problems include online robot control and robot vision. We
present two layer-wise learning algorithms such that the convergence of the
learning systems can be analyzed. Firstly, an inverse layer-wise learning
algorithm for multilayer networks with convergence analysis for each layer is
presented to understand the problems of layer-wise deep learning. Secondly, a
forward progressive learning algorithm where the deep networks are built
progressively by using single hidden layer networks is developed to achieve
better accuracy. It is shown that the progressive learning method can be used
for fine-tuning of weights from a convergence point of view. The effectiveness of
the proposed framework is illustrated based on classical benchmark recognition
tasks using the MNIST and CIFAR-10 datasets and the results show a good balance
between performance and explainability. The proposed method is subsequently
applied for online learning of robot kinematics and experimental results on
kinematic control of a UR5e robot with an unknown model are presented.
Authors' comments: The paper has been published in Automatica
Kanchan Chowdhury, Ankita Sharma, Arun Deepak Chandrasekar
Increasing the batch size of a deep learning model is a challenging task. Although it might help in utilizing the full available system memory during the training phase of a model, it most often results in a significant loss of test accuracy. LARS solved this issue by introducing an adaptive learning rate for each layer of a deep learning model. However, there are doubts about how popular distributed machine learning systems such as SystemML or MLlib will perform with this optimizer. In this work, we apply the LARS optimizer to a deep learning model implemented using SystemML. We perform experiments with various batch sizes and compare the performance of the LARS optimizer with \textit{Stochastic Gradient Descent}. Our experimental results show that the LARS optimizer performs significantly better than Stochastic Gradient Descent for large batch sizes even with the distributed machine learning framework SystemML.
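The layer-wise adaptive learning rate mentioned above can be illustrated with a minimal sketch of one LARS update for a single layer. It is written in plain Python rather than SystemML, omits momentum, and uses hypothetical default hyperparameters; it is an illustration of the commonly stated LARS rule, not the implementation evaluated in the paper.

```python
import math

def l2_norm(v):
    """Euclidean norm of a flat list of numbers."""
    return math.sqrt(sum(x * x for x in v))

def lars_step(weights, grads, lr=0.1, trust=0.001, weight_decay=0.0):
    """One LARS update (no momentum) for the flattened weights of one layer.

    The local rate trust * ||w|| / (||g|| + weight_decay * ||w||) rescales the
    step so that its size is tied to the weight norm of this layer.
    """
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)
    denom = g_norm + weight_decay * w_norm
    local_lr = trust * w_norm / denom if w_norm > 0 and denom > 0 else 1.0
    return [w - lr * local_lr * (g + weight_decay * w)
            for w, g in zip(weights, grads)]

# The same layer with gradients that differ in scale by a factor of 1000.
small = lars_step([3.0, 4.0], [0.6, 0.8])
large = lars_step([3.0, 4.0], [600.0, 800.0])
```

Without weight decay both calls move the weights by the same amount, because the local rate cancels the gradient scale. This is the property that keeps individual layers from diverging when the global learning rate is raised for large batches.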
Long Chen, Junyu Dong, Huiyu Zhou
Underwater object detection is of great significance for various applications in underwater scenes. However, the class imbalance issue is still an unsolved bottleneck for current underwater object detection algorithms. It leads to large precision discrepancies among classes: the dominant classes with more training data achieve higher detection precision, while the minority classes with fewer training data achieve much lower detection precision. In this paper, we propose a novel class-wise style augmentation (CWSA) algorithm to generate a class-balanced underwater dataset, Balance18, from the public contest underwater dataset URPC2018. CWSA is a new kind of data augmentation technique which augments the training data for the minority classes by generating various colors, textures and contrasts for them. Compared with previous data augmentation algorithms such as flipping, cropping and rotation, CWSA is able to generate a class-balanced underwater dataset with diverse color distortions and haze effects.
Shihao Zhao, Xingjun Ma, Yisen Wang, James Bailey, Bo Li, Yu-Gang Jiang
Deep neural networks (DNNs) are increasingly deployed in different applications to achieve state-of-the-art performance. However, they are often applied as a black box with limited understanding of what knowledge the model has learned from the data. In this paper, we focus on image classification and propose a method to visualize and understand the class-wise knowledge (patterns) learned by DNNs under three different settings: natural, backdoor and adversarial. Unlike existing visualization methods, our method searches for a single predictive pattern in the pixel space to represent the knowledge learned by the model for each class. Based on the proposed method, we show that DNNs trained on natural (clean) data learn abstract shapes along with some texture, and backdoored models learn a suspicious pattern for the backdoored class. Interestingly, the phenomenon that DNNs can learn a single predictive pattern for each class indicates that DNNs can learn a backdoor even from clean data, and the pattern itself is a backdoor trigger. In the adversarial setting, we show that adversarially trained models tend to learn more simplified shape patterns. Our method can serve as a useful tool to better understand the knowledge learned by DNNs on different datasets under different settings.
Mateus Roder, Leandro A. Passos, Luiz Carlos Felix Ribeiro, Clayton Pereira, João Paulo Papa
With the advent of deep learning, the number of works proposing new methods or improving existing ones has grown exponentially in recent years. In this scenario, "very deep" models emerged, since they were expected to extract more intrinsic and abstract features while supporting better performance. However, such models suffer from the vanishing gradient problem, i.e., backpropagation values become too close to zero in their shallower layers, ultimately causing learning to stagnate. This issue was overcome in the context of convolutional neural networks by creating "shortcut connections" between layers, in a so-called deep residual learning framework. Nonetheless, a very popular deep learning technique called the Deep Belief Network still suffers from vanishing gradients when dealing with discriminative tasks. Therefore, this paper proposes the Residual Deep Belief Network, which reinforces information layer by layer to improve feature extraction and knowledge retention, which in turn support better discriminative performance. Experiments conducted over three public datasets demonstrate its robustness on the task of binary image classification.
Dan Xu, Xavier Alameda-Pineda, Wanli Ouyang, Elisa Ricci, Xiaogang Wang, Nicu Sebe
Multi-scale representations deeply learned via convolutional neural networks
have shown tremendous importance for various pixel-level prediction problems.
In this paper we present a novel approach that advances the state of the art on
pixel-level prediction in a fundamental aspect, i.e. structured multi-scale
feature learning and fusion. In contrast to previous works directly
considering multi-scale feature maps obtained from the inner layers of a
primary CNN architecture, and simply fusing the features with weighted
averaging or concatenation, we propose a probabilistic graph attention network
structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs)
model for learning and fusing multi-scale representations in a principled
manner. In order to further improve the learning capacity of the network
structure, we propose to exploit feature-dependent conditional kernels within
the deep probabilistic framework. Extensive experiments are conducted on four
publicly available datasets (i.e. BSDS500, NYUD-V2, KITTI, and Pascal-Context)
and on three challenging pixel-wise prediction problems involving both discrete
and continuous labels (i.e. monocular depth estimation, object contour
prediction, and semantic segmentation). Quantitative and qualitative results
demonstrate the effectiveness of the proposed latent AG-CRF model and the
overall probabilistic graph attention network with feature conditional kernels
for structured feature learning and pixel-wise prediction.
Authors' comments: Regular paper accepted at TPAMI 2020. arXiv admin note: text overlap
with arXiv:1801.00524
E. R. Ferris, A. W. Blain, R. J. Assef, N. A. Hatch, A. Kimball, M. Kim, A. Sajina, A. Silva et al.
We present near-IR photometry and spectroscopy of 30 extremely luminous radio
and mid-IR selected galaxies. With bolometric luminosities exceeding
$\sim10^{13}$ $\rm{L_{\odot}}$ and redshifts ranging from $z = 0.880-2.853$, we
use VLT instruments X-shooter and ISAAC to investigate this unique population
of galaxies. Broad multi-component emission lines are detected in 18 galaxies
and we measure the near-IR lines $\rm{H\rm{\beta}}$,
$\text{[OIII]}\rm{\lambda}\rm{\lambda}4959,5007$ and $\rm{H\rm{\alpha}}$ in
six, 15 and 13 galaxies respectively, with 10 $\rm{Ly\alpha}$ and five CIV
lines additionally detected in the UVB arm. We use the broad
$\text{[OIII]}\rm{\lambda}5007$ emission lines as a proxy for the bolometric
AGN luminosity, and derive lower limits to supermassive black hole masses of
$10^{7.9}$-$10^{9.4}$ $\text{M}_{\odot}$ with expectations of corresponding
host masses of $10^{10.4}$-$10^{12.0}$ $\text{M}_{\odot}$. We measure
$\rm{\lambda}_{Edd}$ > 1 for eight of these sources at a $2\sigma$
significance. Near-IR photometry and SED fitting are used to compare stellar
masses directly. We detect both Balmer lines in five galaxies and use these to
infer a mean visual extinction of $A_{V}$ = 2.68 mag. Due to non-detections and
uncertainties in our $\rm{H\rm{\beta}}$ emission line measurements, we simulate
a broad $\rm{H\rm{\beta}}$ line of FWHM = 1480 $\rm{kms^{-1}}$ to estimate
extinction for all sources with measured $\rm{H\rm{\alpha}}$ emission. We then
use this to infer a mean $A_{V}=3.62$ mag, demonstrating the highly-obscured
nature of these galaxies, with the consequence of increasing our estimates of
black-hole masses by 0.5 orders of magnitude in the most extreme and
obscured cases.
Authors' comments: Accepted for publication in MNRAS. 14 pages (+8 page appendix), 11
figures and 9 tables
Cheeun Hong, Heewon Kim, Sungyong Baik, Junghun Oh, Kyoung Mu Lee
Quantizing deep convolutional neural networks for image super-resolution
substantially reduces their computational costs. However, existing works either
suffer from a severe performance drop in ultra-low precision of 4 or lower
bit-widths, or require a heavy fine-tuning process to recover the performance.
To our knowledge, this vulnerability to low precision stems from two
statistical properties of feature map values. First, the distribution of feature
map values varies significantly per channel and per input image. Second,
feature maps have outliers that can dominate the quantization error. Based on
these observations, we propose a novel distribution-aware quantization scheme
(DAQ) which facilitates accurate training-free quantization in ultra-low
precision. A simple function of DAQ determines the dynamic range of feature maps
and weights with low computational burden. Furthermore, our method enables
mixed-precision quantization by calculating the relative sensitivity of each
channel, without any training process involved. Nonetheless, quantization-aware
training is also applicable for auxiliary performance gain. Our new method
outperforms recent training-free and even training-based quantization methods
on state-of-the-art image super-resolution networks in ultra-low precision.
Authors' comments: WACV 2022
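The two observations above (per-channel variation and outliers) suggest setting the quantization range from each channel's own statistics. The sketch below standardizes a channel by its mean and standard deviation, clips to a fixed number of standard deviations, quantizes uniformly, and maps back. It is an illustration in the spirit of distribution-aware quantization under these stated assumptions, not the DAQ function itself; the clipping width and names are hypothetical.

```python
import math

def quantize_channel(values, bits=4, clip=2.0):
    """Quantize one channel using its own mean and standard deviation.

    values: flattened activations of a single channel for a single input.
    clip:   half-width of the quantization range in standard deviations
            (a hypothetical choice that keeps outliers from setting the range).
    """
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        return list(values)  # constant channel: nothing to quantize
    levels = 2 ** bits - 1
    step = 2.0 * clip / levels
    out = []
    for v in values:
        z = (v - mean) / std
        z = max(-clip, min(clip, z))       # clip outliers
        q = round((z + clip) / step)       # integer code in 0..levels
        out.append((q * step - clip) * std + mean)
    return out

# Toy channel with one large outlier, quantized to 2 bits.
channel = [0.1, -0.2, 0.05, 0.3, -0.1, 50.0]
quantized = quantize_channel(channel, bits=2)
```

The statistics are computed from the feature map at inference time, so no training or calibration pass is needed in this sketch.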
S. Valère Bitseki Penda, Jean-François Delmas
Bifurcating Markov chains (BMC) are Markov chains indexed by a full binary
tree representing the evolution of a trait along a population where each
individual has two children. We provide a central limit theorem for general
additive functionals of BMC, and prove the existence of three regimes. This
corresponds to a competition between the reproducing rate (each individual has
two children) and the ergodicity rate for the evolution of the trait. This is
in contrast with the work of Guyon (2007), where the considered additive
functionals are sums of martingale increments, and only one regime appears. Our
result can be seen as a discrete time version, but with general trait
evolution, of results in the continuous-time setting of branching particle
systems from Adamczak and Mi\l{}o\'{s} (2015), where the evolution of the trait
is given by an Ornstein-Uhlenbeck process.
Authors' comments: 32