Weiye Zhao, Rui Chen, Yifan Sun, Tianhao Wei, Changliu Liu
Reinforcement Learning (RL) algorithms have shown tremendous success in
simulation environments, but their application to real-world problems faces
significant challenges, with safety being a major concern. In particular,
enforcing state-wise constraints is essential for many challenging tasks such
as autonomous driving and robot manipulation. However, existing safe RL
algorithms under the framework of Constrained Markov Decision Process (CMDP) do
not consider state-wise constraints. To address this gap, we propose State-wise
Constrained Policy Optimization (SCPO), the first general-purpose policy search
algorithm for state-wise constrained reinforcement learning. SCPO provides
guarantees for state-wise constraint satisfaction in expectation. In
particular, we introduce the framework of Maximum Markov Decision Process, and
prove that the worst-case safety violation is bounded under SCPO. We
demonstrate the effectiveness of our approach on training neural network
policies for extensive robot locomotion tasks, where the agent must satisfy a
variety of state-wise safety constraints. Our results show that SCPO
significantly outperforms existing methods and can handle state-wise
constraints in high-dimensional robotics tasks.
Authors' comments: arXiv admin note: text overlap with arXiv:2305.13681
Guanchu Wang, Ninghao Liu, Daochen Zha, Xia Hu
Anomaly detection, which identifies data instances whose feature patterns differ from the majority, plays a fundamental role in various applications. However, existing methods struggle in scenarios where the instances are systems whose characteristics are not readily observed as data. Appropriate interactions with the systems are needed to identify those with abnormal responses. Detecting system-wise anomalies is challenging for several reasons: how to formally define the system-wise anomaly detection problem; how to find effective activation signals for interacting with systems to progressively collect data and learn the detector; and how to guarantee stable training in such a non-stationary scenario with real-time interactions. To address these challenges, we propose InterSAD (Interactive System-wise Anomaly Detection). Specifically, we first adopt a Markov decision process to model the interactive systems, and define anomalous systems as anomalous transition and anomalous reward systems. Then, we develop an end-to-end approach that includes an encoder-decoder module that learns system embeddings, and a policy network that generates effective activation for separating the embeddings of normal and anomalous systems. Finally, we design a training method to stabilize the learning process, which includes a replay buffer that stores historical interaction data and allows them to be re-sampled. Experiments on two benchmark environments, including identifying anomalous robotic systems and detecting user data poisoning in recommendation models, demonstrate the superiority of InterSAD over state-of-the-art baseline methods.
Xu Ma, Yuqian Zhou, Xingqian Xu, Bin Sun, Valerii Filev, Nikita Orlov, Yun Fu, Humphrey Shi
Image rasterization is a mature technique in computer graphics, while image
vectorization, the reverse path of rasterization, remains a major challenge.
Recent advanced deep learning-based models achieve vectorization and semantic
interpolation of vector graphs and demonstrate a better topology of generating
new figures. However, deep models cannot be easily generalized to out-of-domain
testing data. The generated SVGs also contain complex and redundant shapes that
are inconvenient for further editing. Specifically, the crucial
layer-wise topology and fundamental semantics in images are still not well
understood and thus not fully explored. In this work, we propose Layer-wise
Image Vectorization, namely LIVE, to convert raster images to SVGs while
preserving image topology. LIVE can generate compact SVG forms
with layer-wise structures that are semantically consistent with the human
perspective. We progressively add new Bézier paths and optimize these paths
with the layer-wise framework, newly designed loss functions, and a
component-wise path initialization technique. Our experiments demonstrate that
LIVE presents more plausible vectorized forms than prior works and can be
generalized to new images. With the help of this newly learned topology, LIVE
initiates human editable SVGs for both designers and other downstream
applications. Code is available at
https://github.com/Picsart-AI-Research/LIVE-Layerwise-Image-Vectorization.
Authors' comments: Accepted as Oral Presentation at CVPR 2022
Hyeokjun Kweon, Hyeonseong Kim, Yoonsu Kang, Youngho Yoon, Wooseong Jeong, Kuk-Jin Yoon
Image stitching aims at stitching the images taken from different viewpoints into an image with a wider field of view. Existing methods warp the target image to the reference image using the estimated warp function, and a homography is one of the most commonly used warping functions. However, when images have large parallax due to non-planar scenes and translational motion of a camera, the homography cannot fully describe the mapping between two images. Existing approaches based on global or local homography estimation are not free from this problem and suffer from undesired artifacts due to parallax. In this paper, instead of relying on the homography-based warp, we propose a novel deep image stitching framework exploiting the pixel-wise warp field to handle the large-parallax problem. The proposed deep image stitching framework consists of two modules: Pixel-wise Warping Module (PWM) and Stitched Image Generating Module (SIGMo). PWM employs an optical flow estimation model to obtain pixel-wise warp of the whole image, and relocates the pixels of the target image with the obtained warp field. SIGMo blends the warped target image and the reference image while eliminating unwanted artifacts such as misalignments, seams, and holes that harm the plausibility of the stitched result. For training and evaluating the proposed framework, we build a large-scale dataset that includes image pairs with corresponding pixel-wise ground truth warp and sample stitched result images. We show that the results of the proposed framework are qualitatively superior to those of the conventional methods, especially when the images have large parallax. The code and the proposed dataset will be publicly available soon.
Firas Laakom, Jenni Raitoharju, Jarno Nikkanen, Alexandros Iosifidis, Moncef Gabbouj
Recently, Convolutional Neural Networks (CNNs) have been widely used to solve
the illuminant estimation problem and have often led to state-of-the-art
results. Standard approaches operate directly on the input image. In this
paper, we argue that this problem can be decomposed into three channel-wise
independent and symmetric sub-problems and propose a novel CNN-based
illumination estimation approach based on this decomposition. The proposed
method substantially reduces the number of parameters needed to solve the task
while achieving competitive experimental results compared to state-of-the-art
methods. Furthermore, the practical application of illumination estimation
techniques typically requires identifying the extreme error cases. This can be
achieved using an uncertainty estimation technique. In this work, we propose a
novel color constancy uncertainty estimation approach that augments the trained
model with an auxiliary branch which learns to predict the error based on the
feature representation. Intuitively, the model learns which feature
combinations are robust and are thus likely to yield low errors and which
combinations result in erroneous estimates. We test this approach on the
proposed method and show that it can indeed be used to avoid several extreme
error cases and, thus, improves the practicality of the proposed technique.
Authors' comments: 6 pages, 4 figures
József Balogh, Ce Chen, Kevin Hendrey, Ben Lund, Haoran Luo, Casey Tompkins, Tuan Tran
A family $\mathcal{F}$ on ground set $[n]:=\{1,2,\ldots, n\}$ is maximal
$k$-wise intersecting if every collection of at most $k$ sets in $\mathcal{F}$
has non-empty intersection, and no other set can be added to $\mathcal{F}$
while maintaining this property. In 1974, Erd\H{o}s and Kleitman asked for the
minimum size of a maximal $k$-wise intersecting family. We answer their
question for $k=3$ and sufficiently large $n$. We show that the unique minimum
family is obtained by partitioning the ground set $[n]$ into two sets $A$ and
$B$ with almost equal sizes and taking the family consisting of all the proper
supersets of $A$ and of $B$.
Authors' comments: 17 pages. This is a combination of the results from arXiv:2110.12708
(version 1) and the results from arXiv:2206.09334, which settled the even and
odd case of the problem, respectively
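The construction described in this abstract can be checked by brute force on a small ground set. The sketch below is illustrative only: the function names are hypothetical, and the choice n = 5 with A = {1, 2} and B = {3, 4, 5} is an assumption made for the example. It verifies that the family is maximal 3-wise intersecting; it does not verify minimality, which the paper establishes for sufficiently large n.

```python
from itertools import combinations


def powerset(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def build_family(a, b):
    """All proper supersets of A and all proper supersets of B inside A | B."""
    ground = a | b
    # For frozensets, `a < s` means "a is a proper subset of s".
    return {s for s in powerset(ground) if a < s or b < s}


def is_k_wise_intersecting(family, k):
    """True if every collection of at most k sets has a non-empty intersection."""
    sets = list(family)
    for r in range(1, k + 1):
        for combo in combinations(sets, r):
            if not frozenset.intersection(*combo):
                return False
    return True


def is_maximal(family, ground, k):
    """True if no further subset of `ground` can be added without breaking the property."""
    for candidate in powerset(ground):
        if candidate in family:
            continue
        if is_k_wise_intersecting(family | {candidate}, k):
            return False
    return True


A = frozenset({1, 2})
B = frozenset({3, 4, 5})
family = build_family(A, B)
```

For this choice of A and B, `is_k_wise_intersecting(family, 3)` and `is_maximal(family, A | B, 3)` both hold, while the family is not 4-wise intersecting: two supersets of A that meet in exactly A and two supersets of B that meet in exactly B have an empty common intersection.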
Pablo Cubides Kovacsics, Immanuel Halupczok
We introduce two new notions of stratifications in valued fields:
t$^2$-stratifications and arc-wise analytic t-stratifications. We show the
existence of arc-wise analytic t-stratifications in algebraically closed valued
fields with analytic structure in the sense of R. Cluckers and L. Lipshitz. We
prove that arc-wise analytic t-stratifications are t$^2$-stratifications and,
moreover, that t$^2$-stratifications are valuative Lipschitz stratifications as
defined by the second author and Y. Yin (the latter ones being closely related
to Lipschitz stratifications in the sense of Mostowski). Finally, we introduce
a combinatorial invariant associated to a t-stratification which we call the
critical value function. We explain how the critical value function of arc-wise
analytic t-stratifications can be used to formulate programmatic conjectural
bounds for the Nash-Semple conjecture.
Authors' comments: Acknowledgements added. Comments are welcome!
Fabrizio Pugliese, Giovanni Sparano, Luca Vitagliano
We define a new notion of fiber-wise linear differential operator on the
total space of a vector bundle $E$. Our main result is that fiber-wise linear
differential operators on $E$ are equivalent to (polynomial) derivations of an
appropriate line bundle over $E^\ast$. We believe this might represent a first
step towards a definition of multiplicative (resp. infinitesimally
multiplicative) differential operators on a Lie groupoid (resp. a Lie
algebroid). We also discuss the linearization of a differential operator around
a submanifold.
Authors' comments: 26 pages, comments welcome!
Ahmadreza Moradipari, Christos Thrampoulidis, Mahnoosh Alizadeh
We study stage-wise conservative linear stochastic bandits: an instance of
bandit optimization, which accounts for (unknown) safety constraints that
appear in applications such as online advertising and medical trials. At each
stage, the learner must choose actions that not only maximize cumulative reward
across the entire time horizon but further satisfy a linear baseline constraint
that takes the form of a lower bound on the instantaneous reward. For this
problem, we present two novel algorithms, stage-wise conservative linear
Thompson Sampling (SCLTS) and stage-wise conservative linear UCB (SCLUCB), that
respect the baseline constraints and enjoy probabilistic regret bounds of order
O(\sqrt{T} \log^{3/2}T) and O(\sqrt{T} \log T), respectively. Notably, the
proposed algorithms can be adjusted with only minor modifications to tackle
different problem variations, such as constraints with bandit-feedback, or an
unknown sequence of baseline actions. We discuss these and other improvements
over the state-of-the-art. For instance, compared to existing solutions, we
show that SCLTS plays the (non-optimal) baseline action at most O(\log{T})
times (compared to O(\sqrt{T})). Finally, we make connections to another
studied form of safety constraints that takes the form of an upper bound on the
instantaneous reward. While this incurs additional complexity to the learning
process as the optimal action is not guaranteed to belong to the safe set at
each round, we show that SCLUCB can properly adjust in this setting via a
simple modification.
Authors' comments: 28 pages, 5 figures
Artur Jordao, Fernando Akio, Maiko Lie, William Robson Schwartz
Modern convolutional networks such as ResNet and NASNet have achieved
state-of-the-art results in many computer vision applications. These
architectures consist of stages, which are sets of layers that operate on
representations in the same resolution. It has been demonstrated that
increasing the number of layers in each stage improves the prediction ability
of the network. However, the resulting architecture becomes computationally
expensive in terms of floating point operations, memory requirements and
inference time. Thus, significant human effort is necessary to evaluate
different trade-offs between depth and performance. To handle this problem,
recent works have proposed to automatically design high-performance
architectures, mainly by means of neural architecture search (NAS). Current NAS
strategies analyze a large set of possible candidate architectures and, hence,
require vast computational resources and take many GPU days. Motivated by
this, we propose a NAS approach to efficiently design accurate and low-cost
convolutional architectures and demonstrate that an efficient strategy for
designing these architectures is to learn the depth stage-by-stage. For this
purpose, our approach increases depth incrementally in each stage taking into
account its importance, such that stages with low importance are kept shallow
while stages with high importance become deeper. We conduct experiments on the
CIFAR and different versions of ImageNet datasets, where we show that
architectures discovered by our approach achieve better accuracy and efficiency
than human-designed architectures. Additionally, we show that architectures
discovered on CIFAR-10 can be successfully transferred to large datasets.
Compared to previous NAS approaches, our method is substantially more
efficient, as it evaluates one order of magnitude fewer models and yields
architectures on par with the state-of-the-art.
Authors' comments: Accepted for publication at International Conference on Pattern
Recognition (ICPR) 2020
Xiaoyang Guo, Kai Yang, Wukui Yang, Xiaogang Wang, Hongsheng Li
Stereo matching estimates the disparity between a rectified image pair, which
is of great importance to depth sensing, autonomous driving, and other related
tasks. Previous works built cost volumes with cross-correlation or
concatenation of left and right features across all disparity levels, and then
a 2D or 3D convolutional neural network is utilized to regress the disparity
maps. In this paper, we propose to construct the cost volume by group-wise
correlation. The left features and the right features are divided into groups
along the channel dimension, and correlation maps are computed among each group
to obtain multiple matching cost proposals, which are then packed into a cost
volume. Group-wise correlation provides efficient representations for measuring
feature similarities and loses less information than full
correlation. It also retains better performance than previous methods when the
number of parameters is reduced. The 3D stacked hourglass network proposed in
previous works is improved to boost the performance and decrease the inference
computational cost. Experimental results show that our method outperforms
previous methods on Scene Flow, KITTI 2012, and KITTI 2015 datasets. The code
is available at https://github.com/xy-guo/GwcNet
Authors' comments: accepted to CVPR 2019
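The group-wise correlation step described in this abstract can be sketched on a single image row. This is a minimal illustration, not the released GwcNet code: the function names are hypothetical, the features are plain Python lists indexed as `features[channel][x]`, and two details are assumptions not stated in the abstract, namely that each group's correlation is averaged over its channels and that positions with no valid match are filled with zero.

```python
def groupwise_correlation(left, right, num_groups, disparity):
    """Correlate left and right features group by group at one disparity.

    left, right: lists of C channel rows, each a list of W floats.
    Returns num_groups rows of W correlation values.
    """
    channels = len(left)
    width = len(left[0])
    if channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    per_group = channels // num_groups
    volume = []
    for g in range(num_groups):
        row = []
        for x in range(width):
            xr = x - disparity  # matching column in the right image
            if xr < 0:
                row.append(0.0)  # no valid match: zero fill (assumption)
                continue
            total = sum(
                left[c][x] * right[c][xr]
                for c in range(g * per_group, (g + 1) * per_group)
            )
            row.append(total / per_group)  # mean over the group (assumption)
        volume.append(row)
    return volume


def build_cost_volume(left, right, num_groups, max_disparity):
    """Stack the per-disparity group correlations: [disparity][group][x]."""
    return [
        groupwise_correlation(left, right, num_groups, d)
        for d in range(max_disparity)
    ]
```

With `num_groups` equal to 1 this reduces to a single full correlation map per disparity, and with `num_groups` equal to the channel count every channel keeps its own map, which is why the grouped form sits between the two extremes in how much information it retains.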
A. M. Meisner, D. Lang, D. J. Schlegel
We have used the first ~3 years of 3.4 micron (W1) and 4.6 micron (W2)
observations from the WISE and NEOWISE missions to create a full-sky set of
time-resolved coadds. As a result of the WISE survey strategy, a typical sky
location is visited every six months and is observed during 12 or more
exposures per visit, with these exposures spanning a ~1 day time interval. We
have stacked the exposures within such ~1 day intervals to produce one coadd
per band per visit -- that is, one coadd every six months at a given position
on the sky in each of W1 and W2. For most parts of the sky we have generated
six epochal coadds per band, with one visit during the fully cryogenic WISE
mission, one visit during NEOWISE, and then, after a 33 month gap, four more
visits during the NEOWISE-Reactivation mission phase. These coadds are suitable
for studying long-timescale mid-infrared variability and measuring motions to
~1.3 magnitudes fainter than the single-exposure detection limit. In most sky
regions, our coadds span a 5.5 year time period and therefore provide a >10x
enhancement in time baseline relative to that available for the AllWISE
catalog's apparent motion measurements. As such, the signature application of
these new coadds is expected to be motion-based identification of relatively
faint brown dwarfs, especially those cold enough to remain undetected by Gaia.
Authors' comments: minor edits based on referee report; reference WiseView visualization
tool
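The quoted depth gain of ~1.3 magnitudes is consistent with a simple back-of-the-envelope estimate. The sketch below assumes, as is standard for background-limited imaging but not stated in the abstract, that stacking N equal-noise exposures improves the signal-to-noise ratio by a factor of sqrt(N); the function name is hypothetical.

```python
import math


def coadd_depth_gain_mag(num_exposures):
    """Depth gain in magnitudes from stacking exposures.

    Assumes S/N improves as sqrt(num_exposures); a flux ratio f
    corresponds to 2.5 * log10(f) magnitudes.
    """
    return 2.5 * math.log10(math.sqrt(num_exposures))


# 12 exposures per visit is the typical minimum quoted in the abstract.
gain_for_typical_visit = coadd_depth_gain_mag(12)
```

Under this assumption, 12 exposures give a gain of about 1.35 magnitudes, in line with the ~1.3 magnitudes stated above.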
Xin Yuan, Gang Huang, Hong Jiang, Paul Wilford
The existing lensless compressive camera
($\text{L}^2\text{C}^2$)~\cite{Huang13ICIP} suffers from low capture rates,
resulting in low resolution images when acquired over a short time. In this
work, we propose a new regime to mitigate these drawbacks. We replace the
global-based compressive sensing used in the existing $\text{L}^2\text{C}^2$ by
the local block (patch) based compressive sensing. We use a single sensor for
each block, rather than for the entire image, thus forming a multiple but
spatially parallel sensor $\text{L}^2\text{C}^2$. This new camera retains the
advantages of existing $\text{L}^2\text{C}^2$ while leading to the following
additional benefits: 1) Since each block can be very small, e.g., $8\times 8$
pixels, we only need to capture $\sim 10$ measurements to achieve reasonable
reconstruction. Therefore the capture time can be reduced significantly. 2) The
coding patterns used in each block can be the same, therefore the sensing
matrix is only of the block size compared to the entire image size in existing
$\text{L}^2\text{C}^2$. This saves the memory requirement of the sensing matrix
as well as speeds up the reconstruction. 3) Patch based image reconstruction is
fast and since real time stitching algorithms exist, we can perform real time
reconstruction. 4) These small blocks can be integrated to any desirable
number, leading to ultra high resolution images while retaining fast capture
rate and fast reconstruction. We develop multiple geometries of this block-wise
$\text{L}^2\text{C}^2$ in this paper. We have built prototypes of the proposed
block-wise $\text{L}^2\text{C}^2$ and demonstrated excellent results on real
data.
Authors' comments: 5 pages, 10 figures
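The memory saving claimed in benefit 2) can be made concrete with a small calculation. The block size (8 x 8 pixels) and the roughly 10 measurements per block come from the abstract; the 512 x 512 image size, the dense-matrix storage model, and the function names are assumptions chosen for illustration, as is the premise that a global design would use the same measurement-to-pixel ratio.

```python
# Values taken from the abstract: 8x8-pixel blocks, about 10 measurements each.
BLOCK_SIDE = 8
BLOCK_MEASUREMENTS = 10


def sensing_matrix_entries(num_pixels, num_measurements):
    """Entries in a dense (num_measurements x num_pixels) sensing matrix."""
    return num_pixels * num_measurements


def compare_global_vs_block(image_side):
    """Compare one global sensing matrix with one shared block matrix.

    Assumes a square image whose side is a multiple of BLOCK_SIDE and the
    same measurement-to-pixel ratio in both designs (illustrative only).
    """
    if image_side % BLOCK_SIDE != 0:
        raise ValueError("image_side must be a multiple of BLOCK_SIDE")
    block_pixels = BLOCK_SIDE ** 2
    num_blocks = (image_side // BLOCK_SIDE) ** 2
    image_pixels = image_side ** 2
    global_measurements = num_blocks * BLOCK_MEASUREMENTS
    return {
        "num_blocks": num_blocks,
        "block_matrix_entries": sensing_matrix_entries(
            block_pixels, BLOCK_MEASUREMENTS
        ),
        "global_matrix_entries": sensing_matrix_entries(
            image_pixels, global_measurements
        ),
    }


summary = compare_global_vs_block(512)
```

For the assumed 512 x 512 image, the shared block matrix has 640 entries while a global matrix at the same ratio would have about 1.07e10, and because every block has its own sensor, the 10 measurements per block can be captured in parallel rather than as 40,960 sequential captures.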
Eiichi Matsuhashi, Vesko Valov
We introduce the notion of set-wise injective maps and provide results about
fiber embeddings. Our results improve some previous results in this area.
Authors' comments: 11 pages
Clintin Davis-Stober, David Budescu, Jason Dana, Stephen Broomell
Numerous studies and anecdotes demonstrate the "wisdom of the crowd," the surprising accuracy of a group's aggregated judgments. Less is known, however, about the generality of crowd wisdom. For example, are crowds wise even if their members have systematic judgmental biases, or can influence each other before members render their judgments? If so, are there situations in which we can expect a crowd to be less accurate than skilled individuals? We provide a precise but general definition of crowd wisdom: A crowd is wise if a linear aggregate, for example a mean, of its members' judgments is closer to the target value than a randomly, but not necessarily uniformly, sampled member of the crowd. Building on this definition, we develop a theoretical framework for examining, a priori, when and to what degree a crowd will be wise. We systematically investigate the boundary conditions for crowd wisdom within this framework and determine conditions under which the accuracy advantage for crowds is maximized. Our results demonstrate that crowd wisdom is highly robust: Even if judgments are biased and correlated, one would need to nearly deterministically select only a highly skilled judge before an individual's judgment could be expected to be more accurate than a simple averaging of the crowd. Our results also provide an accuracy rationale behind the need for diversity of judgments among group members. Contrary to folk explanations of crowd wisdom which hold that judgments should ideally be independent so that errors cancel out, we find that crowd wisdom is maximized when judgments systematically differ as much as possible. We re-analyze data from two published studies that confirm our theoretical results.
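The definition of crowd wisdom given in this abstract can be turned into a small check. The sketch below is one possible reading, not the authors' formal framework: it assumes squared error as the measure of closeness, compares the aggregate's error with the expected error of a member drawn under the given sampling probabilities, and uses hypothetical function and parameter names.

```python
def crowd_is_wise(judgments, target, sampling_probs=None, weights=None):
    """Return True if the linear aggregate beats a randomly sampled member.

    judgments: list of individual estimates.
    sampling_probs: probability of drawing each member (default: uniform).
    weights: linear aggregation weights (default: the simple mean).
    Closeness is measured by squared error (an assumption).
    """
    n = len(judgments)
    if sampling_probs is None:
        sampling_probs = [1.0 / n] * n
    if weights is None:
        weights = [1.0 / n] * n
    aggregate = sum(w * j for w, j in zip(weights, judgments))
    aggregate_loss = (aggregate - target) ** 2
    expected_member_loss = sum(
        p * (j - target) ** 2 for p, j in zip(sampling_probs, judgments)
    )
    return aggregate_loss < expected_member_loss
```

For example, with judgments `[10.1, 20, 30]` and target 10, the mean beats a uniformly sampled member, but not a member drawn with probabilities `[0.98, 0.01, 0.01]`. This mirrors the abstract's point that one must select a highly skilled judge almost deterministically before an individual outperforms the average.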
Robert Nikutta, Maia Nenkova, Zeljko Ivezic, Nicholas Hunt-Walker, Moshe Elitzur
The Wide-field Infrared Survey Explorer (WISE) has scanned the entire sky
with unprecedented sensitivity in four infrared bands, at 3.4, 4.6, 12, and 22
micron. The WISE Point Source Catalog contains more than 560 million objects,
among them hundreds of thousands of galaxies with Active Nuclei (AGN). While
type 1 AGN, owing to their bright and unobscured nature, are easy to detect and
constitute a rather complete and unbiased sample, their type 2 counterparts,
postulated by AGN unification, are not as straightforward to identify. Matching
the WISE catalog with known QSOs in the Sloan Digital Sky Survey we confirm
previous identification of the type 1 locus in the WISE color space. Using a
very large database of the popular CLUMPY torus models, we find the colors of
the putative type 2 counterparts, and also, for the first time, predict their
number vs. flux relation that can be expected to be observed in any given WISE
color range. This will allow us to put statistically very significant
constraints on the torus parameters. Our results are a successful test of the
AGN unification scheme.
Authors' comments: 4 pages, 1 figure, presented at the IAU Symposium #304
A. Evans, R. D. Gehrz, C. E. Woodward, L. A. Helton
We present the result of trawling through the WISE archive for data on
classical and recurrent novae. The data show a variety of spectral energy
distributions, including stellar photospheres, dust and probable line emission.
During the mission WISE also detected some novae which erupted subsequent to
the survey, providing information about the progenitor systems.
Authors' comments: To appear in proceedings of "Stella Novae: Future and Past Decades"
Youngjin Bae
In this article we introduce the notion of a magnetic leaf-wise intersection
point which is a generalization of the leaf-wise intersection point with
magnetic effects. We also prove the existence of magnetic leaf-wise
intersection points under certain topological assumptions.
Authors' comments: 43 pages
Pietro Asinari, Taku Ohwada, Eliodoro Chiavazzo, Antonio Fabio Di Rienzo
The Artificial Compressibility Method (ACM) for the incompressible
Navier-Stokes equations is (link-wise) reformulated (referred to as LW-ACM) by
a finite set of discrete directions (links) on a regular Cartesian mesh, in
analogy with the Lattice Boltzmann Method (LBM). The main advantage is the
possibility of exploiting well established technologies originally developed
for LBM and classical computational fluid dynamics, with special emphasis on
finite differences (at least in the present paper), at the cost of minor
changes. For instance, wall boundaries not aligned with the background
Cartesian mesh can be taken into account by tracing the intersections of each
link with the wall (analogously to LBM technology). LW-ACM requires no
high-order moments beyond hydrodynamics (often referred to as ghost moments)
and no kinetic expansion. Like finite difference schemes, only standard Taylor
expansion is needed for analyzing consistency. Preliminary efforts towards
optimal implementations have shown that LW-ACM achieves a
computational speed similar to that of optimized (BGK-) LBM. In addition, the
memory demand is significantly smaller than that of (BGK-) LBM. Importantly, with an efficient
implementation, this algorithm may be one of the few that are compute-bound and
not memory-bound. Two- and three-dimensional benchmarks are investigated, and
an extensive comparative study between the present approach and state-of-the-art
methods from the literature is carried out. Numerical evidence suggests
that LW-ACM represents an excellent alternative in terms of simplicity,
stability and accuracy.
Authors' comments: 62 pages, 20 figures
Noga Alon, Asaf Nussboim
We study the k-wise independent relaxation of the usual model G(N,p) of
random graphs where, as in this model, N labeled vertices are fixed and each
edge is drawn with probability p; however, it is only required that the
distribution of any subset of k edges is independent. This relaxation can be
relevant in modeling phenomena where only k-wise independence is assumed to
hold, and is also useful when the relevant graphs are so huge that handling
G(N,p) graphs becomes infeasible, and cheaper random-looking distributions
(such as k-wise independent ones) must be used instead. Unfortunately, many
well-known properties of random graphs in G(N,p) are global, and it is thus not
clear if they are guaranteed to hold in the k-wise independent case. We explore
the properties of k-wise independent graphs by providing upper-bounds and
lower-bounds on the amount of independence, k, required for maintaining the
main properties of G(N,p) graphs: connectivity, Hamiltonicity, the
connectivity-number, clique-number and chromatic-number and the appearance of
fixed subgraphs. Most of these properties are shown to be captured by either
constant k or by some k= poly(log(N)) for a wide range of values of p, implying
that random looking graphs on N vertices can be generated by a seed of size
poly(log(N)). The proofs combine combinatorial, probabilistic and spectral
techniques.
Authors' comments: 23 pages
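To illustrate how a short seed can yield many bits with limited independence, the sketch below uses a standard textbook construction rather than anything taken from the paper: each output bit is the XOR of a nonempty subset of m uniform seed bits, giving 2^m - 1 bits that are pairwise (2-wise) independent with p = 1/2 but not 3-wise independent. The function names are hypothetical, and the independence check is done exhaustively over all seeds.

```python
from itertools import combinations, product


def derived_bits(seed):
    """One output bit per nonempty subset of seed positions (XOR of that subset)."""
    m = len(seed)
    bits = []
    for mask in range(1, 2 ** m):
        bit = 0
        for i in range(m):
            if (mask >> i) & 1:
                bit ^= seed[i]
        bits.append(bit)
    return bits


def is_k_wise_independent(m, k):
    """Check that every k output bits are jointly uniform over a uniform m-bit seed."""
    outputs = [derived_bits(seed) for seed in product((0, 1), repeat=m)]
    num_bits = 2 ** m - 1
    for idx in combinations(range(num_bits), k):
        counts = {}
        for out in outputs:
            key = tuple(out[i] for i in idx)
            counts[key] = counts.get(key, 0) + 1
        # Joint uniformity: all 2^k patterns occur, each equally often.
        if len(counts) != 2 ** k or len(set(counts.values())) != 1:
            return False
    return True
```

If each output bit is read as the indicator of one edge, an m-bit seed covers 2^m - 1 edges, so a graph on N vertices needs a seed of only about 2 log2(N) bits for pairwise independence at p = 1/2. This small case is in the spirit of the poly(log(N)) seed sizes discussed in the abstract, although the paper treats general k and p.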