Chengxin Chen, Pengyuan Zhang
Previous research has looked into ways to improve speech emotion recognition
(SER) by utilizing both acoustic and linguistic cues of speech. However, the
potential association between state-of-the-art ASR models and the SER task has
yet to be investigated. In this paper, we propose a novel channel and
temporal-wise attention RNN (CTA-RNN) architecture based on the intermediate
representations of pre-trained ASR models. Specifically, the embeddings of a
large-scale pre-trained end-to-end ASR encoder contain both acoustic and
linguistic information, as well as the ability to generalize to different
speakers, making them well suited for the downstream SER task. To further exploit
the embeddings from different layers of the ASR encoder, we propose a novel
CTA-RNN architecture to capture the emotionally salient parts of embeddings in
both the channel and temporal directions. We evaluate our approach on two
popular benchmark datasets, IEMOCAP and MSP-IMPROV, using both within-corpus
and cross-corpus settings. Experimental results show that our proposed method
can achieve excellent performance in terms of accuracy and robustness.
Authors' comments: 5 pages, 2 figures, submitted to INTERSPEECH 2022
Ehsan Kamalloo, Mehdi Rezagholizadeh, Ali Ghodsi
Data Augmentation (DA) is known to improve the generalizability of deep
neural networks. Most existing DA techniques naively add a certain number of
augmented samples without considering the quality and the added computational
cost of these samples. To tackle this problem, a common strategy, adopted by
several state-of-the-art DA methods, is to adaptively generate or re-weight
augmented samples with respect to the task objective during training. However,
these adaptive DA methods: (1) are computationally expensive and not
sample-efficient, and (2) are designed merely for a specific setting. In this
work, we present a universal DA technique, called Glitter, to overcome both
issues. Glitter can be plugged into any DA method, making training
sample-efficient without sacrificing performance. From a pre-generated pool of
augmented samples, Glitter adaptively selects a subset of worst-case samples
with maximal loss, analogous to adversarial DA. Without altering the training
strategy, the task objective can be optimized on the selected subset. Our
thorough experiments on the GLUE benchmark, SQuAD, and HellaSwag in three
widely used training setups including consistency training, self-distillation
and knowledge distillation reveal that Glitter is substantially faster to train
and achieves a competitive performance, compared to strong baselines.
Authors' comments: ACL 2022 Findings
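The selection step described in this abstract (picking the maximal-loss subset from a pre-generated pool) might be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function name `select_worst_case`, the plain list of per-sample losses, and the subset size `k` are all assumptions made for the example.

```python
from typing import List, Sequence


def select_worst_case(losses: Sequence[float], k: int) -> List[int]:
    """Return indices of the k augmented samples with the largest loss.

    Hypothetical helper: `losses` holds one loss value per sample in a
    pre-generated augmentation pool, evaluated with the current model.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Rank pool indices by descending loss; ties keep their original order.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    pool_losses = [0.2, 1.5, 0.7, 1.1]  # made-up values for illustration
    print(select_worst_case(pool_losses, 2))
```

The task objective would then be optimized only on the returned subset, so the cost per step scales with `k` rather than with the size of the pool.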
Ilias Chalkidis, Anders Søgaard
In document classification for, e.g., legal and biomedical text, we often
deal with hundreds of classes, including very infrequent ones, as well as
temporal concept drift caused by the influence of real world events, e.g.,
policy changes, conflicts, or pandemics. Class imbalance and drift can
sometimes be mitigated by resampling the training data to simulate (or
compensate for) a known target distribution, but what if the target
distribution is determined by unknown future events? Instead of simply
resampling uniformly to hedge our bets, we focus on the underlying optimization
algorithms used to train such document classifiers and evaluate several
group-robust optimization algorithms, initially proposed to mitigate
group-level disparities. Reframing group-robust algorithms as adaptation
algorithms under concept drift, we find that Invariant Risk Minimization and
Spectral Decoupling outperform sampling-based approaches to class imbalance and
concept drift, and lead to much better performance on minority classes. The
effect is more pronounced the larger the label set.
Authors' comments: 9 pages, long paper at ACL 2022 Findings
Shivani Kumar, Atharva Kulkarni, Md Shad Akhtar, Tanmoy Chakraborty
Indirect speech such as sarcasm achieves a constellation of discourse goals
in human communication. While the indirectness of figurative language warrants
speakers to achieve certain pragmatic goals, it is challenging for AI agents to
comprehend such idiosyncrasies of human communication. Though sarcasm
identification has been a well-explored topic in dialogue analysis, for
conversational systems to truly grasp a conversation's innate meaning and
generate appropriate responses, simply detecting sarcasm is not enough; it is
vital to explain its underlying sarcastic connotation to capture its true
essence. In this work, we study the discourse structure of sarcastic
conversations and propose a novel task - Sarcasm Explanation in Dialogue (SED).
Set in a multimodal and code-mixed setting, the task aims to generate natural
language explanations of satirical conversations. To this end, we curate WITS,
a new dataset to support our task. We propose MAF (Modality Aware Fusion), a
multimodal context-aware attention and global information fusion module to
capture multimodality and use it to benchmark WITS. The proposed attention
module surpasses the traditional multimodal fusion baselines and reports the
best performance on almost all metrics. Lastly, we carry out detailed analyses
both quantitatively and qualitatively.
Authors' comments: Accepted in ACL 2022. 13 pages, 4 figures, 12 tables
Azadeh Khaleghi, Lukas Zierahn
We introduce PyChEst, a Python package which provides tools for the simultaneous estimation of multiple changepoints in the distribution of piece-wise stationary time series. The nonparametric algorithms implemented are provably consistent in a general framework: when the samples are generated by unknown piece-wise stationary processes. In this setting, samples may have long-range dependencies of arbitrary form and the finite-dimensional marginals of any (unknown) fixed size before and after the changepoints may be the same. The strength of the algorithms included in the package is in their ability to consistently detect the changes without imposing any assumptions beyond stationarity on the underlying process distributions. We illustrate this distinguishing feature by comparing the performance of the package against state-of-the-art models designed for a setting where the samples are independently and identically distributed.
Yaxu Xie, Fangwen Shu, Jason Rambach, Alain Pagani, Didier Stricker
Piece-wise 3D planar reconstruction provides holistic scene understanding of
man-made environments, especially for indoor scenarios. Most recent approaches
focused on improving the segmentation and reconstruction results by introducing
advanced network architectures but overlooked the dual characteristics of
piece-wise planes as objects and geometric models. Different from other
existing approaches, we start from enforcing cross-task consistency for our
multi-task convolutional neural network, PlaneRecNet, which integrates a
single-stage instance segmentation network for piece-wise planar segmentation
and a depth decoder to reconstruct the scene from a single RGB image. To
achieve this, we introduce several novel loss functions (geometric constraint)
that jointly improve the accuracy of piece-wise planar segmentation and depth
estimation. Meanwhile, a novel Plane Prior Attention module is used to guide
depth estimation with the awareness of plane instances. Exhaustive experiments
are conducted in this work to validate the effectiveness and efficiency of our
method.
Authors' comments: accepted to BMVC 2021, code opensource:
https://github.com/EryiXie/PlaneRecNet
Mahsa N. Shirazi
Two perfect matchings $P$ and $Q$ of the complete graph on $2k$ vertices are
said to be set-wise $t$-intersecting if there exist edges $P_{1}, \cdots,
P_{t}$ in $P$ and $Q_{1}, \cdots, Q_{t}$ in $Q$ such that the union of edges
$P_{1}, \cdots, P_{t}$ has the same set of vertices as the union of $Q_{1},
\cdots, Q_{t}$ has. In this paper we prove an extension of the famous
Erd\H{o}s-Ko-Rado (EKR) theorem to set-wise $2$-intersecting families of
perfect matchings for all values of $k$, and we conjecture a similar statement for
all $t\geq 2$.
Authors' comments: arXiv admin note: text overlap with arXiv:2008.08503
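The definition of set-wise $t$-intersecting matchings can be checked by brute force on small examples. The sketch below assumes a matching is given as a list of vertex pairs; the names `vertex_sets` and `setwise_t_intersecting` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set, Tuple

Edge = Tuple[int, int]


def vertex_sets(matching: Iterable[Edge], t: int) -> Set[FrozenSet[int]]:
    """Collect the vertex set covered by every choice of t edges."""
    return {
        frozenset(v for edge in chosen for v in edge)
        for chosen in combinations(list(matching), t)
    }


def setwise_t_intersecting(p: Iterable[Edge], q: Iterable[Edge], t: int) -> bool:
    """True if some t edges of p and some t edges of q cover the same vertices."""
    return not vertex_sets(p, t).isdisjoint(vertex_sets(q, t))


if __name__ == "__main__":
    P = [(0, 1), (2, 3), (4, 5)]
    Q = [(0, 2), (1, 3), (4, 5)]
    # Edges (0,1),(2,3) of P and (0,2),(1,3) of Q both cover {0, 1, 2, 3}.
    print(setwise_t_intersecting(P, Q, 2))
```

The enumeration is exponential in $t$, so it is only meant for sanity checks on small $k$.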
Mohsen Fayyaz, Ehsan Aghazadeh, Ali Modarressi, Hosein Mohebbi, Mohammad Taher Pilehvar
Most of the recent works on probing representations have focused on BERT,
with the presumption that the findings might be similar to the other models. In
this work, we extend the probing studies to two other models in the family,
namely ELECTRA and XLNet, showing that variations in the pre-training
objectives or architectural choices can result in different behaviors in
encoding linguistic information in the representations. Most notably, we
observe that ELECTRA tends to encode linguistic knowledge in the deeper layers,
whereas XLNet instead concentrates it in the earlier layers. Also, the former
model undergoes a slight change during fine-tuning, whereas the latter
experiences significant adjustments. Moreover, we show that drawing conclusions
based on the weight mixing evaluation strategy -- which is widely used in the
context of layer-wise probing -- can be misleading given the norm disparity of
the representations across different layers. Instead, we adopt an alternative
information-theoretic probing with minimum description length, which has
recently been proven to provide more reliable and informative results.
Authors' comments: Accepted to BlackboxNLP Workshop at EMNLP 2021
Hyungtae Lim, Minho Oh, Hyun Myung
Ground segmentation is crucial for terrestrial mobile platforms to perform navigation or neighboring object recognition. Unfortunately, the ground is not flat, as it features steep slopes; bumpy roads; or objects, such as curbs, flower beds, and so forth. To tackle the problem, this paper presents a novel ground segmentation method called \textit{Patchwork}, which is robust for addressing the under-segmentation problem and operates at more than 40 Hz. In this paper, a point cloud is encoded into a Concentric Zone Model-based representation to assign an appropriate density of cloud points among bins in a way that is not computationally complex. This is followed by Region-wise Ground Plane Fitting, which is performed to estimate the partial ground for each bin. Finally, Ground Likelihood Estimation is introduced to dramatically reduce false positives. As experimentally verified on SemanticKITTI and rough terrain datasets, our proposed method yields promising performance compared with the state-of-the-art methods, showing faster speed compared with existing plane fitting--based methods. Code is available: https://github.com/LimHyungTae/patchwork
R. Scott Barrows, Julia M. Comerford, Daniel Stern, Roberto J. Assef
We present a catalog of physical properties for galaxies hosting active
galactic nuclei (AGN) detected by the Wide-field Infrared Survey Explorer
(WISE). By fitting broadband spectral energy distributions of sources in the
WISE AGN Catalog (Assef et al. 2018) with empirical galaxy and AGN templates,
we derive photometric redshifts, AGN bolometric luminosities, measures of AGN
obscuration, host galaxy stellar masses, and host galaxy star formation rates
(SFRs) for 695,273 WISE AGN. The wide-area nature of this catalog significantly
augments the known number of obscured AGN out to redshifts of z~3 and will be
useful for studies focused on AGN or their host galaxy physical properties. We
first show that the most likely non-AGN contaminants are galaxies at redshifts
of z=0.2-0.3, with relatively blue W1-W2 colors, and with high specific SFRs
for which the dust continuum emission is elevated in the W2 filter. Toward
increasingly lower redshifts, WISE AGN host galaxies have systematically lower
specific SFRs, relative to those of normal star forming galaxies, likely due to
decreased cold gas fractions and the time delay between global star formation
and AGN triggering. Finally, WISE AGN obscuration is not strongly correlated
with AGN bolometric luminosity but shows a significant negative correlation
with Eddington ratio. This result is consistent with a version of the `receding
torus' model in which the obscuring material is located within the supermassive
black hole gravitational sphere of influence and the dust inner radius
increases due to radiation pressure.
Authors' comments: 21 pages, 17 figures. Accepted for publication in the Astrophysical
Journal. The full catalog is available from the publisher or from the
corresponding author upon request
Yoshiki Toba, Teng Liu, Tanya Urrutia, Mara Salvato, Junyao Li, Yoshihiro Ueda, Marcella Brusa, Naomichi Yutani et al.
We investigate the physical properties--such as the stellar mass, SFR, IR
luminosity, X-ray luminosity, and hydrogen column density--of MIR galaxies and
AGN at $z < 4$ in the 140 deg$^2$ field observed by SRG/eROSITA through the
eFEDS survey. By cross-matching the WISE 22 $\mu$m (W4)-detected sample and the
eFEDS X-ray point-source catalog, we find that 692 extragalactic objects are
detected by eROSITA. We have compiled a multiwavelength dataset. We have also
performed (i) an X-ray spectral analysis, (ii) SED fitting using X-CIGALE,
(iii) 2D image-decomposition analysis using Subaru HSC images, and (iv) optical
spectral fitting with QSFit to investigate the AGN and host-galaxy properties.
For 7,088 WISE W4 objects that are undetected by eROSITA, we have performed an
X-ray stacking analysis to examine the typical physical properties of these
X-ray faint and/or probably obscured objects. We find that (i) 82% of the
eFEDS-W4 sources are classified as X-ray AGN with $\log\,L_{\rm X} >$ 42 erg
s$^{-1}$; (ii) 67% and 24% of the objects have $\log\,(L_{\rm IR}/L_{\odot}) >
12$ and 13, respectively; (iii) the relationship between $L_{\rm X}$ and the 6
$\mu$m luminosity is consistent with that reported in previous works; and (iv)
the relationship between the Eddington ratio and $N_{\rm H}$ for the eFEDS-W4
sample and a comparison with a model prediction from a galaxy-merger simulation
indicates that approximately 5% of the eFEDS-W4 sources in our sample are
likely to be in an AGN-feedback phase, in which strong radiation pressure from
the AGN blows out the surrounding material from the nuclear region. Thanks to
the wide area coverage of eFEDS, we have been able to constrain the ranges of
the physical properties of the WISE W4 sample of AGNs at $z < 4$, providing a
benchmark for forthcoming studies on a complete census of MIR galaxies selected
from the full-depth eROSITA all-sky survey.
Authors' comments: 18 pages, 19 figures, and 3 tables, accepted to appear on A&A,
Special Issue: The Early Data Release of eROSITA and Mikhail Pavlinsky ART-XC
on the SRG Mission
Stanislav Beliaev, Boris Ginsburg
We propose TalkNet, a non-autoregressive convolutional neural model for
speech synthesis with explicit pitch and duration prediction. The model
consists of three feed-forward convolutional networks. The first network
predicts grapheme durations. An input text is expanded by repeating each symbol
according to the predicted duration. The second network predicts pitch value
for every mel frame. The third network generates a mel-spectrogram from the
expanded text conditioned on predicted pitch. All networks are based on 1D
depth-wise separable convolutional architecture. The explicit duration
prediction eliminates word skipping and repeating. The quality of the generated
speech nearly matches the best auto-regressive models - TalkNet trained on the
LJSpeech dataset got MOS 4.08. The model has only 13.2M parameters, almost 2x
fewer than the present state-of-the-art text-to-speech models. The
non-autoregressive architecture allows for fast training and inference. The
small model size and fast inference make TalkNet an attractive candidate
for embedded speech synthesis.
Authors' comments: arXiv admin note: substantial text overlap with arXiv:2005.05514
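The text-expansion step mentioned in this abstract (repeating each input symbol according to its predicted duration) can be illustrated with a few lines of Python. This is a sketch under assumptions: durations are taken to be non-negative integer frame counts, and `expand_by_duration` is a hypothetical name, not part of any TalkNet code.

```python
from typing import List, Sequence


def expand_by_duration(symbols: Sequence[str], durations: Sequence[int]) -> List[str]:
    """Repeat each symbol as many times as its predicted duration in frames."""
    if len(symbols) != len(durations):
        raise ValueError("need exactly one duration per symbol")
    expanded: List[str] = []
    for symbol, n_frames in zip(symbols, durations):
        if n_frames < 0:
            raise ValueError("durations must be non-negative")
        # A duration of 0 drops the symbol from the expanded sequence.
        expanded.extend([symbol] * n_frames)
    return expanded


if __name__ == "__main__":
    print(expand_by_duration(list("hi"), [2, 3]))
```

The expanded sequence has one entry per output frame, which is what lets later feed-forward networks predict a value for every mel frame without autoregression.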
Zhong Zheng, Aditya A. Ghodgaonkar, Ivan C. Christov
We study the spreading and leveling of a gravity current in a Hele-Shaw cell
with flow-wise width variations as an analog for flow in fractures and
horizontally heterogeneous aquifers. Using phase-plane analysis, we obtain
second-kind self-similar solutions to describe the evolution of the gravity
current's shape during both the spreading (pre-closure) and leveling
(post-closure) regimes. The self-similar theory is compared to numerical
simulations of the partial differential equation governing the evolution of the
current's shape (under the lubrication approximation) and to table-top
experiments. Specifically, simulations of the governing partial differential
equation from lubrication theory allow us to compute a pre-factor, which is
\textit{a priori} arbitrary in the second-kind self-similar transformation, by
estimating the time required for the current to enter the self-similar regime.
With this pre-factor calculated, we show that theory, simulations and
experiments agree well near the propagating front. In the leveling regime, the
current's memory resets, and another self-similar behavior emerges after an
adjustment time, which we estimate from simulations. Once again, with the
pre-factor calculated, both simulations and experiments are shown to obey the
predicted self-similar scalings. For both the pre- and post-closure regimes, we
provide detailed asymptotic (analytical) characterization of the universal
current profiles that arise as self-similarity of the second kind.
Authors' comments: 33 pages, 7 figures, REVTeX 4-2; v2,v3: minor revision; accepted for
publication in Physical Review Fluids
Gitika Shukla, Raghunathan Srianand, Neeraj Gupta, Patrick Petitjean, Andrew J. Baker, Jens-Kristian Krogager, Pasquier Noterdaeme
We report the detection of a large ($\sim90$ kpc) and luminous
$\mathrm{Ly\alpha}$ nebula [$L_{\mathrm{Ly\alpha}} = (6.80\pm0.08)\times
10^{44}\,\mathrm{erg\,s^{-1}}$] around an optically faint ($r>23$ mag) radio
galaxy M1513-2524 at $z\mathrm{_{em}}$=3.132. The double-lobed radio emission
has an extent of 184 kpc, but the radio core, i.e., emission associated with
the active galactic nucleus (AGN) itself, is barely detected. This object was
found as part of our survey to identify high-$z$ quasars based on Wide-field
Infrared Survey Explorer (WISE) colors. The optical spectrum has revealed
$\mathrm{Ly\alpha}$, NV, CIV and HeII emission lines with a very weak
continuum. Based on long-slit spectroscopy and narrow band imaging centered on
the $\mathrm{Ly\alpha}$ emission, we identify two spatial components: a
"compact component" with high velocity dispersion ($\sim
1500$$\rm{\,km\,s^{-1}}$) seen in all three lines, and an "extended component",
having low velocity dispersion (i.e., 700-1000$\rm{\,km\,s^{-1}}$). The
emission line ratios are consistent with the compact component being in
photoionization equilibrium with an AGN. We also detect spatially extended
associated $\mathrm{Ly\alpha}$ absorption, which is blue-shifted within
250-400$\rm{\,km\,s^{-1}}$ of the $\mathrm{Ly\alpha}$ peak. The probability of
$\mathrm{Ly\alpha}$ absorption detection in such large radio sources is found
to be low ($\sim$10%) in the literature. M1513-2524 belongs to the top few
percent of the population in terms of $\mathrm{Ly\alpha}$ and radio
luminosities. Deep integral field spectroscopy is essential for probing this
interesting source and its surroundings in more detail.
Authors' comments: 19 pages, 15 figures, Accepted for publication in MNRAS
Amanpreet Kaur, Abraham D. Falcone, Michael C. Stroh
We utilize machine learning methods to distinguish BL Lacertae objects (BL
Lac) from Flat Spectrum Radio Quasars (FSRQ) within a sample of likely X-ray
blazar counterparts to Fermi 3FGL unassociated gamma-ray sources. From our
previous work, we have extracted 84 sources that were classified as $\geq$ 99%
likely to be blazars. We then utilize Swift$-$XRT, Fermi, and WISE (the
Wide-field Infrared Survey Explorer) data together to distinguish the specific
type of blazar, FSRQs or BL Lacs. Various X-ray and Gamma-ray parameters can be
used to differentiate between these subclasses. These are also known to occupy
different parameter space on the WISE color-color diagram. Using all these data
together would provide more robust results for the classified sources. We
utilized a Random Forest Classifier to calculate the probability for each
blazar to be associated with a BL Lac or an FSRQ. Based on P$_{bll}$, the
probability for each source to be a BL Lac, we placed our sources into five
categories as follows: P$_{bll}$ $\geq$ 99%:
highly likely BL Lac, P$_{bll}$ $\geq$ 90%: likely BL Lac, P$_{bll}$ $\leq$ 1%:
highly likely FSRQ, P$_{bll}$ $\leq$ 10%: likely FSRQ, and 10% $<$ P$_{bll}$
$<$ 90%: ambiguous. Our results categorize the 84 blazar candidates as 50
likely BL Lacs and the remaining 34 as ambiguous. A small subset of these sources
have been listed as associated sources in the most recent Fermi catalog, 4FGL,
and in these cases our results are in agreement on the classification.
Authors' comments: 13 pages, 4 figures, 2 tables, accepted for publication in AJ
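The five probability bins in this abstract can be written as a small lookup. The sketch below is illustrative only: `categorize_blazar` is a hypothetical name, P$_{bll}$ is assumed to be given as a fraction in [0, 1], and the ambiguous class is assumed to be everything strictly between 10% and 90%.

```python
def categorize_blazar(p_bll: float) -> str:
    """Map a BL Lac probability (as a fraction in [0, 1]) to a category label."""
    if not 0.0 <= p_bll <= 1.0:
        raise ValueError("p_bll must lie in [0, 1]")
    # Check the stricter threshold first so 0.99 is not labelled merely "likely".
    if p_bll >= 0.99:
        return "highly likely BL Lac"
    if p_bll >= 0.90:
        return "likely BL Lac"
    if p_bll <= 0.01:
        return "highly likely FSRQ"
    if p_bll <= 0.10:
        return "likely FSRQ"
    return "ambiguous"


if __name__ == "__main__":
    for p in (0.995, 0.93, 0.5, 0.05, 0.001):  # made-up probabilities
        print(p, categorize_blazar(p))
```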
Elisabeth C. Matthews, Sasha Hinkley, Karl Stapelfeldt, Arthur Vigan, Dimitri Mawet, Ian J. M. Crossfield, Trevor J. David, Eric Mamajek et al.
Debris disk stars are good targets for high contrast imaging searches for
planetary systems, since debris disks have been shown to have a tentative
correlation with giant planets. We selected 20 stars identified as debris disk
hosts by the WISE mission, with particularly high levels of warm dust. We
observed these with the VLT/SPHERE high contrast imaging instrument with the
goal of finding planets and imaging the disks in scattered light. Our survey
reaches a median 5$\sigma$~sensitivity of 10.4Mj at 25au and 5.9Mj at 100au. We
identified three new stellar companions (HD18378B, HD19257B and HD133778B): two
are mid-M type stars and one is a late-K or early-M star. Three additional stars
have very widely separated stellar companions (all at $>$2000au) identified in
the Gaia catalog. The stars hosting the three SPHERE-identified companions are
all older ($\gtrsim$700Myr), with one having recently left the main sequence
and one a giant star. We infer that the high volumes of dust observed around
these stars might have been caused by a recent collision between the planets
and planetesimal belts in the system, although for the most evolved star, mass
loss could also be responsible for the infrared excess. Future mid-IR
spectroscopy or polarimetric imaging may allow the positions and spatial extent
of these dust belts to be constrained, thereby providing evidence as to the
true cause of the elevated levels of dust around these old systems. None of the
disks in this survey are resolved in scattered light.
Authors' comments: Accepted to AJ
Evgenii Chzhen, Nicolas Schreuder
Let $(X, S, Y) \in \mathbb{R}^p \times \{1, 2\} \times \mathbb{R}$ be a
triplet following some joint distribution $\mathbb{P}$ with feature vector $X$,
sensitive attribute $S$, and target variable $Y$. The Bayes optimal prediction
$f^*$ which does not produce Disparate Treatment is defined as $f^*(x) =
\mathbb{E}[Y | X = x]$. We provide a non-trivial example of a prediction $x \to
f(x)$ which satisfies two common group-fairness notions: Demographic Parity
\begin{align} (f(X) | S = 1) &\stackrel{d}{=} (f(X) | S = 2) \end{align} and
Equal Group-Wise Risks \begin{align}
\mathbb{E}[(f^*(X) - f(X))^2 | S = 1] = \mathbb{E}[(f^*(X) - f(X))^2 | S =
2]. \end{align} To the best of our knowledge this is the first explicit
construction of a non-constant predictor satisfying the above. We discuss
several implications of this result on better understanding of mathematical
notions of algorithmic fairness.
Authors' comments: Presented at the NeurIPS 2020 Workshop on Algorithmic Fairness
through the Lens of Causality and Interpretability
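An empirical counterpart of the Equal Group-Wise Risks condition can be computed on a finite sample by averaging $(f^*(X) - f(X))^2$ within each group. The sketch below assumes the values of $f^*$ and $f$ are already available per sample; `group_risk` is a hypothetical helper and the numbers are made up for illustration.

```python
from typing import Sequence


def group_risk(
    f_star: Sequence[float], f: Sequence[float], s: Sequence[int], group: int
) -> float:
    """Mean squared gap between f_star and f over samples with S == group."""
    squared = [(a - b) ** 2 for a, b, g in zip(f_star, f, s) if g == group]
    if not squared:
        raise ValueError("no samples in the requested group")
    return sum(squared) / len(squared)


if __name__ == "__main__":
    f_star = [1.0, 2.0, 3.0, 4.0]  # values of the Bayes predictor (assumed known)
    f = [0.0, 2.0, 3.0, 5.0]       # values of the candidate predictor
    s = [1, 1, 2, 2]               # sensitive attribute per sample
    print(group_risk(f_star, f, s, 1), group_risk(f_star, f, s, 2))
```

Equal empirical risks across the two groups are only suggestive of the population-level equality in the abstract, since the latter is a statement about expectations under $\mathbb{P}$.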
Ryusuke Sagawa, Yusuke Higuchi, Hiroshi Kawasaki, Ryo Furukawa, Takahiro Ito
This paper proposes a method of estimating micro-motion of an object at each
pixel that is too small to detect under a common setup of camera and
illumination. The method introduces an active-lighting approach to make the
motion visually detectable. The approach is based on the speckle pattern, which is
produced by the mutual interference of laser light on the object's surface and
continuously changes its appearance according to the out-of-plane motion of the
surface. In addition, the speckle pattern becomes uncorrelated with large motion.
To compensate for such micro- and large motion, the method estimates the motion
parameters up to scale at each pixel by nonlinear embedding of the speckle
pattern into low-dimensional space. The out-of-plane motion is calculated by
making the motion parameters spatially consistent across the image. In the
experiments, the proposed method is compared with other measuring devices to
prove the effectiveness of the method.
Authors' comments: to be published in ACCV2020
Shaolin Ji, Rundong Xu
This paper examines the stochastic maximum principle (SMP) for a
forward-backward stochastic control system where the backward state equation is
characterized by the backward stochastic differential equation (BSDE) with
quadratic growth and the forward state at the terminal time is constrained in a
convex set with probability one. With the help of the theory of BSDEs with
quadratic growth and the bounded mean oscillation (BMO) martingales, we employ
the terminal perturbation approach and Ekeland's variational principle to
obtain a dynamic stochastic maximum principle. The main result has a wide range
of applications in mathematical finance and we investigate a robust recursive
utility maximization problem with bankruptcy prohibition as an example.
Authors' comments: 24 pages
H. F. M. Yao, T. H. Jarrett, M. E. Cluver, L. Marchetti, Edward N. Taylor, M. G. Santos, Matt S. Owers, Angel R. Lopez-Sanchez et al.
We present a detailed study of emission-line systems in the GAMA G23 region,
making use of $\textit{WISE}$ photometry that includes carefully measured
resolved sources. After applying several cuts to the initial catalogue of
$\sim$41,000 galaxies, we extract a sample of 9,809 galaxies. We then compare
the spectral diagnostic (BPT) classification of 1154 emission-line galaxies
(38$\%$ resolved in W1) to their location in the $\textit{WISE}$ colour-colour
diagram, leading to the creation of a new zone for mid-infrared "warm" galaxies
located 2$\sigma$ above the star-forming sequence, below the standard
$\textit{WISE}$ AGN region. We find that the BPT and $\textit{WISE}$ diagrams
agree on the classification for 85$\%$ and 8$\%$ of the galaxies as non-AGN
(star forming = SF) and AGN, respectively, and disagree on $\sim$7$\%$ of the
entire classified sample. 39$\%$ of the AGN (all types) are broad-line systems
for which the [\ion{N}{ii}] and [H$\alpha$] fluxes can barely be disentangled,
giving in most cases spurious [\ion{N}{ii}]/[H$\alpha$] flux ratios. However,
several optical AGN appear to be completely consistent with SF in
$\textit{WISE}$. We argue that these could be low power AGN, or systems whose
hosts dominate the IR emission. Alternatively, given the sometimes high
[\ion{O}{iii}] luminosity in these galaxies, the emission lines may be
generated by shocks coming from super-winds associated with SF rather than the
AGN activity. Based on our findings, we have created a new diagnostic: [W1-W2]
vs [\ion{N}{ii}]/[H$\alpha$], which has the virtue of separating SF from AGN
and high-excitation sources. It classifies 3$\sim$5 times more galaxies than
the classic BPT
Authors' comments: 43 pages, 32 figures, 4 tables