Identifying implicitly abusive remarks about identity groups using a linguistically informed approach
- We address the task of distinguishing implicitly abusive sentences on identity groups (“Muslims contaminate our planet”) from other group-related negative polar sentences (“Muslims despise terrorism”). Implicitly abusive language consists of utterances in which the abuse is not conveyed by abusive words (e.g. “bimbo” or “scum”). So far, the detection of such utterances could not be properly addressed since existing datasets displaying a high degree of implicit abuse are fairly biased. Following the recently proposed strategy to solve implicit abuse by separately addressing its different subtypes, we present a new focused and less biased dataset that consists of the subtype of atomic negative sentences about identity groups. For that task, we model components that each address one facet of such implicit abuse, i.e. depiction as perpetrators, aspectual classification and non-conformist views. The approach generalizes across different identity groups and languages.
Author: | Michael Wiegand, Elisabeth Eder, Josef Ruppenhofer |
---|---|
URN: | urn:nbn:de:bsz:mh39-112614 |
DOI: | https://doi.org/10.18653/v1/2022.naacl-main.410 |
ISBN: | 978-1-955917-71-1 |
Parent Title (English): | Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. July 10-15, 2022. |
Publisher: | Association for Computational Linguistics |
Place of publication: | Stroudsburg |
Editor: | Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2022 |
Date of Publication (online): | 2022/10/07 |
Publishing Institution: | Leibniz-Institut für Deutsche Sprache (IDS) |
Publication state: | Published version |
Review state: | Peer-reviewed |
Tag: | abusive language; abusive remarks; identity groups |
GND Keyword: | Beleidigung; Beschimpfung; Computerlinguistik; Datensatz |
First Page: | 5600 |
Last Page: | 5612 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Computational Linguistics |
Program areas: | P2: Oral Corpora |