TY  - CHAP
U1  - Conference publication
A1  - Wiegand, Michael
A1  - Ruppenhofer, Josef
A1  - Kleinbauer, Thomas
ED  - Burstein, Jill
ED  - Doran, Christy
ED  - Solorio, Thamar
T1  - Detection of abusive language: the problem of biased datasets
T2  - The 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the Conference Vol. 1. Minneapolis, Minnesota, June 2 - June 7, 2019
N2  - We discuss the impact of data bias on abusive language detection. We show that classification scores on popular datasets reported in previous work are much lower under realistic settings in which this bias is reduced. Such biases are most notably observed on datasets that are created by focused sampling instead of random sampling. Datasets with a higher proportion of implicit abuse are more affected than datasets with a lower proportion.
KW  - Swear word
KW  - Insult
KW  - Verbal aggression
KW  - Automatic language analysis
Y1  - 2019
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-90165
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-90165
UR  - https://www.aclweb.org/anthology/N19-1060
SN  - 978-1-950737-13-0
SB  - 978-1-950737-13-0
SP  - 602
EP  - 608
PB  - The Association for Computational Linguistics
CY  - Stroudsburg, PA, USA
ER  -