We discuss the impact of data bias on abusive language detection. We show that the classification scores reported in previous work on popular datasets are much lower under realistic settings in which this bias is reduced. Such biases are most notably observed on datasets created by focused sampling rather than random sampling. Datasets with a higher proportion of implicit abuse are more affected than those with a lower proportion.
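To make the evaluation setting concrete, here is a minimal sketch, not the authors' code, of how such a bias-reduced comparison can be run: a bag-of-words classifier is trained on a keyword-sampled ("focused") corpus and then scored both in-domain and on a randomly sampled corpus. All sentences and datasets are invented stand-ins.

```python
# Minimal sketch (not the authors' code) of a bias-reduced evaluation:
# train on a focused (keyword-sampled) corpus, then test both in-domain
# and on a randomly sampled corpus. All sentences are invented stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

# Focused sample: collected by querying for explicit slurs, so the label
# correlates strongly with those query words (the data bias in question).
focused_train = [("you are a complete idiot", 1),
                 ("what an idiot move", 1),
                 ("nice weather today", 0),
                 ("see you at the game", 0)]
focused_test = [("another idiot in the comments", 1),
                ("thanks for the recipe", 0)]
# Random sample: the abuse here is implicit and lacks the query words.
random_test = [("people like you should go back where they came from", 1),
               ("great talk, very informative", 0)]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit([t for t, _ in focused_train], [y for _, y in focused_train])

# On real corpora, the in-domain score overstates performance because the
# model keys on the explicit lexical cues; on randomly sampled data those
# cues are absent and the score drops. The toy data only shows the protocol.
for name, data in [("focused test", focused_test), ("random test", random_test)]:
    texts, gold = zip(*data)
    print(name, f1_score(gold, clf.predict(list(texts)), zero_division=0))
```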
We examine the new task of detecting derogatory compounds (e.g. curry muncher). Derogatory compounds are much more difficult to detect than derogatory unigrams (e.g. idiot), since they are far more sparsely represented in the lexical resources previously found effective for this task (e.g. Wiktionary). We propose an unsupervised classification approach that incorporates linguistic properties of compounds but mostly depends on a simple distributional representation. We compare our approach against established methods previously proposed for extracting derogatory unigrams.
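As an illustration of what such a distributional approach can look like, here is a minimal, hypothetical sketch, not the authors' actual method: each candidate compound is scored by the similarity of its embedding to a seed set of known derogatory unigrams. The hand-set 3-dimensional vectors below stand in for real pretrained embeddings, and all scores are illustrative only.

```python
import numpy as np

# Hypothetical stand-ins for pretrained word embeddings; in practice these
# would come from a model trained on a large corpus.
EMB = {
    "idiot":         np.array([0.9, 0.1, 0.0]),
    "moron":         np.array([0.8, 0.2, 0.1]),
    "curry_muncher": np.array([0.7, 0.3, 0.1]),
    "tree_hugger":   np.array([0.5, 0.4, 0.3]),
    "bus_driver":    np.array([0.1, 0.2, 0.9]),
}

# Seed set of known derogatory unigrams, e.g. taken from a lexical resource.
SEEDS = ["idiot", "moron"]

def cos(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def derogatory_score(compound):
    """Mean cosine similarity between a compound and the derogatory seeds;
    higher scores suggest the compound is used derogatorily."""
    v = EMB[compound]
    return sum(cos(v, EMB[s]) for s in SEEDS) / len(SEEDS)

# Candidates are ranked unsupervised, with no labeled compound data needed.
for c in ["curry_muncher", "tree_hugger", "bus_driver"]:
    print(c, round(derogatory_score(c), 3))
```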