
Extracting specialized terminology from linguistic corpora

  • In this paper, we present our approach to automatically extracting German terminology in the domain of grammar, using texts from the online information system grammis as our corpus. We analyze existing repositories of German grammatical terminology and develop part-of-speech patterns for our extraction, thereby showing the importance of unigrams in this domain. We contrast the results of the automatic extraction with a manually extracted standard. By comparing the performance of well-known statistical measures, we show that measures based on corpus comparison outperform alternative methods.
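As a rough illustration of what a corpus-comparison measure does, the sketch below scores unigrams by how much more frequent they are in a domain corpus than in a general reference corpus (a smoothed relative-frequency ratio, often called "weirdness"). This is an assumption for illustration only: the abstract does not say which measures the authors compared, the function name `weirdness` and the toy sentences are hypothetical, and the part-of-speech filtering step is left out.

```python
from collections import Counter


def weirdness(domain_tokens, reference_tokens, smoothing=1.0):
    """Score each domain unigram by its smoothed relative frequency in the
    domain corpus divided by its smoothed relative frequency in the
    reference corpus. Higher scores suggest domain-specific terms."""
    domain_counts = Counter(domain_tokens)
    reference_counts = Counter(reference_tokens)
    domain_total = len(domain_tokens)
    reference_total = len(reference_tokens)
    # Add-k smoothing over the joint vocabulary avoids division by zero
    # for words that never occur in the reference corpus.
    vocab_size = len(set(domain_counts) | set(reference_counts))

    scores = {}
    for term, count in domain_counts.items():
        domain_rel = (count + smoothing) / (domain_total + smoothing * vocab_size)
        reference_rel = (reference_counts[term] + smoothing) / (
            reference_total + smoothing * vocab_size
        )
        scores[term] = domain_rel / reference_rel
    return scores


# Toy corpora (hypothetical, not taken from grammis).
domain = "das Verb regiert den Kasus das Verb steht im Satz".split()
reference = "das Haus steht im Wald und das Kind spielt im Garten".split()

scores = weirdness(domain, reference)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[:3])
```

In this toy setting, a grammatical term such as "Verb" ranks above function words such as "das" or "im", because the latter are about as frequent in the reference corpus as in the domain corpus.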

Metadata
Author: Christian Lang, Roman Schneider, Karolina Suchowolec
URN: urn:nbn:de:bsz:mh39-74760
DOI: https://doi.org/10.17885/heiup.361.509
ISBN: 978-3-946054-82-5
Parent Title (English): Grammar and corpora 2016
Publisher: Heidelberg University Publishing
Place of publication: Heidelberg
Editor: Eric Fuß, Marek Konopka, Beata Trawiński, Ulrich Hermann Waßner
Document Type: Part of a Book
Language: English
Year of first Publication: 2018
Date of Publication (online): 2018/05/23
Publication state: Published version
Review state: Peer-reviewed
Tag: automatic term extraction; grammatical information system; grammatical terminology; terminological structurer
GND Keyword: Automatic language processing; German; Grammar; Grammis; Terminology
First Page: 425
Last Page: 434
Dewey Decimal Classification: 400 Language / 400 Language, Linguistics
Leibniz-Classification: Language, Linguistics
Linguistics-Classification: Computational linguistics
Open Access?: Yes
Licence: Creative Commons - Attribution - ShareAlike 4.0 International