This paper describes an approach to modelling a general-language wordnet, GermaNet, and a domain-specific wordnet, TermNet, in the Web Ontology Language (OWL). While the modelling process for GermaNet adopts relevant recommendations with respect to the English Princeton WordNet, for TermNet an alternative modelling concept is developed that considers the special characteristics of domain-specific terminologies. We present a proposal for linking a general-language wordnet and a terminological wordnet within the framework of OWL and on this basis discuss problems and alternative modelling approaches.
The paper presents an XML schema for the representation of genres of computer-mediated communication (CMC) that is compliant with the encoding framework defined by the TEI. It was designed for the annotation of CMC documents in the project Deutsches Referenzkorpus zur internetbasierten Kommunikation (DeRiK), which aims at building a corpus on language use in the most popular CMC genres on the German-speaking Internet. The focus of the schema is on those CMC genres which are written and dialogic, such as forums, bulletin boards, chats, instant messaging, wiki and weblog discussions, microblogging on Twitter, and conversation on “social network” sites.
The schema provides a representation format for the main structural features of CMC discourse as well as elements for the annotation of those units regarded as “typical” for language use on the Internet. The schema introduces an element <posting>, which describes a stretch of text that a user sends to the server at a certain point in time. Postings are the main constituents of threads and logfiles, which, in our schema, are the two main types of CMC macrostructures. For the microlevel of CMC documents (that is, the structure of the <posting> content), the schema introduces elements for selected features of Internet jargon such as emoticons, interaction words and addressing terms. It allows for easy anonymization of CMC data in cases where the annotated data are made publicly available, and it includes the metadata needed to cite arbitrary excerpts from the data, for example as references in dictionary entries or as results of corpus queries.
Documentation of the schema as well as encoding examples can be retrieved from the web at http://www.empirikom.net/bin/view/Themen/CmcTEI. The schema is meant to be a core model for representing CMC that can be modified and extended by others according to their own specific perspectives on CMC data. It could be a first step towards an integration of features for the representation of CMC genres into a future new version of the TEI Guidelines.
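The macrostructure described above (a thread made up of postings, each sent by one user at one point in time) could be sketched roughly as follows. This is only an illustration and not the schema itself: the element names <thread> and <posting> are taken from the abstract, while the attribute names (n, who, when), the <p> wrapper and the pseudonymous author identifiers are assumptions made for this example. The actual encoding is defined in the schema documentation referenced above.

```python
import xml.etree.ElementTree as ET


def build_thread(postings):
    """Build a <thread> element containing one <posting> per message.

    `postings` is a list of (author_id, timestamp, text) tuples.
    Attribute names and the <p> wrapper are hypothetical, chosen only
    for illustration; they are not taken from the published schema.
    """
    thread = ET.Element("thread")
    for index, (author_id, timestamp, text) in enumerate(postings, start=1):
        posting = ET.SubElement(
            thread,
            "posting",
            {
                "n": str(index),    # running number, usable for citing an excerpt
                "who": author_id,   # pseudonymous pointer instead of a real user name
                "when": timestamp,  # time at which the posting reached the server
            },
        )
        paragraph = ET.SubElement(posting, "p")
        paragraph.text = text
    return thread


thread = build_thread([
    ("#A01", "2012-03-01T10:15:00", "Hello everyone"),
    ("#A02", "2012-03-01T10:17:30", "Hi there"),
])
xml_text = ET.tostring(thread, encoding="unicode")
print(xml_text)
```

Keeping the author as a pointer rather than a literal name is one simple way to support the anonymization mentioned in the abstract: the mapping from identifiers to real user names can be stored separately and withheld when the corpus is published.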
This paper addresses questions of user guidance in lexicographic-lexicological portals, in particular the portals OWID (Mannheim) and “Wörterbuch-Portal” (Berlin). Both are introduced with their respective concepts and technical architecture and then evaluated from the user's perspective. Suggestions for the further development of these services are followed by some fundamental considerations on the future of lexicographic portals.