Computergestützte Lesartendisambiguierung (2003)

This paper introduces lexicographers to an automated text-analysis tool from corpus linguistics. Cosmas, the statistical search and analysis tool developed at the IDS, offers new ways of obtaining semantic information about words. The paper explains how this toolkit can be used to disambiguate the senses of lexemes and to verify them by means of collocation and context analysis, and it uses the example of cool to show to what extent semantic information can be extracted by automatic statistical methods. The advantages and disadvantages of computer-based analysis are discussed. The paper also shows how empirical lexicographic disambiguation can be validated in a model-guided way. To illustrate the differences between conventional descriptive options and new statistical methods, the senses of cool as presented in the Duden GWDS (2000) are compared with the senses identified by the analysis with Cosmas.

The lexicographic use of corpora and computational tools for disambiguation (2003)

Storjohann, Petra

frau auf dem linguistischen Prüfstand: Eine korpusgestützte Gebrauchsanalyse feministischer Indefinitpronomen (2004)

Storjohann, Petra

This article asks whether feminist indefinite pronouns, in particular the neologism frau, have become established in usage outside feminist reflection on language and outside women-specific discourses. On the basis of the IDS corpora, the occurrence of new pronouns in public language use is evaluated both quantitatively and qualitatively in order to make statements about the development of their usage. A corpus-analytical tool is used to uncover linguistic structures that illustrate what is typical in the use of the lexeme frau. Particular attention is paid to the diachronic investigation of the contextual embedding of the lexeme frau. Both the extraction of syntagmatic partners by software-supported collocation analysis and the linguistic analysis of the relations between collocate and search word play a special role here. In addition, pragmatic and syntactic aspects are examined, as are questions concerning how feminist indefinite pronouns are evaluated in general language use.

Semantische Paraphrasen und Kurzetikettierungen (2005)

Storjohann, Petra

Das elexiko-Korpus: Aufbau und Zusammensetzung (2005)

Storjohann, Petra

Paradigmatische Relationen (2005)

Storjohann, Petra

Diachrone Angaben (2005)

Storjohann, Petra

Typische Verwendungen (2005)

Storjohann, Petra

Corpus-driven vs. corpus-based approach to the study of relational patterns (2005)

Storjohann, Petra

Contextual lexical relations, such as sense relations, have traditionally played an essential role in disambiguating word senses in lexicography, as they offer insights into the meaning and use of a word. However, the description of paradigmatic relations in particular is often restricted to a few types such as synonymy and antonymy. The limited description of various types of relations and the method of presenting these relations in existing German dictionaries are often problematic. Elexiko, the first German hypertext dictionary compiled exclusively on the basis of an electronic corpus, offers a new way of presenting sense relations, using a variety of approaches to extract the necessary data. In this paper, I will show how elexiko presents a differentiated system of paradigmatic relations including synonymy, various subtypes of incompatibility (such as antonymy, complementarity, converseness, reversiveness, etc.), and vertical structures (such as hyponymy and meronymy). Primary attention, however, will focus on the question of how data for a paradigmatic description is retrieved from the corpus. Whereas a corpus-driven approach is mainly used for various semantic information and a corpus-based method plays an important part in obtaining data for the grammatical description in elexiko, it will be argued that both the corpus-driven and the corpus-based approach can be complementary methods in gaining insights into sense relations. I will demonstrate which results can be obtained by each approach, and advantages and disadvantages of both procedures will be explored in more detail. As sense relations are context-dependent, it will also be demonstrated how a sense-bound presentation can be realised in an electronic reference work including a system of cross-referencing that illustrates lexical structures and the interrelatedness of words within the lexicon. 
Finally, I will show how accompanying examples from the corpus and additional lexicographic information help the user to understand contextual restrictions, so that s/he is able to use dictionary information more effectively.

Sinnrelationen in Wörterbüchern - Neue Ansätze und Perspektiven (2005)

Storjohann, Petra

Contextual lexical relations, in particular sense relations, are of special interest to people interested in language when they produce texts. Nevertheless, information about these lexical structures in many monolingual dictionaries is often restricted to synonymy or antonymy, and its description and presentation are of only limited use. ELEXIKO, the first Internet dictionary and information system for contemporary German that is compiled exclusively on a corpus basis, offers a more differentiated presentation and description of paradigmatic relations and uses various corpus-based methods to extract linguistic data from the underlying corpus. Some of these methods yield new insights into lexical structures, for which lexicography must find new forms of description and presentation. This article addresses the following questions: What advantages does corpus-based lexicography offer for the investigation of paradigmatic sense relations, and how does ELEXIKO implement the findings of corpus-based studies lexicographically? What are the essential differences from other dictionaries that describe lexical structures? The following aspects in particular are examined critically: How equivalent in meaning are synonyms, and how opposite are the antonyms recorded in antonym dictionaries really?

elexiko – A Corpus-Based Monolingual German Dictionary (2005)

Storjohann, Petra

This article provides an introduction to elexiko, the first German hypertext dictionary to be compiled on a corpus basis, which is currently being developed at the Institut für Deutsche Sprache Mannheim (IDS). First, a brief account of the design is given, followed by a demonstration of the methods and tools that are being employed to compile it. elexiko will provide not only an improved quantity of lexical information, but also a new quality of information which will be explained and illustrated at different levels of the microstructure of the dictionary. The description of word meaning and use in elexiko will be presented in detail, with a particular focus on the treatment of collocations, ambiguity, vagueness, and the presentation of senses. The development of a theoretically grounded procedure for lexicographic disambiguation is also described. This is then followed by a brief account of the treatment of grammatical details. Finally, issues of usability, the progress of the project and its future perspectives will be considered.

Kontextuelle Variabilität synonymer Relationen (2006)

Storjohann, Petra

This article takes a closer look at lexical expressions that are linked by a synonymous relation in one of their senses. The focus is on the corpus-based investigation of the paradigmatic contextual adaptation of these relational pairs. It is shown how this sense relation can vary contextually or be specified, in particular within a single sense, and how these varying structures can be captured lexicologically and described lexicographically on the basis of corpus data. These observations arose from dictionary work in the elexiko project and represent first results concerning variable paradigmatic structures, obtained from an extensive corpus, the elexiko corpus, which was compiled for lexicographic purposes. The article describes how corpus observations on synonymic variability are implemented lexicographically in the elexiko project. The aim is to show how a dictionary of synonyms can be made more usage-oriented, how newly gained corpus findings can be incorporated lexicographically, and how appropriate forms of presentation must be sought at the same time.

Korpora als Schlüssel zur lexikografischen Überarbeitung : der neue Dornseiff (2006)

Storjohann, Petra

Mit digitalem Textmaterial die innere Ordnung des Wortschatzes entdecken (2006)

Storjohann, Petra

Elektronische Quellen zur deutschen Sprache (2006)

Haß, Ulrike ; Storjohann, Petra

Ausblicke in die Wortvergangenheit (2006)

Storjohann, Petra

ELEXIKO - A lexical and lexicological, corpus-based hypertext information system at the Institut für Deutsche Sprache (2006)

Klosa, Annette ; Schnörch, Ulrich ; Storjohann, Petra

ELEXIKO is a relatively new lexicological-lexicographic project based at the Institut für Deutsche Sprache (IDS) in Mannheim. The project compiles a reference work that explains and documents contemporary German; it was specifically designed for online publication (www.elexiko.de). The primary and exclusive basis for lexicographic interpretation is an extensive German corpus. If one refers to elexiko as an Internet dictionary, it is purely for practical reasons: elexiko is (far) more than a dictionary in the traditional sense, although, of course, it contains descriptions of the meaning and use of a lexeme just as any traditional dictionary does. It is both a hypertext dictionary and a lexical data information system.

New Lexicographic Approaches to the Description of Sense Relations (2006)

Storjohann, Petra

The presentation and description of paradigmatic sense relations in German dictionaries is often limited to types such as synonymy and antonymy. Their information is neither well presented nor helpful for users. Although corpora offer fundamental methodological advantages, various corpus-guided approaches have not played an important role in extracting and describing paradigmatic relations in German lexicography so far. elexiko is a hypertext dictionary that explores a corpus to extract language data for the description of paradigmatic lexical relations. I will show how sense relations can be extracted systematically by employing both a corpus-driven and a complementary corpus-based approach. I will demonstrate how corpus data validates or challenges information in existing dictionaries and that in some cases lexicographic categories are not appropriate to capture specific linguistic phenomena with respect to sense-related items. Subsequently, an alternative method of extracting, describing, and presenting sense relations will be presented.

Der Diskurs "Globalisierung" in der öffentlichen Sprache: Eine korpusgestützte Analyse kontextueller Thematisierungen (2007)

Storjohann, Petra

Incompatibility: A no-sense relation? (2007)

Storjohann, Petra

Incompatibility (or co-hyponymy) is the most general type of semantic relation between lexical items, the meaning of which entails exclusion. Such items fall under a superordinate term or concept and denote sets which have no members in common (e.g. animal: dog-cat-mouse-lion-sheep; example from Cruse 2004). Traditionally, these have been of interest to lexical semanticists for the description of the structure of the lexicon. However, incompatibility is not just a relation that signifies a difference of meaning. This paper is a critical corpus-assisted re-evaluation of the phenomenon of incompatibility which argues that the relation in question sometimes also functions as a discourse marker. Incompatibles indicate recurrent intertextual patterns. This holds particularly true for socially or politically controversial lexical items such as Flexibilität (flexibility), Mobilität (mobility) or Globalisierung (globalisation). Corpus investigations of such words have revealed that among other semantically related terms, incompatibles have a crucial discourse focussing function. For the German lexical item Globalisierung, I will show how its lexical usage can be studied through a corpus-driven analysis of corresponding incompatibles. Incompatible terms are not contingent co-words but often occur in close contextual proximity and participate in regular syntagmatic structures (e.g. Globalisierung und Rationalisierung; Globalisierung und Modernisierung; Neoliberalismus, Globalisierung und Kapitalismus). Hence, these are easily extracted by conducting a computational collocation analysis. Such significant collocates provide a good insight into the discursive and thematic contexts of the search word. Following Teubert (2004), I will demonstrate how the meaning of such lexical items is constituted in discourse and how the examination of these particular collocates reveals their sense-constructing function and their pragmatic-discursive force. 
I will provide a brief discussion of the methodology used for such analyses, and I will explain why the complex semantic-pragmatic and thematic-communicative patterns implied in sets of incompatibles should be given a stronger emphasis in lexicography.
