The dissertation project presented here, based at the Institut für Englische Philologie of the Freie Universität Berlin, examines what changes in language policy toward endangered languages can be achieved by establishing decentralized parliaments. It investigates language policy toward Gaelic in Scotland and Sami in Norway. The core question is which political initiatives in support of these languages have been launched in recent years. Particular attention is paid to the fact that the Scottish Parliament and the Sameting in Norway now provide parliamentary bodies in which the respective language community can exert considerably more influence than was previously possible.
We present a lightweight tool for the annotation of linguistic data on multiple levels. It is based on the simplification of annotations to sets of markables that have attributes and stand in certain relations to each other. We describe the main features of the tool, emphasizing its simplicity, customizability and versatility.
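The data model named in this abstract (markables with attributes, standing in relations to each other) can be illustrated with a minimal sketch. This is not the tool's actual format or API: the class names `Markable` and `Relation`, the level name `coref`, the relation kind `anaphoric`, and the sample sentence are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Markable:
    """A token span on one annotation level, with free-form attributes."""
    ident: str
    level: str   # hypothetical level name, e.g. "coref"
    start: int   # index of the first token (inclusive)
    end: int     # index of the last token (inclusive)
    attributes: dict = field(default_factory=dict)


@dataclass
class Relation:
    """A directed, typed link between two markables, referenced by id."""
    kind: str
    source: str
    target: str


# Hypothetical sample data: one sentence, two markables, one relation.
tokens = ["Anna", "said", "she", "would", "come"]
markables = {
    "m1": Markable("m1", "coref", 0, 0, {"type": "name"}),
    "m2": Markable("m2", "coref", 2, 2, {"type": "pronoun"}),
}
relations = [Relation("anaphoric", "m2", "m1")]


def text_of(markable, tokens):
    """Return the surface string covered by a markable."""
    return " ".join(tokens[markable.start:markable.end + 1])


def antecedents(ident):
    """Return the markables that `ident` points to via an anaphoric relation."""
    return [markables[r.target] for r in relations
            if r.kind == "anaphoric" and r.source == ident]
```

Keeping relations separate from markables means that further annotation levels can be added without touching existing spans, which is one plausible reading of the "multiple levels" design.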
We apply a decision-tree-based approach to pronoun resolution in spoken dialogue. Our system deals with pronouns with NP- and non-NP-antecedents. We present a set of features designed for pronoun resolution in spoken dialogue and determine the most promising features. We evaluate the system on twenty Switchboard dialogues and show that it compares well to Byron’s (2002) manually tuned system.
In order to determine priorities for the improvement of timing in synthetic speech, this study looks at the role of segmental duration prediction and the role of phonological symbolic representation in the perceptual quality of a text-to-speech system. In perception experiments using German speech synthesis, two standard duration models (Klatt rules and CART) were tested. The input to these models consisted of a symbolic representation which was derived either from a database or from a text-to-speech system. Results of the perception experiments show that different duration models can only be distinguished when the symbolic representation is appropriate. Considering the relative importance of the symbolic representation, post-lexical segmental rules were investigated with the outcome that listeners differ in their preferences regarding the degree of segmental reduction. In conclusion, before fine-tuning the duration prediction, it is important to derive an appropriate phonological symbolic representation in order to improve timing in synthetic speech.
Well-formed XML documents can be interpreted as trees, and these trees can in turn be described by grammars. Document grammars have several peculiarities that distinguish them from grammars for natural languages or programming languages. This contribution explains the processing options that arise from the use of formal document grammars.
The paper investigates the evolution of document grammars from a linguistic point of view. Document grammars have been developed in the past decades in order to formalize knowledge on the structure of textual information. A well-known instance of a document grammar is the "Document Type Definition" (DTD) as part of the Extensible Markup Language (XML). DTDs make it possible to define so-called tree grammars that constrain the application of tag sets in the process of annotating a document. In an XML-based document workflow, DTDs play a crucial role in validating large amounts of text and transforming them into standardized data formats. An interesting point in the development of XML DTDs is that restricting their formal expressiveness paved the way to a better understanding of the formal properties of document grammars and, more recently, to the development of more powerful successors such as XML Schema. In this sense, the simplicity of the original approach, resulting from the necessary restriction of previous approaches, yielded new complexity on formally understood grounds.
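The idea that a document grammar constrains which child elements may appear inside which parent can be sketched as follows. This is a simplified illustration, not a DTD validator: the element names and content models in `GRAMMAR` are hypothetical, each content model is written as a regular expression over the sequence of child element names, and text content and attributes are ignored.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models: element name -> regular expression over the
# space-terminated sequence of child element names. An empty pattern means
# "no child elements allowed" (text content is not checked in this sketch).
GRAMMAR = {
    "article": r"(title )(section )+",
    "section": r"(title )(para )*",
    "title": r"",
    "para": r"",
}


def violations(element, grammar):
    """Walk the element tree and collect every breach of the grammar."""
    problems = []
    model = grammar.get(element.tag)
    # Each child name is followed by a space so that names cannot run together.
    children = "".join(child.tag + " " for child in element)
    if model is None:
        problems.append(f"undeclared element <{element.tag}>")
    elif re.fullmatch(model, children) is None:
        found = children.strip() or "(none)"
        problems.append(f"<{element.tag}> has invalid children: {found}")
    for child in element:
        problems.extend(violations(child, grammar))
    return problems
```

Parsing a document with `ET.fromstring` yields the tree, and `violations(tree, GRAMMAR)` returns an empty list when every element satisfies its content model. Because each rule only inspects the immediate children of one element, the check corresponds to the local, restricted kind of tree grammar that the abstract associates with DTDs.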
This paper develops a theoretical model for the semantics of connectives, following central ideas of Reichenbachian tense semantics.
In a first step, the terminological and conceptual framework is presented and illustrated with German da. The meaning of a connective is modeled as a four-place relation between the situated object E, a reference object R, a discourse anchor S and the speaker O. The relata can belong to one of four different classes of entities: physical object, event, proposition or act. Correspondingly, the relations are divided into four cognitive domains: space, time, alethics/epistemics, and deontics. In each domain, relations can be treated under three different perspectives: situation, condition or causation. A cross-classification of relational domains and perspectives provides a typology of connectives which is more consistent than the ones available in traditional grammar.
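The cross-classification described in this paragraph can be spelled out as a small sketch. It only enumerates the cells of the typology. The pairing of entity classes with domains in `DOMAIN_OF` follows the order in which the paragraph lists them and is an assumption, as are all identifier names.

```python
from itertools import product

# The four cognitive domains and three perspectives named in the paragraph.
DOMAINS = ["space", "time", "alethics/epistemics", "deontics"]
PERSPECTIVES = ["situation", "condition", "causation"]

# Assumed pairing of entity classes with domains, read off the listing order.
DOMAIN_OF = {
    "physical object": "space",
    "event": "time",
    "proposition": "alethics/epistemics",
    "act": "deontics",
}

# Every (domain, perspective) pair is one cell of the typology.
typology = list(product(DOMAINS, PERSPECTIVES))

for domain, perspective in typology:
    print(f"{domain:22s} {perspective}")
```

Four domains crossed with three perspectives give twelve cells, so a connective reading can be located by naming one value on each axis.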
In the second part of the article, the analytic apparatus is refined, using German so as the main example. Following Roman Jakobson, a distinction is made between contiguity and similarity relations. Contiguity relations are typically encoded by functional categories, whereas similarity relations are encoded by lexical categories. However, there are a few connectives like so which encode similarity relations. A structural isomorphism between similarity and contiguity relations makes it possible to reinterpret so in certain contexts as an indicator of contiguity. In these cases, so is semantically weakened, particularly in relation to its definiteness. The model is extended to also, from which als descends etymologically.
The third part of the article contains the semantic characterization of als in its variants as an intransitive and transitive connective. Als is described paradigmatically, in terms of the semantic oppositions that distinguish it from da, so, wie and wenn. Like so, it originally encodes similarity relations, but in present-day German its use has been extended, so that it may indicate contiguity relations as well. With da and so it shares the abstract relational meaning O-S,R,E. The main difference from da is its lesser degree of definiteness; in contrast to so, its use is almost exclusively temporal. Wie and wenn are indefinites, i.e. they do not establish a deictic backlink to the speaker and discourse context. Als indicates that the situated event temporally overlaps with a specific event of reference, whose factivity is presupposed. The reference event must be categorically predictable in the context of utterance. Als does not indicate temporal antecedence of the reference event in relation to the speech event; it only requires the identifiability of the reference event and its non-coincidence with the speech event.
In the last section, so-called "peripheral temporal clauses" are examined with respect to the syntagmatic interaction between aspectuality, intonational focus, serialization of clauses and the abstract relational meaning of als. The proposed semantic formula is shown to be capable not only of clarifying the paradigmatic structure of a subset of German connectives but also of explaining the semantic and stylistic properties of complex sentences.
This paper introduces lexicographers to a tool for automating text analysis in corpus linguistics. Cosmas, the statistical search and analysis tool developed at the IDS, offers new ways of obtaining semantic information about words. The paper explains how this toolset can be used to disambiguate the senses of lexemes and to verify them by means of collocation and context analysis, and uses the example cool to show to what extent semantic information can be extracted by automatic statistical methods. The advantages and disadvantages of computer-based analysis are discussed. In addition, the paper shows how empirical lexicographic disambiguation can be validated in a model-driven way. To highlight the differences between conventional descriptive approaches and the new statistical methods, the senses of cool as given in the Duden GWDS (2000) are compared with the senses identified by the Cosmas analysis.
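A statistical collocation analysis of the kind this abstract refers to can be illustrated with a minimal sketch. It does not reproduce Cosmas or its association measure: pointwise mutual information (PMI) over a fixed token window is swapped in as a simple stand-in, and the function name `collocates`, the window size and the toy corpus are assumptions.

```python
import math
from collections import Counter


def collocates(sentences, node, window=2):
    """Rank words found within +/- `window` tokens of `node` by PMI."""
    word_freq = Counter()
    pair_freq = Counter()
    total = 0
    for sent in sentences:
        total += len(sent)
        word_freq.update(sent)
        for i, word in enumerate(sent):
            if word != node:
                continue
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_freq[sent[j]] += 1
    scores = {}
    for word, joint in pair_freq.items():
        # PMI = log2( P(node, word) / (P(node) * P(word)) ),
        # estimated from simple relative frequencies.
        scores[word] = math.log2(joint * total / (word_freq[node] * word_freq[word]))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy corpus; real analyses run over large corpora.
sentences = [
    "the drink is cool and fresh".split(),
    "a cool drink on a hot day".split(),
    "she stayed cool under pressure".split(),
    "he stayed cool and calm".split(),
]
ranked = collocates(sentences, "cool")
```

Grouping the highest-ranked collocates by meaning (for instance temperature words versus composure words in the toy corpus) is one way such output could support sense disambiguation. On a corpus this small the scores are not meaningful in themselves, and PMI is known to overrate rare words.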