Uncertain about Uncertainty: Different ways of processing fuzziness in digital humanities data
(2014)
The GeoBib project is constructing a georeferenced online bibliography of early Holocaust and camp literature published between 1933 and 1949 (Entrup et al. 2013a). Our immediate objectives include identifying the texts of interest in the first place, composing abstracts for them, researching their history, and annotating relevant places and times. Relations between persons, texts, and places will be visualized using digital maps and GIS software as an integral part of the resulting GeoBib information portal. The combination of diverse data from varying sources not only enriches our knowledge of these otherwise mostly forgotten texts; it also confronts us with vague, uncertain or even conflicting information. This situation yields challenges for all researchers involved – historians, literary scholars, geographers and computer scientists alike. While the project operates at the intersection of historical and literary studies, the involved computer scientists are in charge of providing a working environment (Entrup et al. 2013b) and processing the collected information in a way that is formalized yet capable of dealing with inevitable vagueness, uncertainty and contradictions. In this paper we focus on the problems and opportunities of encoding and processing fuzzy data.
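As a rough illustration of what encoding fuzzy data can involve, the sketch below models an uncertain date as a closed interval and checks whether two sources agree. It is not the GeoBib data model: the class `FuzzyDate`, its fields, and the function `reconcile` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FuzzyDate:
    """A date known only to lie within [earliest, latest] (hypothetical model)."""
    earliest: date
    latest: date
    source: str = "unknown"  # where the claim comes from (assumed field)

    def __post_init__(self) -> None:
        if self.earliest > self.latest:
            raise ValueError("earliest must not be after latest")

    def overlaps(self, other: "FuzzyDate") -> bool:
        # Two closed intervals overlap if each starts before the other ends.
        return self.earliest <= other.latest and other.earliest <= self.latest


def reconcile(a: FuzzyDate, b: FuzzyDate) -> Optional[FuzzyDate]:
    """Return the narrowed interval if both claims are compatible, else None."""
    if not a.overlaps(b):
        return None  # conflicting information: keep both claims for review
    return FuzzyDate(
        max(a.earliest, b.earliest),
        min(a.latest, b.latest),
        source=f"{a.source}+{b.source}",
    )


memoir = FuzzyDate(date(1945, 1, 1), date(1945, 12, 31), "memoir")
archive = FuzzyDate(date(1945, 4, 1), date(1946, 3, 31), "archive")
narrowed = reconcile(memoir, archive)
```

Under this assumption, compatible claims narrow the interval, while contradictory claims return `None` and stay visible as a conflict instead of being silently overwritten.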
We present a text parsing component designed as part of a system that assists students in academic reading and writing. The parser automatically adds a relational discourse structure annotation to a scientific article that a user wants to explore. The discourse structure is defined in an XML format and is based on Rhetorical Structure Theory. The parser's architecture comprises pre-processing components that provide an input text with XML annotations on different linguistic and structural layers. In the first version these are syntactic tagging, lexical discourse marker tagging, logical document structure, and segmentation into elementary discourse segments. The algorithm is based on the shift-reduce parser by Marcu (2000) and is controlled by reduce operations that are constrained by linguistic conditions derived from an XML-encoded discourse marker lexicon. The constraints are formulated over multiple annotation layers of the same text.
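The idea of reduce operations licensed by a discourse marker lexicon can be sketched in a few lines. This is a heavily simplified toy, not Marcu's (2000) algorithm or the parser described above: the lexicon entries, the `Segment` and `Relation` classes, and the fallback `joint` relation are all assumptions made for illustration, and the input is a plain list instead of layered XML annotations.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

# Hypothetical marker lexicon: discourse marker -> relation name.
MARKER_LEXICON = {"because": "cause", "but": "contrast"}


@dataclass
class Segment:
    text: str
    marker: Optional[str] = None  # discourse marker introducing the segment


@dataclass
class Relation:
    name: str
    left: "Node"
    right: "Node"


Node = Union[Segment, Relation]


def leading_marker(node: Node) -> Optional[str]:
    """Marker of the leftmost elementary segment under a node."""
    while isinstance(node, Relation):
        node = node.left
    return node.marker


def parse(segments: List[Segment]) -> Node:
    if not segments:
        raise ValueError("need at least one segment")
    stack: List[Node] = []
    queue = list(segments)
    while queue or len(stack) > 1:
        if len(stack) >= 2:
            # Reduce only if the lexicon licenses a relation for the top node.
            name = MARKER_LEXICON.get(leading_marker(stack[-1]) or "")
            if name:
                right, left = stack.pop(), stack.pop()
                stack.append(Relation(name, left, right))
                continue
        if queue:
            stack.append(queue.pop(0))  # shift the next segment
        else:
            # Nothing left to shift and no licensed reduce: assumed fallback.
            right, left = stack.pop(), stack.pop()
            stack.append(Relation("joint", left, right))
    return stack[0]


tree = parse([
    Segment("The run failed"),
    Segment("because the input was malformed", "because"),
    Segment("but the retry succeeded", "but"),
])
```

Here the constraint check is a single dictionary lookup. In a layered setting the same check would presumably consult several annotation layers before permitting a reduce.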
Managing electronic publications in the Information Era brings together old and new problems, especially those related to Information Retrieval and Automatic Knowledge Extraction. This article presents an Information Retrieval System that uses Natural Language Processing and an ontology to index the texts of a collection. We describe a system that constructs a domain-specific ontology, starting from syntactic and semantic analyses of the texts in the collection. First the texts are tokenized, then a robust syntactic analysis is performed, and finally a semantic analysis is carried out in accordance with a knowledge representation metalanguage based on a basic ontology of 47 classes. The automatically extracted ontology yields richer domain-specific knowledge. Through its semantic net, it helps users find suitable terms for querying the texts more efficiently and quickly. A prototype of this system was built and used to index a collection of 221 electronic texts on Information Science written in Brazilian Portuguese. Rather than relying on statistical theories, we propose a robust Information Retrieval System based on cognitive theories, allowing more efficient answers to users' queries.
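To make the role of the semantic net in retrieval more concrete, the following toy sketch expands a query term with related terms before looking it up in an inverted index. It does not reproduce the system described above: the `RELATED` table stands in for an automatically extracted ontology, the tokenizer is a plain regular expression with no syntactic or semantic analysis, and all names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical stand-in for an extracted ontology: term -> related terms.
RELATED: Dict[str, Set[str]] = {
    "thesaurus": {"ontology"},
    "ontology": {"thesaurus", "taxonomy"},
}


def tokenize(text: str) -> List[str]:
    """Lowercase word tokens; a real system would do far more here."""
    return re.findall(r"\w+", text.lower())


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Inverted index: token -> ids of the documents containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Match each query term and its related terms against the index."""
    hits: Set[int] = set()
    for term in tokenize(query):
        for candidate in {term} | RELATED.get(term, set()):
            hits |= index.get(candidate, set())
    return hits


docs = {
    1: "Ontology construction for libraries",
    2: "Statistical ranking models",
    3: "A thesaurus of indexing terms",
}
index = build_index(docs)
results = search(index, "thesaurus")
```

With this table, a query for "thesaurus" also retrieves the document that only mentions "ontology", which is the kind of term suggestion a semantic net can offer.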
Various basic syntactic structures have been proposed for coordinate constructions. What all these approaches have in common is that they cannot plausibly explain the incremental processing of these constructions, although there is evidence that coordination is by no means a genuinely structural phenomenon but rather one that emerges from the principles of incremental processing. The processing model sketched here is therefore based on the assumption that, in coordination, syntactic structures are used multiple times and must be processed with respect to different so-called projections. This assumption makes it possible to trace the variety of deletion and reduction phenomena that occur in coordination back to the realization of coordinate structures with respect to their different projections.