Refine
Year of publication
Document Type
- Conference Proceeding (19) (remove)
Has Fulltext
- yes (19)
Is part of the Bibliography
- no (19)
Keywords
- Computerlinguistik (6)
- Korpus <Linguistik> (4)
- Bibliografie (3)
- Diskursanalyse (3)
- Gefangenenliteratur (3)
- GeoBib (3)
- Geoinformationssystem (3)
- Nationalsozialistische Verbrechen (3)
- Visualisierung (3)
- Automatische Spracherkennung (2)
Publicationstate
Reviewstate
- (Verlags)-Lektorat (9)
- Peer-Review (3)
Publisher
- Gesellschaft für Informatik e.V. (3)
- ICCC Press (2)
- Universität Hamburg (2)
- ACM (1)
- DGPF e.V. (1)
- EPFL/UNIL (1)
- FOSSGIS e.V. (1)
- Foi-Commerce (1)
- Institut für Kognitionswissenschaft Universität Osnabrück (1)
- Ljubljana University Press (1)
The present paper reports the first results of the compilation and annotation of a blog corpus for German. The main aim of the project is the representation of the blog discourse structure and relations between its elements (blog posts, comments) and participants (bloggers, commentators). The data included in the corpus were manually collected from the scientific blog portal SciLogs. The feature catalogue for the corpus annotation includes three types of information which is directly or indirectly provided in the blog or can be construed by means of statistical analysis or computational tools. At this point, only directly available information (e.g. title of the blog post, name of the blogger etc.) has been annotated. We believe, our blog corpus can be of interest for the general study of blog structure or related research questions as well as for the development of NLP methods and techniques (e.g. for authorship detection).
Dieser Artikel gibt einen Einblick in das GeoBib-Projekt und die Problematik der Verwendung von historischen Karten und der daraus abgeleiteten Geodaten in einem WebGIS. Das GeoBib-Projekt hat zum Ziel, eine annotierte und georeferenzierte Online-Bibliographie der frühen deutsch- bzw. polnischsprachigen Holocaust- und Lagerliteratur von 1933 bis 1949 bereitzustellen. Zu diesem Zeitraum werden historische Karten und Geodaten gesammelt, aufbereitet und im zugehörigen WebGIS des GeoBib-Portals visualisiert. Eine Besonderheit ist die aufwendige Recherche von Geodaten und Kartenmaterial für den Zeitraum zwischen 1933 und 1949. Die Problematiken bezüglich der Recherche und späteren Visualisierung historischer Geodaten und des Kartenmaterials sind ein Hauptaugenmerk in diesem Artikel. Weiterhin werden Konzepte für die Visualisierung von historischem, unvollständigem Kartenmaterial präsentiert und ein möglicher Lösungsweg für die bestehenden Herausforderungen aufgezeigt.
From Open Source to Open Information. Collaborative Methods in Creating XML-based Markup Languages
(2000)
Knowledge in textual form is always presented as visually and hierarchically structured units of text, which is particularly true in the case of academic texts. One research hypothesis of the ongoing project Knowledge ordering in texts - text structure and structure visualisations as sources of natural ontologies1 is that the textual structure of academic texts effectively mirrors essential parts of the knowledge structure that is built up in the text. The structuring of a modern dissertation thesis (e.g. in the form of an automatically generated table of contents - toes), for example, represents a compromise between requirements of the text type and the methodological and conceptual structure of its subject-matter. The aim of the project is to examine how visual-hierarchical structuring systems are constructed, how knowledge structures are encoded in them, and how they can be exploited to automatically derive ontological knowledge for navigation, archiving, or search tasks. The idea to extract domain concepts and semantic relations mainly from the structural and linguistic information gathered from tables of contents represents a novel approach to ontology learning.
A text parsing component designed to be part of a system that assists students in academic reading an writing is presented. The parser can automatically add a relational discourse structure annotation to a scientific article that a user wants to explore. The discourse structure employed is defined in an XML format and is based the Rhetorical Structure Theory. The architecture of the parser comprises pre-processing components which provide an input text with XML annotations on different linguistic and structural layers. In the first version these are syntactic tagging, lexical discourse marker tagging, logical document structure, and segmentation into elementary discourse segments. The algorithm is based on the shift-reduce parser by Marcu (2000) and is controlled by reduce operations that are constrained by linguistic conditions derived from an XML-encoded discourse marker lexicon. The constraints are formulated over multiple annotation layers of the same text.