Refine
Document Type
- Part of a Book (2)
- Conference Proceeding (2)
- Working Paper (1)
Has Fulltext
- yes (5) (remove)
Keywords
- Institut für Deutsche Sprache <Mannheim> (5) (remove)
Publicationstate
Reviewstate
Publisher
This paper describes the efforts in the field of sustainability of the Institut für Deutsche Sprache (IDS) in Mannheim with respect to DEREKO (Deutsches Referenzkorpus) the Archive of General Reference Corpora of Contemporary Written German. With focus on re-usability and sustainability, we discuss its history and our future plans. We describe legal challenges related to the creation of a large and sustainable resource; sketch out the pipeline used to convert raw texts to the final corpus format and outline migration plans to TEI P5. Due to the fact, that the current version of the corpus management and query system is pushed towards its limits, we discuss the requirements for a new version which will be able to handle current and future DEREKO releases. Furthermore, we outline the institute’s plans in the field of digital preservation.
The present article describes the first stage of the KorAP project, launched recently at the Institut für Deutsche Sprache (IDS) in Mannheim, Germany. The aim of this project is to develop an innovative corpus analysis platform to tackle the increasing demands of modern linguistic research. The platform will facilitate new linguistic findings by making it possible to manage and analyse primary data and annotations in the petabyte range, while at the same time allowing an undistorted view of the primary linguistic data, and thus fully satisfying the demands of a scientific tool. An additional important aim of the project is to make corpus data as openly accessible as possible in light of unavoidable legal restrictions, for instance through support for distributed virtual corpora, user-defined annotations and adaptable user interfaces, as well as interfaces and sandboxes for user-supplied analysis applications. We discuss our motivation for undertaking this endeavour and the challenges that face it. Next, we outline our software implementation plan and describe development to-date.
This paper describes the effort of the Institut für Deutsche Sprache (IDS), the central research institution for the German language, connected with Information and Communications Technology (ICT). Use of ICT in a language research institute is twofold. On the one hand, ICT provides basic services for researches to accomplish their daily work. On the other hand, several national and international institutions have a strong interest in ICT. Therefore, ICT can also be seen as an amplifier for language research. The first part of this paper reports on the activates of the IDS in internal and external ICT-related projects and initiatives. The second part describes a general strategy towards an ICT strategy that could be useful both for the IDS and other national language institutes. We think such a general strategy is necessary to create a strong foundation not only for the ICT-related projects, but as a basis for a modem research institute.