Publisher: Institut für Deutsche Sprache
Contents:
1. Andreas Dittrich: Intra-connecting a small exemplary literary corpus with semantic web technologies for exploratory literary studies, p. 1
2. John Kirk, Anna Čermáková: From ICE to ICC: The new International Comparable Corpus, p. 7
3. Dawn Knight, Tess Fitzpatrick, Steve Morris, Jeremy Evas, Paul Rayson, Irena Spasic, Mark Stonelake, Enlli Môn Thomas, Steven Neale, Jennifer Needs, Scott Piao, Mair Rees, Gareth Watkins, Laurence Anthony, Thomas Michael Cobb, Margaret Deuchar, Kevin Donnelly, Michael McCarthy, Kevin Scannell: Creating CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – The National Corpus of Contemporary Welsh), p. 13
4. Marc Kupietz, Andreas Witt, Piotr Bański, Dan Tufiş, Dan Cristea, Tamás Váradi: EuReCo - Joining Forces for a European Reference Corpus as a sustainable base for cross-linguistic research, p. 15
5. Harald Lüngen, Marc Kupietz: CMC Corpora in DeReKo, p. 20
6. David McClure, Mark Algee-Hewitt, Douris Steele, Erik Fredner, Hannah Walser: Organizing corpora at the Stanford Literary Lab, p. 25
7. Radoslav Rábara, Pavel Rychlý, Ondřej Herman: Accelerating corpus search using multiple cores, p. 30
8. John Vidler, Stephen Wattam: Keeping Properties with the Data: CL-MetaHeaders – An Open Specification, p. 35
9. Vladimir Benko: Are Web Corpora Inferior? The Case of Czech and Slovak, p. 43
10. Edyta Jurkiewicz-Rohrbacher, Zrinka Kolaković, Björn Hansen: Web Corpora – the best possible solution for tracking phenomena in underresourced languages: clitics in Bosnian, Croatian and Serbian, p. 49
11. Vít Suchomel: Removing Spam from Web Corpora Through Supervised Learning Using FastText, p. 56
The evolution of computer technologies and the introduction of the World Wide Web (WWW) have substantially changed the way scientific articles and books are published today. Besides writing for "traditional" print media, more and more authors choose to reach a larger audience and shorten distribution time by offering their works on the internet. The electronic medium not only facilitates the spread of information, it also adds new value by extending the possibilities of knowledge retrieval. The same is true for structured data collections such as scientific glossaries, dictionaries, or bibliographies, which profit from the web particularly when they are accessible via user-friendly and effective frontends. The following chapters deal with the transformation of the Bibliography of German Grammar (“Bibliografie zur deutschen Grammatik”) from a data pool used primarily for print publishing into a relational database application that offers a basis for media-independent distribution. After a short description of the beginnings of the bibliography, this article focuses on explaining our current database design and presenting the web-based user interface.