TY  - CHAP
U1  - Conference publication
A1  - Diewald, Nils
A1  - Hanl, Michael
A1  - Margaretha, Eliza
A1  - Bingel, Joachim
A1  - Kupietz, Marc
A1  - Bański, Piotr
A1  - Witt, Andreas
ED  - Calzolari, Nicoletta
ED  - Choukri, Khalid
ED  - Declerck, Thierry
ED  - Goggi, Sara
ED  - Grobelnik, Marko
ED  - Maegaard, Bente
ED  - Mariani, Joseph
ED  - Mazo, Helene
ED  - Moreno, Asuncion
ED  - Odijk, Jan
ED  - Piperidis, Stelios
T1  - KorAP architecture – diving in the deep sea of corpus data
T2  - Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia
N2  - KorAP is a corpus search and analysis platform, developed at the Institute for the German Language (IDS). It supports very large corpora with multiple annotation layers, multiple query languages, and complex licensing scenarios. KorAP’s design aims to be scalable, flexible, and sustainable to serve the German Reference Corpus DEREKO for at least the next decade. To meet these requirements, we have adopted a highly modular microservice-based architecture. This paper outlines our approach: an architecture consisting of small components that are easy to extend, replace, and maintain. The components include a search backend, a user and corpus license management system, and a web-based user frontend. We also describe a general corpus query protocol used by all microservices for internal communications. KorAP is open source, licensed under BSD-2, and available on GitHub.
KW  - Corpus analysis platform (KorAP)
KW  - Institut für Deutsche Sprache
KW  - Text linguistics
KW  - Corpus
KW  - microservices
KW  - large corpus data
Y1  - 2016
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-50361
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-50361
SN  - 978-2-9517408-9-1
SB  - 978-2-9517408-9-1
SP  - 3586
EP  - 3591
PB  - European Language Resources Association (ELRA)
CY  - Paris
ER  -