TY  - CHAP
U1  - Conference publication
A1  - Brunner, Annelen
A1  - Steyer, Kathrin
T1  - Corpus-driven study of multi-word expressions based on collocations from a very large corpus
T2  - Proceedings of the 4th Corpus Linguistics conference, Birmingham
N2  - We present a corpus-driven approach to the study of multi-word expressions, which constitute a significant part of. As a data basis, we use collocation profiles computed from DeReKo (Deutsches Referenzkorpus), the largest available collection of written German, which has approximately two billion word tokens and is located at the Institute for the German Language (IDS). We employ a strongly usage-based approach to multi-word expressions, which we think of as conventionalised patterns in language use that manifest themselves in recurrent syntagmatic patterns of words. They are defined by their distinct function in language. To find multi-word expressions, we allow ourselves to be guided by corpus data and statistical evidence as much as possible, making interpretative steps carefully and in a monitored fashion. We develop a procedure of interpretation that leads us from the evidence of collocation profiles to a collection of recurrent word patterns and finally to multi-word expressions. When building up a collection of multi-word expressions in this fashion, it becomes clear that the expressions can be defined on different levels of generalisation and are interrelated in various ways. This will be reflected in the documentation and presentation of the findings. We are planning to add annotation in a way that allows grouping the multi-word expressions according to different features and to add links between them to reflect their relationships, thus constructing a network of multi-word expressions.
KW  - German
KW  - Collocation
KW  - Corpus
KW  - Language statistics
Y1  - 2007
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-41414
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-41414
SP  - 12
S1  - 12
PB  - University of Birmingham
CY  - Birmingham
ER  -