In the context of a Nordic Conference on Bilingualism, it can be a rewarding task to look at issues such as language planning, policy and legislation from the perspective of the southern neighbours of the Nordic world. This paper therefore intends to draw attention to a case of societal multilingualism at the periphery of the Nordic world by dealing with recent developments in language policy and legislation with regard to the North Frisian speech community in the German Land of Schleswig-Holstein. As I will show, there are striking differences in the discourse on minority protection and language legislation between the Nordic countries and a cultural area which may arguably be considered to be part of the Nordic fringe - and which itself occasionally takes Scandinavia as a reference point, e.g. in the recent adoption of a pan-Frisian flag modelled on the Nordic cross (Falkena 2006).
The main focus of the paper will be on the Frisian Act which was passed in the Parliament of Schleswig-Holstein in late 2004. It provides a certain legal basis for some political activities with regard to Frisian, but falls short of creating a true spirit of minority language protection and/or revitalisation. In contrast to the traditions of the German and Danish minorities along the German-Danish border and to minority protection in Northern Scandinavia (in particular to Sámi language rights), the approach chosen in the Frisian Act is extremely weak and has no connotation of long-term oriented language-planning, let alone a rights-based perspective.
The paper will then look at policy developments in the time since the Act was passed, e.g. in the Schleswig-Holstein election campaign in 2005, and at the latest perceptions of the Frisian language situation in the discourse on North Frisian policy in Schleswig-Holstein majority society. In the final part of the paper, I will discuss reasons for the differences in minority language policy discourse between Germany and the Nordic countries, and try to provide an outlook on how Frisian could benefit from its geographic proximity to the Nordic world.
Current Natural Language Processing (NLP) systems feature high-complexity processing pipelines that require the use of components at different levels of linguistic and application-specific processing. These components often have to interface with external libraries, e.g. for machine learning and information retrieval, as well as with tools for human annotation and visualization. At the UKP Lab, we are working on the Darmstadt Knowledge Processing Software Repository (DKPro) (Gurevych et al., 2007a; Müller et al., 2008) to create a highly flexible, scalable and easy-to-use toolkit that allows rapid creation of complex NLP pipelines for semantic information processing on demand. The DKPro repository consists of several main parts created to serve the purposes of different NLP application areas.
In this paper we investigate the coverage of the two knowledge sources WordNet and Wikipedia for the task of bridging resolution. We report on an annotation experiment which yielded pairs of bridging anaphors and their antecedents in spoken multi-party dialog. Manual inspection of the two knowledge sources showed that, with some interesting exceptions, Wikipedia is superior to WordNet when it comes to the coverage of information necessary to resolve the bridging anaphors in our data set. We further describe a simple procedure for the automatic extraction of the required knowledge from Wikipedia by means of an API, and discuss some of the implications of the procedure’s performance.
The thesis describes a fully automatic system for the resolution of the pronouns 'it', 'this', and 'that' in unrestricted English multi-party dialog. Referential relations considered include both normal NP-antecedence as well as discourse-deictic pronouns. The thesis contains a theoretical part with a comprehensive empirical study, and a practical part describing machine learning experiments.
In this paper, we present a suite of flexible UIMA-based components for information retrieval research which have been successfully used (and re-used) in several projects in different application domains. Implementing the whole system as UIMA components is beneficial for configuration management, component reuse, implementation costs, analysis and visualization.
Lexicography
(2008)
Lexicon schemas and their use are discussed in this paper from the perspective of lexicographers and field linguists. A variety of lexicon schemas have been developed, with goals ranging from computational lexicography (DATR) through archiving (LIFT, TEI) to standardization (LMF, FSR). A number of requirements for lexicon schemas are given. The lexicon schemas are introduced and compared to each other in terms of conversion and usability for this particular user group, using a common lexicon entry and providing examples for each schema under consideration. The formats are assessed and the final recommendation is given for the potential users, namely to request standard compliance from the developers of the tools used. This paper should foster a discussion between authors of standards, lexicographers and field linguists.
Our research task consists in the study of the way in which multilingual resources are mobilized in team work within collaborative activities; how they are exploited in a specific way in order both to enhance collaboration and to respect the specificities of the members’ linguistic competences and practices within the team. Central to our analytical work, which is inspired by ethnomethodological conversation analysis, is the relationship between multilingual resources and the situated organization of linguistic uses and of social practices. These two aspects are reflexively articulated, multilingual resources being shaped by the very contexts of their use and activities being constrained and thus structured by the available resources.
Although there is a growing interest among policy makers in higher education issues (especially on an international scale), there is still a lack of theoretically well-grounded comparative analyses of higher education policy. Even broadly discussed topics in higher education research, like the potential convergence of European higher education systems in the course of the Bologna Process, suffer from a thin empirical and comparative basis. This paper aims to deal with these problems by addressing theoretical questions concerning the domestic impact of the Bologna Process and the role national factors play in determining its effects on cross-national policy convergence. It develops a distinct theoretical approach for the systematic and comparative analysis of cross-national policy convergence. In doing so, it relies upon insights from related research areas, namely literature on Europeanization as well as studies dealing with cross-national policy convergence.
One problem of data-driven answer extraction in open-domain factoid question answering is that the class distribution of labeled training data is fairly imbalanced. In an ordinary training set, there are far more incorrect answers than correct answers. The class imbalance is thus inherent to the classification task. It has a detrimental effect on the performance of classifiers trained by standard machine learning algorithms, which usually have a heavy bias towards the majority class, i.e. the class which occurs most often in the training set. In this paper, we propose a method to tackle class imbalance by applying a form of cost-sensitive learning, which is preferable to sampling. We present a simple but effective way of estimating the misclassification costs on the basis of class distribution. This approach offers three benefits. Firstly, it maintains the distribution of the classes of the labeled training data. Secondly, this form of meta-learning can be applied to a wide range of common learning algorithms. Thirdly, this approach can be easily implemented with the help of state-of-the-art machine learning software.
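The idea of deriving misclassification costs from the class distribution can be illustrated with a minimal sketch. The abstract does not give the authors' actual formula, so the inverse-frequency weighting below is an assumption, as are the function names `estimate_costs` and `min_cost_class` and the toy 95/5 label split. The training data is left untouched; only the decision rule changes, by choosing the class with the lowest expected cost.

```python
from collections import Counter


def estimate_costs(labels):
    """Assumed heuristic: the cost of misclassifying an instance of a class
    is inversely proportional to that class's frequency in the training set."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


def min_cost_class(probs, costs):
    """Return the class whose prediction has the lowest expected cost.

    probs maps each class to the classifier's estimated probability.
    Predicting class c is wrong whenever the true class k differs from c,
    which incurs the cost attached to k.
    """
    def expected_cost(predicted):
        return sum(p * costs[k] for k, p in probs.items() if k != predicted)

    return min(probs, key=expected_cost)


# Toy training set (hypothetical): far more incorrect than correct answers.
labels = ["incorrect"] * 95 + ["correct"] * 5
costs = estimate_costs(labels)

# A plain argmax would pick "incorrect" here; the cost-sensitive rule
# favours the rare class because missing it is more expensive.
decision = min_cost_class({"incorrect": 0.8, "correct": 0.2}, costs)
print(costs, decision)
```

Because the costs are applied at decision time as a wrapper around the classifier's probability estimates, the same rule can be placed on top of any learner that outputs class probabilities, which matches the meta-learning character described above.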