Refine
Year of publication
- 2014 (64) (remove)
Document Type
- Article (43)
- Conference Proceeding (13)
- Part of a Book (5)
- Book (2)
- Part of Periodical (1)
Keywords
- Deutsch (24)
- Computerlinguistik (7)
- Korpus <Linguistik> (6)
- Syntax (6)
- Wörterbuch (5)
- Englisch (4)
- Gesprochene Sprache (4)
- Information Extraction (4)
- Natürliche Sprache (4)
- Soziale Wahrnehmung (4)
Publicationstate
- Veröffentlichungsversion (34)
- Zweitveröffentlichung (11)
- Postprint (7)
Reviewstate
- Peer-Review (64) (remove)
Publisher
- De Gruyter (4)
- Erich Schmidt Verlag (3)
- Sage (3)
- Schmidt (3)
- Universitätsverlag Hildesheim (3)
- EURAC Research (2)
- Nodus (2)
- de Gruyter (2)
- Association for Computational Linguistics (1)
- Benjamins (1)
Automatic Food Categorization from Large Unlabeled Corpora and Its Impact on Relation Extraction
(2014)
We present a weakly-supervised induction method to assign semantic information to food items. We consider two tasks of categorizations being food-type classification and the distinction of whether a food item is composite or not. The categorizations are induced by a graph-based algorithm applied on a large unlabeled domain-specific corpus. We show that the usage of a domain-specific corpus is vital. We do not only outperform a manually designed open-domain ontology but also prove the usefulness of these categorizations in relation extraction, outperforming state-of-the-art features that include syntactic information and Brown clustering.
We examine the task of separating types from brands in the food domain. Framing the problem as a ranking task, we convert simple textual features extracted from a domain-specific corpus into a ranker without the need of labeled training data. Such method should rank brands (e.g. sprite) higher than types (e.g. lemonade). Apart from that, we also exploit knowledge induced by semi-supervised graph-based clustering for two different purposes. On the one hand, we produce an auxiliary categorization of food items according to the Food Guide Pyramid, and assume that a food item is a type when it belongs to a category unlikely to contain brands. On the other hand, we directly model the task of brand detection using seeds provided by the output of the textual ranking features. We also harness Wikipedia articles as an additional knowledge source.
We report on the two systems we built for Task 1 of the German Sentiment Analysis Shared Task, the task on Source, Subjective Expression and Target Extraction from Political Speeches (STEPS). The first system is a rule-based system relying on a predicate lexicon specifying extraction rules for verbs, nouns and adjectives, while the second is a translation-based system that has been obtained with the help of the (English) MPQA corpus.
Eine syntaktische Besonderheit der kontinentalwestgermanischen Sprachen ist die Bildung satzfinaler Verbalkomplexe (" ... dass sie das Buch gelesen haben muss"), für die ein hohes Maß an sprach- bzw. dialektübergreifender und idiolektaler Verbstellungsvariation charakteristisch ist. Der niederdeutsche Verbalkomplex gilt in Überblicksdarstellungen als streng kopffinal, wobei bisher – anders als für niederländische und hochdeutsche (besonders: oberdeutsche) Mundarten – kaum empirische Studien vorliegen. Der Aufsatz präsentiert eine deskriptive Analyse des zweigliedrigen Verbalkomplexes im Märkisch-Brandenburgischen, dem südöstlichsten der niederdeutschen Dialektverbände.
Im Gegensatz zum Standarddeutschen und anderen niederdeutschen Mundarten wie dem Nordniederdeutschen, weist das Brandenburgische selbst bei nur zwei verbalen Elementen in der rechten Satzklammer Variation auf ("dass sie lesen kann/kann lesen"). Anhand von Tonaufnahmen aus dem bisher kaum erschlossenen DDR-Korpus wird folgenden Fragen nachgegangen: Welche Verbstellungsvarianten sind in welchen Syntagmen möglich bzw. werden präferiert? Welche Unterschiede bestehen zwischen Haupt- und Nebensatzkomplexen? Wie verhält sich der brandenburgische Verbalkomplex in Bezug auf nicht-verbale Intervenierer (sog. Verb Projection Raising)? Wie verhalten sich Modal- und andere infinitivregierende Verben unter Perfekteinbettung (d.h. in stddt. Ersatzinfinitivkontexten)?
Am Ende steht eine erste typologische Einordnung des brandenburgischen Verbalkomplexes im Vergleich mit anderen kontinentalwestgermanischen Varietäten, wobei sich areallinguistisch interessante Ähnlichkeiten mit dem südlich angrenzenden Ostmitteldeutschen zeigen.
Growing globalisation of the world draws attention to cultural differences between people from different countries or from different cultures within the countries. Notwithstanding the diversity of people’s worldviews, current cross-cultural research still faces the challenge of how to avoid ethnocentrism; comparing Western-driven phenomena with like variables across countries without checking their conceptual equivalence clearly is highly problematic. In the present article we argue that simple comparison of measurements (in the quantitative domain) or of semantic interpretations (in the qualitative domain) across cultures easily leads to inadequate results. Questionnaire items or text produced in interviews or via open-ended questions have culturally laden meanings and cannot be mapped onto the same semantic metric. We call the culture-specific space and relationship between variables or meanings a ’cultural metric’, that is a set of notions that are inter-related and that mutually specify each other’s meaning. We illustrate the problems and their possible solutions with examples from quantitative and qualitative research. The suggested methods allow to respect the semantic space of notions in cultures and language groups and the resulting similarities or differences between cultures can be better understood and interpreted.
Large classes at universities(> 1600 students) create their own challenges for teaching and learning. Audience feedback is lacking and fine tuning of lectures, courses and exam preparation to address individual needs is very difficult to achieve. At RWTH Aachen University, a course concept and a knowledge map learning tool aimed to support individual students to prepare for exams in information science through theme-based exercises were developed and evaluated. The tool was grounded in the notion of self-regul ated learning with the goal of enabling students to learn
independently.
Once a new word or a new meaning is added to a monolingual dictionary, the lexicographer is to provide a definition of this item. This paper focuses on the methodological challenges in writing such definitions. After a short discussion of the central terminology (method and definition), the article describes factors which inform this process: linguistic theories, linguistic and lexicographical methods, and types of definitions. Using the example of elexiko, a dictionary project of the Institute for the German language (IDS) in Mannheim, Germany, the paper finally showcases the compilation of definitions in a monolingual online dictionary of contemporary German.
Seit Jahrzehnten fordern zahlreiche Metalexikografen und Lexikografen immer wieder eine umfangreichere Beschäftigung mit Wörterbüchern im muttersprachlichen Deutschunterricht, auch in der gymnasialen Oberstufe. Trotzdem spielen die Wortschatzarbeit und der Umgang mit Wörterbüchern in Lehrplänen, Didaktiken und Lehrwerken in den meisten Fällen allenfalls eine marginale Rolle. Im Anschluss an eine überblicksartige Bestandsaufnahme dazu untersucht der vorliegende Beitrag, inwieweit elexiko, ein Onlinewörterbuch zur deutschen Gegenwartssprache, sinnvoll in den muttersprachlichen Deutschunterricht der Sekundarstufe II integriert werden könnte. Am Beispiel des Angabebereichs der Bedeutungserläuterung wird überprüft, ob Schüler der gymnasialen Oberstufe als Zielgruppe für elexiko infrage kommen und für welche linguistischen Themen sich die Wortschatzarbeit mit den semantischen Paraphrasen für elexiko anbietet.
Measuring the quality of metadata is only possible by assessing the quality of the underlying schema and the metadata instance. We propose some factors that are measurable automatically for metadata according to the CMD framework, taking into account the variability of schemas that can be defined in this framework. The factors include among others the number of elements, the (re-)use of reusable components, the number of filled in elements. The resulting score can serve as an indicator of the overall quality of the CMD instance, used for feedback to metadata providers or to provide an overview of the overall quality of metadata within a repository. The score is independent of specific schemas and generalizable. An overall assessment of harvested metadata is provided in form of statistical summaries and the distribution, based on a corpus of harvested metadata. The score is implemented in XQuery and can be used in tools, editors and repositories.
Dieser Beitrag zeigt, wie allgemeinsprachige Wörterbücher mit Angaben zur Sinn- und Sachverwandtschaft umgehen sollten, damit sie als geeignetes Hilfsmittel bei der Wortschatzarbeit sowohl im muttersprachlichen als auch im fremdsprachlichen Unterricht eingesetzt werden können. Anhand einiger Beispiele aus dem elexiko-Wörterbuch sollen Möglichkeiten aufgezeigt werden, wie kombinierte lexikalisch-semantische Informationen einen Beitrag zur gezielten Wortschatzerweiterung leisten könnten. Für eine effektive Verankerung sprachlichen und außersprachlichen Wissens sollten Erkenntnisse über das Mentale Lexikon in die Darstellung und Beschreibung von Sprache im Wörterbuch eingebunden werden. Konkrete Vorschläge illustrieren, wie Nachschlagewerke möglicherweise gestaltet werden sollten, um besser als Lehrwerke und Quellen für die Wortschatzarbeit geeignet zu sein. Dafür ist es erforderlich, dass die Dokumentation sprachlicher Zusammenhänge auf unterschiedlichen Ebenen, die angemessene Visualisierung kontextueller Phänomene und explizite Erläuterungen eine entscheidende Rolle spielen
By way of migration, large numbers of German-speaking settlers arrived in Pennsylvania between roughly 1700 and 1750. Pennsylvania German, as a distinct variety, developed through levelling processes from L1 varieties of these migrants who came mainly from the southwestern regions of the German speaking area. Pennsylvania German is still spoken today by specific religious groups (primarily Amish and Menonnite groups) for many of whom it is an identity marker. My paper focuses on those Pennsylvania Germans who are not part of these religious groups but have the same migration history. Due to their being closer to the cultural values of American mainstream society, they were integrated into it, and during the 20th century their use of Pennsylvania German was continually diminishing. A revival of this heritage language has occurred over the past c. three decades, including language courses offered at community colleges, public libraries, etc., where ethnic Pennsylvania Germans wish to (re-)learn the language of their grandparents. Written Pennsylvania German data from four points in time between the 1860s and the 1990s were analysed in this study. Based on these linguistic analyses, differences between the data sets are shown that point towards a diachronic change in the language contact situation of Pennsylvania German speakers. Sociolinguistic and extralinguistic factors are considered that influence the role of PG and make their speakers heritage speakers much in the sense of recent immigrant heritage speakers, although delayed by 200 years.
Previous accounts addressing the question what semantic properties of a matrix predicate determine the possible clause type of the embedded clause have not provided a general answer (e.g. Grimshaw 1979, Zifonun et al. 1997, Ginzburg & Sag 2000). This paper proposes that clause-embedding predicates fulfill characteristic logical conditions, so-called consistency conditions, which rule the syntactic potential of the matrix clause: for instance, the clause type of the embedded clause (declarative, ob- and/or wh-interrogative) and the correlate type, the matrix predicate can co-occur with (es and/or ProPP). Furthermore, they predict the logical forms of legitimate constructions with embedded ob- or wh-interrogatives, respectively, and how a legitimate optional correlate modifies the meaning of the matrix predicate.
Previous accounts addressing the question what semantic properties of a matrix predicate determine the possible clause type of the embedded clause have not provided a general answer (e.g. Grimshaw 1979, Zifonun et al. 1997, Ginzburg & Sag 2000). This paper proposes that clause-embedding predicates fulfill characteristic logical conditions, so-called consistency conditions, which rule the syntactic potential of the matrix clause: for instance, the clause type of the embedded clause (declarative, ob- and/or wh-interrogative) and the correlate type, the matrix predicate can co-occur with (es and/or ProPP). Furthermore, they predict the logical forms of legitimate constructions with embedded ob- or wh-interrogatives, respectively, and how a legitimate optional correlate modifies the meaning of the matrix predicate.
Communication of stereotypes in the classroom: biased language use of German and Turkish adolescents
(2014)
Little is known about the linguistic transmission and maintenance of mutual stereotypes in interethnic contexts. This field study, therefore, investigated the linguistic expectancy bias (LEB) and the linguistic intergroup bias (LIB) among German and Turkish adolescents (13 to 20 years) in the school context. The LEB refers to the general phenomenon of describing stereotypes more abstractly. The LIB is the tendency to use language abstraction for in-group protective reasons. Results revealed an unmoderated LEB, whereas the LIB only occurred when foreigners were in the numerical majority, the classroom composition was perceived as a learning disadvantage, or the interethnic conflict frequency was high. These findings provide first evidence for the use of both LEB and LIB in an interethnic classroom setting.
Bezeichnungen für Personen, die sich nicht in ihrem Heimatland aufhalten (z.B. Migrant, Ausländer, Flüchtling) werden in der Sprachgemeinschaft häufig wertend und kontrovers verwendet. In dem Beitrag wird gezeigt, dass die allgemeinsprachige Lexikografie diesen Aspekt bislang nicht angemessen berücksichtigt – weder in der korpusgestützten, methodischen Erfassung und Analyse von Sprachdaten noch in der beschreibenden Darstellung. Am Beispiel von elexiko werden Ansätze vorgestellt, die das Potenzial besitzen, dieses Desiderat einzulösen.
Dieser Beitrag stellt das Forschungs- und Lehrkorpus Gesprochenes Deutsch (FOLK) und die Datenbank für Gesprochenes Deutsch (DGD) als Instrumente gesprächsanalytischer Arbeit vor. Nach einer allgemeinen Einführung in FOLK und DGD im zweiten Abschnitt werden im dritten Abschnitt die methodischen Beziehungen zwischen Korpuslinguistik und Gesprächsforschung und die Herausforde-rungen, die sich bei der Begegnung dieser beiden Herangehensweisen an authenti-sches Sprachmaterial stellen, kurz skizziert. Der vierte Abschnitt illustriert dann ausgehend vom Beispiel der Formel ich sag mal, wie eine korpus- und datenbankgesteuerte Analyse zur Untersuchung von Gesprächsphänomenen beitragen kann.
Dieser Artikel gibt einen Einblick in das GeoBib-Projekt und die Problematik der Verwendung von historischen Karten und der daraus abgeleiteten Geodaten in einem WebGIS. Das GeoBib-Projekt hat zum Ziel, eine annotierte und georeferenzierte Online-Bibliographie der frühen deutsch- bzw. polnischsprachigen Holocaust- und Lagerliteratur von 1933 bis 1949 bereitzustellen. Zu diesem Zeitraum werden historische Karten und Geodaten gesammelt, aufbereitet und im zugehörigen WebGIS des GeoBib-Portals visualisiert. Eine Besonderheit ist die aufwendige Recherche von Geodaten und Kartenmaterial für den Zeitraum zwischen 1933 und 1949. Die Problematiken bezüglich der Recherche und späteren Visualisierung historischer Geodaten und des Kartenmaterials sind ein Hauptaugenmerk in diesem Artikel. Weiterhin werden Konzepte für die Visualisierung von historischem, unvollständigem Kartenmaterial präsentiert und ein möglicher Lösungsweg für die bestehenden Herausforderungen aufgezeigt.
This paper presents challenges and opportunities resulting from the application of geographical information systems (GIS) in the (digital) humanities. First, we provide an overview of the intersection and interaction between geography (and cartography), and the humanities. Second, the “GeoBib” project is used as a case study to exemplify challenges for such collaborative, interdisciplinary projects, both for the humanists and the geoscientists. Finally, we conclude with an outlook on further applications of GIS in the humanities, and the potential scientific benefit for both sides, humanities and geosciences.
Accurate opinion mining requires the exact identification of the source and target of an opinion. To evaluate diverse tools, the research community relies on the existence of a gold standard corpus covering this need. Since such a corpus is currently not available for German, the Interest Group on German Sentiment Analysis decided to create such a resource and make it available to the research community in the context of a shared task. In this paper, we describe the selection of textual sources, development of annotation guidelines, and first evaluation results in the creation of a gold standard corpus for the German language.
We present the German Sentiment Analysis Shared Task (GESTALT) which consists of two main tasks: Source, Subjective Expression and Target Extraction from Political Speeches (STEPS) and Subjective Phrase and Aspect Extraction from Product Reviews (StAR). Both tasks focused on fine-grained sentiment analysis, extracting aspects and targets with their associated subjective expressions in the German language. STEPS focused on political discussions from a corpus of speeches in the Swiss parliament. StAR fostered the analysis of product reviews as they are available from the website Amazon.de. Each shared task led to one participating submission, providing baselines for future editions of this task and highlighting specific challenges. The shared task homepage can be found at https://sites.google.com/site/iggsasharedtask/.
In recent years, new developments in the area of lexicography have altered not only the management, processing and publishing of lexicographical data, but also created new types of products such as electronic dictionaries and thesauri. These expand th range of possible uses of lexical data and support users with more flexibility, for instance in assisting human translation. In this article, we give a short and easy-to-understand introduction to the problematic nature of the storage, display and interpretation of lexical data. We then describe the main methods and specifications used to build and represent lexical data.
The annotation of parts of speech (POS) in linguistically annotated corpora is a fundamental annotation layer which provides the basis for further syntactic analyses, and many NLP tools rely on POS information as input. However, most POS annotation schemes have been developed with written (newspaper) text in mind and thus do not carry over well to text from other domains and genres. Recent discussions have concentrated on the shortcomings of present POS annotation schemes with regard to their applicability to data from domains other than newspaper text.
h ach KOMM; hör AUF mit dem klEInkram. Die Partikel komm zwischen Interjektion und Diskursmarker
(2014)
Der vorliegende Beitrag beschreibt das Formen-, Funktions- und Bedeutungsspek-trum der Partikel komm im gesprochenen Deutsch. Die Untersuchung zeigt, dass sich alle Verwendungen auf eine gemeinsame Grundfunktion zurückführen lassen, die als 'Aufforderung zum Aktivitätswechsel mit Appell an den common ground' bezeichnet wird. Es wird gezeigt, dass sich weitere, in der Literatur häufig der Partikel selbst zugeschriebene Bedeutungsbestandteile aus dem syntaktischen und sequenziellen Kontext ergeben. Verschiedene Kontexte lassen verschiedene Aspekte des Aktivitätswechsels salient erscheinen, so dass die Aufforderung ent-weder den Beginn einer neuen Handlung oder das Beenden einer vorausgehenden Aktivität fokussiert. Außerdem wird diskutiert, welcher Subklasse der Diskurspartikeln sich komm zuordnen lässt. Es zeigt sich, dass sowohl Merkmale von Dis-kursmarkern als auch von Interjektionen vorliegen, dass die Partikel aber auch von den prototypischen Vertretern beider Kategorien abweichende Merkmale zeigt, so dass vorgeschlagen wird, auf eine Klassifikation unterhalb der Ebene der Diskurspartikel zu verzichten, solange nicht weitere von Imperativen abgeleitete Partikeln (z.B. warte, sag mal) empirisch untersucht sind, mit denen komm möglicherweise eine eigene Subklasse bildet.
Alors que de nombreuses études en analyse conversationnelle se sont intéressées à la manière dont des locuteurs co-construisent un tour de parole (notamment sur le plan syntaxique et prosodique), la façon dont la co-construction est ensuite évaluée n'a pas encore été étudiée en profondeur au sein de la littérature interactionniste. Ici, nous étudions deux pratiques permettant à un locuteur de valider une co-construction, à savoir l'acquiescement simple et l'hétéro-répétition de la complétion. En menant une analyse séquentielle et multimodale de plusieurs séquences de co-construction en français, nous montrons qu’à travers ces deux procédés – qui semblent au premier abord similaires dans leur fonctionnement – les locuteurs effectuent une évaluation très différente : tandis que l'acquiescement simple valide la complétion proposée uniquement comme une version possible, l'hétéro-répétition la valide comme étant une complétion complètement adéquate. Cette contribution met en évidence que les interactants exploitent des ressources audibles aussi bien que visibles afin de manifester si et dans quel sens ils acceptent la complétion de leur tour de parole de la part d’un coparticipant. Nous soulignons l’importance d’étudier en détail les différents formatages possibles des tours évaluant une complétion afin de pouvoir distinguer différentes formes « d’acceptation » et de révéler la manière dont les locuteurs peuvent finement négocier leur position en tant que (co-)auteur ou destinataire d’un tour de parole.
The methods utilized in the area of research into dictionary use are established research methods in the social sciences. After explicating the different steps of a typical empirical investigation, this article provides examples of how these different methods are used in various user studies conducted in the field of using online dictionaries. Thereby, different kinds of data collection (surveys as online questionnaires, log files and eye tracking) as well as different research design structures (for instance, ex-post-facto design or experimental design) are discussed.
Dieser Beitrag geht der Frage nach, wie elexiko als eine Grundlage für Wortschatzübungen im Deutsch als Fremdsprache (bzw. Zweitsprache) Unterricht genutzt werden kann. Ausgegangen wird dabei davon, dass die explizite Wortschatzarbeit im Rahmen von Sprachunterricht, besonders gepaart mit einer gelungen vermittelten sprachbezogenen Landeskunde, das Verstehen der Sprache und die Fähigkeit zur erfolgreichen Kommunikation fördert. Dies setzt voraus, dass Deutschlehrende mit relevantem Sprachmaterial arbeiten, das sich möglichst eng am authentischen Sprachgebrauch orientiert und kulturelles Wissen mit transportiert. Hier bieten korpusgestützt erarbeitete Wörterbücher eine nützliche Quelle. Am Beispiel der im Wörterbuch aufgeführten Kollokationen wird skizziert, wie die Angaben aus diesem Bereich von Deutschlehrenden gewinnbringend für die Erarbeitung von Wortschatzübungen genutzt werden könnten.
Self-Regulated Learning (SRL) is a term that can be used to describe an individual’s ability to develop a skill set allowing him or her to learn in a number of different ways. SRL can also relate to new pedagogical theories that encourage teachers in formal education to motivate and support their students into achieving a high level of self-regulation. This paper reports on the findings of a number of surveys conducted with a wide variety of teachers in different countries, regarding their perceptions of SRL. The results and analysis of these surveys help inform not only the perceptions of SRL amongst teachers but also examine the challenges and opportunities that arise from taking this approach.
This paper seeks to apply the principles of the famous 3-Circle-Model devised for the description of the ecolinguistic position of English world-wide to the position of German around the world.
On the one hand, the 3-Circle-Model for English with its "Inner", "Outer" and "Extended/Expanding" Circles was invented by Kachru in the 1980s and has since then been adopted, refined and criticised by numerous authors. The situation of German world-wide, on the other hand, has only been scarcely discussed in the past 20 years. While the global extension of German is obviously by far weaker than that of English, there are also a number of noteworthy similarities in terms of historical spread and the current position of these two languages.
This paper therefore discusses the analogies of global English and German by establishing three circles for German: the Inner Circle for the core German-speaking area, i.e. Germany, Austria and Switzerland; the Outer Circle including a number of German minority areas (mostly in Europe), and finally the Extended Circle which may be denoted as "Crumbling" rather than "Expanding". The latter comprises traditional German diaspora communities in different parts of the world which either result from migration, but also reflect the previous functions of German as a language of culture and as a lingua franca in regions like Eastern Europe. The paper argues that there are some striking structural similarities, but also shows the limits of this comparison.
We investigate how the granularity of POS tags influences POS tagging, and furthermore, how POS tagging performance relates to parsing results. For this, we use the standard “pipeline” approach, in which a parser builds its output on previously tagged input. The experiments are performed on two German treebanks, using three POS tagsets of different granularity, and six different POS taggers, together with the Berkeley parser. Our findings show that less granularity of the POS tagset leads to better tagging results. However, both too coarse-grained and too fine-grained distinctions on POS level decrease parsing performance.
Schreiben nach Engelbart
(2014)
Douglas Engelbart hat 1968 mit seinem On-Line System das erste Mal gezeigt, wie ein Computer als interaktives Schreibwerkzeug genutzt werden kann. Der Beitrag zeichnet diese Urszene der Textverarbeitung nach, beschreibt die wesentlichen Entwicklungslinien, die das digitale Schreiben seitdem genommen hat, und erläutert die zentralen Konzepte, die es zunehmend prägen: Hybridität, Multimedialität und Sozialität.
Der folgende Artikel ist ein bearbeiteter Auszug aus Henning Lobins “Engelbarts Traum. Wie der Computer uns Lesen und Schreiben abnimmt” Frankfurt am Main / New York: Campus, 2014.
“My Curiosity was Satisfied, but not in a Good Way”: Predicting User Ratings for Online Recipes
(2014)
In this paper, we develop an approach to automatically predict user ratings for recipes at Epicurious.com, based on the recipes’ reviews. We investigate two distributional methods for feature selection, Information Gain and Bi-Normal Separation; we also compare distributionally selected features to linguistically motivated features and two types of frameworks: a one-layer system where we aggregate all reviews and predict the rating vs. a two-layer system where ratings of individual reviews are predicted and then aggregated. We obtain our best results by using the two-layer architecture, in combination with 5 000 features selected by Information Gain. This setup reaches an overall accuracy of 65.60%, given an upper bound of 82.57%.
Badania nad postrzeganiem społecznym wskazują, że osoby uśmiechające się są na licznych wymiarach postrzegane korzystniej aniżeli osoby nieuśmiechające się. Jednakże w niniejszych badaniach twierdzimy, że ta zależność nie zawsze jest pozytywna ponieważ postrzeganie uśmiechu może być zależne od kultury i takich jej wymiarów jak indywidualizm-kolektywizm czy asertywność. Eksperyment przeprowadzony w sześciu krajach (w Polsce, Niemczech, Norwegii, Iranie, USA oraz RPA) pokazał, że osoby uśmiechające się mogą być w kulturach kolektywistycznych i mało asertywnych postrzegane mniej korzystnie od osób nieuśmiechających się. W Niemczech osoby uśmiechnięte zostały ocenione jako bardziej inteligentne, a w Iranie jako mniej inteligentne niż osoby nieuśmiechnięte. Ponadto we wszystkich krajach poza Iranem osoby uśmiechnięte były postrzegane jako bardziej szczere niż osoby nieuśmiechnięte. Dyskutujemy stwierdzone efekty w kontekście zróżnicowania kultur opisanego przez Housea i zespół (2004) oraz przez Hofstedego (2001).
Studies on social perception reveal that on many dimensions, smiling individuals are perceived more positively in comparison with non-smiling individuals. The experiment carried out in seven countries (China, Germany, Iran, Norway, Poland, USA, and the Republic of South Africa) showed that in some cultures, smiling individuals may be perceived less favorably than non-smiling individuals. We compared ratings of intelligence made by participants viewing photos of smiling and non-smiling people. The results showed that smiling individuals were perceived as more intelligent in Germany and in China; smiling individuals were perceived as less intelligent than the (same) non-smiling individuals in Iran. We suggest that the obtained effects can be explained by the cultural diversity within the dimension of uncertainty avoidance described in the GLOBE (Global Leadership and Organizational Behavior Effectiveness) project by House, Hanges, Javidan, Dorfman, and Gupta.
Vorwort
(2014)
By evaluating two corpora containing linguistic data on spoken standard language usage (with a total of 770 speakers), the current range of variation of lexical stress in loanwords will be analyzed. In doing so, the focus will be on the age and background of the speakers to be able to document processes of linguistic change and regionalisms. Regarding the phenomenon studied here, it becomes apparent that more detailed and multicausal separate analyses are required to interpret the results conclusively in spite of an overall trend that was at irst convincing (and that would support the theoretical assumptions concerning the loanwordʼs age and the source language inluencing the rate of assimilation). The results of the individual analyses contradict the assumed “overall trend”. One of the corpora was collected by experienced ield workers, while the other was collected by students. By comparing both corpora, some light can be shed onto the question as to what extent “undirected” and less rigidly collected data can support or complement more extensive and costly research projects.
Prejudice against a social group may lead to discrimination of members of this group. One very strong cue of group membership is a (non)standard accent in speech. Surprisingly, hardly any interventions against accent-based discrimination have been tested. In the current article, we introduce an intervention in which what participants experience themselves unobtrusively changes their evaluations of others. In the present experiment, participants in the experimental condition talked to a confederate in a foreign language before the experiment, whereas those in the control condition received no treatment. Replicating previous research, participants in the control condition discriminated against Turkish-accented job candidates. In contrast, those in the experimental condition evaluated Turkish- and standard-accented candidates as similarly competent. We discuss potential mediating and moderating factors of this effect.
Die vorliegende empirische Untersuchung befasst sich mit einer Umfrage zur Wörterbuchbenutzung bei 41 Studentinnen und Studenten des Dipartimento di Filologia, Letteratura e Linguistica der Universität Pisa, dasselbe Department, an dem auch das deutsch-italienische sprachwissenschaftliche Online-Wörterbuch DIL erarbeitet worden ist (vgl. Flinz: 2011). Die schriftliche Umfrage wurde in Anlehnung an Hartmanns 5. Hypothese „An analysis of users´ needs should precede dictionary design“ (1989) durchgeführt. Die wichtigsten Ergebnisse waren von großer Bedeutung für die Gestaltung der makro- und mikrostrukturellen Eigenschaften des Fachwörterbuches. Die Ergebnisse der Untersuchung und die daraus folgenden Reflektionen werden in thematischen Kernblöcken vorgestellt.
We present a technique called event mapping that allows to project text representations into event lists, produce an event table, and derive quantitative conclusions to compare the text representations. The main application of the technique is the case where two classes of text representations have been collected in two different settings (e.g., as annotations in two different formal frameworks) and we can compare the two classes with respect to their systematic differences in the event table. We illustrate how the technique works by applying it to data collected in two experiments (one using annotations in Vladimir Propp’s framework, the other using natural language summaries).
We continue the study of the reproducibility of Propp’s annotations from Bod et al. (2012). We present four experiments in which test subjects were taught Propp’s annotation system; we conclude that Propp’s system needs a significant amount of training, but that with sufficient time investment, it can be reliably trained for simple tales.
So far, there have been few descriptions on creating structures capable of storing lexicographic data, ISO 24613:2008 being one of the latest. Another one is by Spohr (2012), who designs a multifunctional lexical resource which is able to store data of different types of dictionaries in a user-oriented way. Technically, his design is based on the principle of a hierarchical XML/OWL (eXtensible Markup Language/Web Ontology Language) representation model. This article follows another route in describing a model based on entities and relations between them; MySQL (usually referred to as: Structured Query Language) describes a database system of tables containing data and definitions of relations between them. The model was developed in the context of the project "Scientific eLexicography for Africa" and the lexicographic database to be built thereof will be implemented with MySQL. The principles of the ISO model and of Spohr's model are adhered to with one major difference in the implementation strategy: we do not place the lemma in the centre of attention, but the sense description — all other elements, including the lemma, depend on the sense description. This article also describes the contained lexicographic data sets and how they have been collected from different sources. As our aim is to compile several prototypical internet dictionaries (a monolingual Northern Sotho dictionary, a bilingual learners' Xhosa–English dictionary and a bilingual Zulu–English dictionary), we describe the necessary microstructural elements for each of them and which principles we adhere to when designing different ways of accessing them. We plan to make the model and the (empty) database with all graphical user interfaces that have been developed, freely available by mid-2015.
This paper describes a first version of an integrated e-dictionary translating possessive constructions from English to Zulu. Zulu possessive constructions are difficult to learn for non-mother tongue speakers. When translating from English into Zulu, a speaker needs to be acquainted with the nominal classification of nouns indicating possession and possessor. Furthermore, (s)he needs to be informed about the morpho-syntactic rules associated with certain combinations of noun classes. Lastly, knowledge of morpho-phonetic changes is also required, because these influence the orthography of the output word forms. Our approach is a novel one in that we combine e-lexicography and natural language processing by developing a (web) interface supporting learners, as well as other users of the dictionary to produce Zulu possessive constructions. The final dictionary that we intend to develop will contain several thousand nouns which users can combine as they wish. It will also translate single words and frequently used multiword expressions, and allow users to test their own translations. On request, information about the morpho-syntactic and morpho-phonetic rules applied by the system are displayed together with the translation. Our approach follows the function theory: the dictionary supports users in text production, at the same time fulfilling a cognitive function.