400 Language, Linguistics
This contribution summarizes the lessons learned from the organization of a joint conference on text analytics research by the Business, Economic, and Related Data (BERD@NFDI) and Text+ consortia within the National Research Data Infrastructure (NFDI) in Germany. The collaboration aimed to identify common ground and foster interdisciplinary dialogue between scholars in the humanities and in the business domain. The lessons learned include the importance of presenting research questions using textual data to establish common ground, similarities in methodology for processing textual data between the consortia, similarities in research data management, and the need for regular interconsortial discussions on textual analysis methods and data. The collaboration proved valuable for interdisciplinary dialogue within the NFDI, and further collaboration between the consortia is planned.
"Reproducibility crisis" and "empirical turn" are only two of the keywords cited as reasons for research data management. Research data are omnipresent, and as data processing procedures become increasingly automated, they become even more important. However, the fact that new methods require and produce data does not mean that data are easily accessible or reusable, or that they make a difference in a researcher's CV, even if a large portion of research goes into data creation, acquisition, preparation, and analysis. In this talk I will show where we find data in the research process and where we may find appropriate support for data management, and I will advocate for a procedure for including data in research publications and resumes.
This presentation relies on work within the BMBF-funded project CLARIN-D. It also builds on work within the German National Research Data Infrastructure (NFDI) consortium Text+, DFG project number 460033370.
The CLARIN Concept Registry (CCR) is the common semantic ground for most CMDI-based profiles to describe language-related resources in the CLARIN universe. While the CCR supports semantic interoperability within this universe, it does not extend beyond it. The flexibility of CMDI, however, allows users to use other term or concept registries when defining their metadata components. In this paper, we describe our use of schema.org, a light ontology used by many parties across disciplines.
In Text+, the NFDI consortium focused on the research data of language- and text-based disciplines, authority data play a central role in the interoperable description and semantic linking of distributed data sources. In particular, the Gemeinsame Normdatei (GND, Integrated Authority File) is an important hub at the centre of an emerging, cross-domain knowledge graph. Within Text+, this function is to be further developed and expanded by establishing a GND agency for language- and text-based research data. The aim is to create low-threshold, quality-assured opportunities for researchers to participate and, at the same time, to increase the GND's degree of interlinking, including through terminology mappings. Specific requirements and usage practices are illustrated using the data domains of Text+.
This poster summarizes the results of the CLARIAH-DE Work Package 5 - Community Engagement: Outreach/Dissemination and Liaison.
Work Package 5 engages with the community through dissemination activities, outreach and liaison. The work package set itself the following subgoals:
- Combining the existing dissemination and outreach activities of CLARIN-D and DARIAH-DE in a meaningful way and elaborating on them. In some cases this meant continuity, in other cases a new appearance for resources.
- Providing a web portal as a gateway to the CLARIAH-DE project.
- Creating a common identity and corporate identity and maintaining the established level of trust that users already place in CLARIN-D and DARIAH-DE.
- Providing a social media presence as well as a physical presence at workshops, conferences and other meetings in the Digital Humanities.
The CLARIN infrastructure as an interoperable language technology platform for SSH and beyond
(2023)
CLARIN is a European Research Infrastructure Consortium developing and providing a federated and interoperable platform to support scientists in the field of the Social Sciences and Humanities in carrying out language-related research. This contribution provides an overview of the entire infrastructure with a particular focus on tool interoperability, ease of access to research data, tools and services, the importance of sharing knowledge within and across (national) communities, and community building. By taking into account FAIR principles from the very beginning, CLARIN has become a successful example of a research infrastructure that is actively used by its members. The benefits CLARIN members reap from their infrastructure secure a future for their common good that is both sustainable and attractive to partners beyond the original target groups.
This chapter will present lessons learned from CLARIN-D, the German CLARIN national consortium. Members of the CLARIN-D communities and of the CLARIN-D consortium have been engaged in innovative, data-driven, and community-based research, using language resources and tools in the humanities and neighbouring disciplines. We will present different use cases and users’ stories that demonstrate the innovative research potential of large digital corpora and lexical resources for the study of language change and variation, for language documentation, for literary studies, and for the social sciences. We will emphasize the added value of making language resources and tools available in the CLARIN distributed research infrastructure and will discuss legal and ethical issues that need to be addressed in the use of such an infrastructure. Innovative technical solutions for accessing digital materials still under copyright and for data mining such materials will be presented. We will outline the need for close interaction with communities of interest in the areas of curriculum development, data management, and training the next generation of digital humanities scholars. The importance of community-supported standards for encoding language resources and the practice of community-based quality control for digital research data will be presented as a crucial step toward the provisioning of high quality research data. The chapter will conclude with a discussion of important directions for innovative research and for supporting infrastructure development over the next decade and beyond.
Lexical resources are often represented in table form, e.g., in relational databases, or in specially marked-up texts, for example, in document-based XML models. This paper describes how lexical structures can be modelled as graphs, how this model can be used to exploit existing lexical resources, and how different types of lexical resources can be combined.
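As a rough illustration of the idea in the abstract above, the following Python sketch treats a lexicon as a directed graph with labelled edges, loads one table-style row into it, and merges a second resource. It is not the paper's formalism: the class `LexiconGraph`, its methods, the edge labels, and the sample entries are all hypothetical names chosen for this example.

```python
from collections import defaultdict


class LexiconGraph:
    """Hypothetical sketch: a lexicon as a directed graph with labelled edges."""

    def __init__(self):
        # Maps a source node to a set of (edge label, target node) pairs.
        self.edges = defaultdict(set)

    def add(self, source, label, target):
        """Add one labelled edge, e.g. ("bank", "pos", "noun")."""
        self.edges[source].add((label, target))

    def add_table_row(self, row, key="lemma"):
        """Turn one table row into edges: every non-key column becomes an edge label."""
        lemma = row[key]
        for column, value in row.items():
            if column != key:
                self.add(lemma, column, value)

    def merge(self, other):
        """Combine another lexical resource by taking the union of its edges."""
        for source, pairs in other.edges.items():
            self.edges[source] |= pairs

    def targets(self, source, label):
        """Return all nodes reachable from `source` via edges labelled `label`."""
        return sorted(t for (l, t) in self.edges.get(source, ()) if l == label)


# A row as it might come from a relational table (assumed sample data).
dictionary = LexiconGraph()
dictionary.add_table_row({"lemma": "bank", "pos": "noun", "translation_de": "Bank"})

# A second resource of a different type, here a small thesaurus (assumed sample data).
thesaurus = LexiconGraph()
thesaurus.add("bank", "synonym", "shore")

# After merging, both kinds of information hang off the same lemma node.
dictionary.merge(thesaurus)
print(dictionary.targets("bank", "pos"))
print(dictionary.targets("bank", "synonym"))
```

Because both resources share the lemma as a node, combining them in this sketch reduces to a set union of edges; a real model would also have to reconcile differing lemma conventions and edge vocabularies.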
In this contribution we present some work of the European R&D project “LIRICS” and of the ISO/TC 37/SC 4 committee related to the interoperability and re-use of language resources. We introduce some basic mechanisms of standardization work in ISO and describe in more detail the general approach to annotating language data within ISO.
Lexicography
(2008)