K-Dutch: Difference between revisions

From Clarin K-Centre
No edit summary
Line 47: Line 47:
*[http://hdl.handle.net/10032/tm-a2-s7 MedPilot website]


===[[Spelling and grammar]]===
==Spelling==
 
===Dutch Word List (‘Groene Boekje’)===
 
The Dutch Language Institute is an authority in the field of spelling. Since 1995, the institute (then: Institute for Dutch Lexicology) has been compiling the ‘Word List of the Dutch Language’ for the Dutch Language Union: an overview of the official spelling of Dutch words. In its printed form, the Word List is better known as het Groene Boekje (‘the green book’). The most recent edition was released in 2015.
 
===Woordenlijst.org===
 
The Word List of the Dutch Language is available online for free at woordenlijst.org. In 2015, the online version grew from approximately 100,000 entries to roughly 168,000 entries. All words from the previous printed edition have been retained.
 
The newly added words are derived from text files collected at the Dutch Language Institute, containing newspaper texts, literary texts and texts from the internet. In addition, a selection was made from all words that had been looked up in vain in the online Word List.
 
Since 2015, woordenlijst.org has been updated several times a year with hundreds of new words. At the end of 2019 it contained a total of 186,000 words. With all plural forms, diminutive forms, past tenses and past participles, the digital version of the Word List now contains information about approximately 680,000 word forms.
 
*[https://woordenlijst.org Online version]
 
===Spelling Certification Mark===
 
The Spelling Certification Mark (Keurmerk Spelling) is a guarantee given by the Union for the Dutch Language (Nederlandse Taalunie) that a reference work can be used to look up the official spelling.
 
For the automatic spell check of word lists (for example provided by dictionary suppliers), the Dutch Language Institute uses the Spelling Certification Mark, also known as the HulK. Our spelling specialists manually correct the words the HulK does not recognize and add these to our own material. From then on the words can be processed automatically.
 
Any word list compiled in accordance with the rules and principles of the official spelling receives the Spelling Certification Mark.
 
===Spelling tools===
 
*[https://dev.clarin.nl/node/1914 TiCCLops]: Text-Induced Corpus Clean-up online processing system
 
==Grammar==
 
===e-ANS: Dutch grammar===
 
The General Dutch Grammar, or ANS (Algemene Nederlandse Spraakkunst), is the go-to reference grammar for the Dutch language. It is the most extensive description of the grammatical aspects of contemporary Dutch. Its target users are both native speakers and foreign speakers learning Dutch. The ANS was born out of a Belgian-Dutch cooperation and was first printed in 1984. The second and revised 1997 edition was digitized, resulting in the e-ANS.
 
Recently, the Dutch Language Institute (INT) has been building a new, user-friendly website for the ANS, while the Leiden University Center for Linguistics (LUCL), Ghent University, KU Leuven and Radboud University Nijmegen have started revising its contents.
 
From 2020 onwards, the further revision of the contents will also be coordinated by the INT. The first revised chapters of the General Dutch Grammar will appear online in 2020, describing prepositions, word order and negations, among other subjects.
 
===Taalportaal===
 
Taalportaal (or Language Portal) collects the existing information on the grammars of Dutch, Frisian and Afrikaans and makes this information easily accessible in a scientifically sound way. Such a language portal is unique in the world. Three core domains traditionally distinguished in grammar – phonology, morphology and syntax – have been integrated into one portal, using extensive cross-referencing to ensure optimal linking. This offers interesting opportunities for linguists to discover relations and connections between linguistic phenomena that have remained hidden until now.
 
The Taalportaal website is in English, enabling researchers who are not proficient in Dutch, Frisian or Afrikaans to study these languages.


==[[CLARIN resource families]]: corpora and lexica for Dutch==

Revision as of 12:08, 23 March 2021

Mediawiki:Mainpage

Welcome to K-DUTCH, the place for anyone who wants to know anything about the Dutch language: linguistic properties, language advice, available tools and resources, etymology, dialects...

K-Dutch is (or will soon be) a CLARIN Knowledge Centre. It is hosted by the Instituut voor de Nederlandse Taal (Dutch Language Institute), which is also a CLARIN-B centre and hosts many resources for Dutch that are generally freely available for research purposes.


Linguistic topics

Phonology, Morphology and Syntax: Taalportaal

Many aspects of Dutch linguistics are described on the Taalportaal website.

Taalportaal (or Language Portal) is an interactive knowledge base about Dutch, Frisian and Afrikaans. It provides access to a comprehensive and authoritative scientific grammar for these three languages. Up to now there has been no comprehensive scientifically-based description of the grammars of Dutch, Frisian and Afrikaans. This is a serious shortcoming, considering that

  • language is seen as an important part of cultural identity and cultural heritage
  • a large number of people learn these languages as a second language
  • educated speakers frequently lack grammatical knowledge of their native language
  • Dutch and Afrikaans are important objects of study in linguistic theory and related fields of research

Taalportaal fills this gap by providing a thorough description of the phonology, morphology and syntax of the three languages.

Lexicography

Morphosyntax

Syntactic Atlas of the Dutch dialects (SAND)

The Dynamic Syntactic Atlas of the Dutch dialects (DynaSAND) is an on-line tool for dialect syntax research. DynaSAND consists of a database, a search engine, a cartographic component and a bibliography.

Terminology

The Centre of Expertise for Dutch Terminology (Expertisecentrum Nederlandstalige Terminologie or ENT) supports people and organisations involved with terminology. They can find terminological information and tools here, on the website of the Dutch Language Institute (INT). A newsletter is sent round several times a year, describing developments and events in the field of terminology.

Higher Education Terminology

HOTNeV is an acronym for Hoger Onderwijs Terminologie in Nederland en Vlaanderen (Higher Education Terminology in the Netherlands and Flanders). This project was prompted by a sharp increase in educational terms, generated by the EU’s education policy and implemented by the Tuning Project. HOTNeV has a dual purpose. Until now, Dutch equivalents for the English terminology were created mainly ad hoc, but this project focuses on the need to coordinate the provision of terms that have been approved by parties in the Dutch-speaking educational sector. It also wants to show the feasibility of this ambition.

Medical Terminology

The Medical Pilot is an experimental database in which a small part of the medical vocabulary is described at various levels, from scientific to accessible to people with low literacy, and in which differences between Flemish and Dutch terms are also shown.

Spelling

Dutch Word List (‘Groene Boekje’)

The Dutch Language Institute is an authority in the field of spelling. Since 1995, the institute (then: Institute for Dutch Lexicology) has been compiling the ‘Word List of the Dutch Language’ for the Dutch Language Union: an overview of the official spelling of Dutch words. In its printed form, the Word List is better known as het Groene Boekje (‘the green book’). The most recent edition was released in 2015.

Woordenlijst.org

The Word List of the Dutch Language is available online for free at woordenlijst.org. In 2015, the online version grew from approximately 100,000 entries to roughly 168,000 entries. All words from the previous printed edition have been retained.

The newly added words are derived from text files collected at the Dutch Language Institute, containing newspaper texts, literary texts and texts from the internet. In addition, a selection was made from all words that had been looked up in vain in the online Word List.

Since 2015, woordenlijst.org has been updated several times a year with hundreds of new words. At the end of 2019 it contained a total of 186,000 words. With all plural forms, diminutive forms, past tenses and past participles, the digital version of the Word List now contains information about approximately 680,000 word forms.

Spelling Certification Mark

The Spelling Certification Mark (Keurmerk Spelling) is a guarantee given by the Union for the Dutch Language (Nederlandse Taalunie) that a reference work can be used to look up the official spelling.

For the automatic spell check of word lists (for example provided by dictionary suppliers), the Dutch Language Institute uses the Spelling Certification Mark, also known as the HulK. Our spelling specialists manually correct the words the HulK does not recognize and add these to our own material. From then on the words can be processed automatically.
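The workflow described above (automatic check, manual correction of unrecognized words, then automatic processing from that point on) can be illustrated with a minimal Python sketch. This is not the HulK's actual implementation, which is not documented here; the function names, the in-memory word set and the sample words are all hypothetical and serve only to show the general idea.

```python
def check_word_list(candidates, approved):
    """Split candidate words into those found in the approved set and those not found."""
    recognized, unrecognized = [], []
    for word in candidates:
        if word in approved:
            recognized.append(word)
        else:
            unrecognized.append(word)
    return recognized, unrecognized


def apply_manual_review(approved, corrections):
    """Return a new approved set extended with the manually corrected spellings.

    `corrections` maps each unrecognized word to the spelling a human reviewer chose.
    """
    return approved | set(corrections.values())


# Hypothetical reference material and a hypothetical submitted word list.
approved = {"fiets", "boek", "tafel"}
submitted = ["fiets", "boek", "paralel"]

# Step 1: automatic check; anything not in the reference material is flagged.
recognized, unrecognized = check_word_list(submitted, approved)

# Step 2: a reviewer corrects the flagged word by hand (assumed correction).
corrections = {"paralel": "parallel"}
approved = apply_manual_review(approved, corrections)

# Step 3: from now on the corrected word is handled automatically.
recognized_after, unrecognized_after = check_word_list(["parallel"], approved)
```

A set is used for the approved material because membership tests are fast and duplicates are irrelevant; a real system would presumably also handle inflected forms and persistent storage, which this sketch leaves out.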

Any word list compiled in accordance with the rules and principles of the official spelling receives the Spelling Certification Mark.

Spelling tools

  • TiCCLops: Text-Induced Corpus Clean-up online processing system

Grammar

e-ANS: Dutch grammar

The General Dutch Grammar, or ANS (Algemene Nederlandse Spraakkunst), is the go-to reference grammar for the Dutch language. It is the most extensive description of the grammatical aspects of contemporary Dutch. Its target users are both native speakers and foreign speakers learning Dutch. The ANS was born out of a Belgian-Dutch cooperation and was first printed in 1984. The second and revised 1997 edition was digitized, resulting in the e-ANS.

Recently, the Dutch Language Institute (INT) has been building a new, user-friendly website for the ANS, while the Leiden University Center for Linguistics (LUCL), Ghent University, KU Leuven and Radboud University Nijmegen have started revising its contents.

From 2020 onwards, the further revision of the contents will also be coordinated by the INT. The first revised chapters of the General Dutch Grammar will appear online in 2020, describing prepositions, word order and negations, among other subjects.

Taalportaal

Taalportaal (or Language Portal) collects the existing information on the grammars of Dutch, Frisian and Afrikaans and makes this information easily accessible in a scientifically sound way. Such a language portal is unique in the world. Three core domains traditionally distinguished in grammar – phonology, morphology and syntax – have been integrated into one portal, using extensive cross-referencing to ensure optimal linking. This offers interesting opportunities for linguists to discover relations and connections between linguistic phenomena that have remained hidden until now.

The Taalportaal website is in English, enabling researchers who are not proficient in Dutch, Frisian or Afrikaans to study these languages.

CLARIN resource families: corpora and lexica for Dutch

Language Processing tools for Dutch

Types of services offered

We will store answers to questions we receive in this wiki, which will grow into a repository of K-Dutch answers to your questions.