K-Dutch: Difference between revisions
Line 1: | Line 1: | ||
<span style="color:white">Mediawiki:Mainpage</span><br> | <span style="color:white">Mediawiki:Mainpage</span><br> | ||
[[File:K-centre-logo.jpg|frameless|right]] | [[File:K-centre-logo.jpg|frameless|right]] | ||
Welcome to [[K-Dutch]], the place for anyone who wants to know anything about the Dutch language: linguistic properties, language advice, available tools and resources, etymology, dialects... | Welcome to [[K-Dutch]], the place for anyone who wants to know anything about the Dutch language: linguistic properties, language advice, available tools and resources, etymology, dialects... | ||
K-Dutch is a [https://www.clarin.eu/content/knowledge-centres CLARIN Knowledge Centre]. It is hosted by the [https://www.ivdnt.org Instituut voor de Nederlandse Taal] (Dutch Language Institute), which is also a [https://www.clarin.eu/content/certified-centres CLARIN-B centre] and host of many resources for Dutch, which are, in general, freely available for research purposes. K-Dutch is an initiative of [https://www.clarin.eu CLARIN-ERIC] and [https://clarin-be.ivdnt.org CLARIN-BE]. | K-Dutch is a [https://www.clarin.eu/content/knowledge-centres CLARIN Knowledge Centre]. It is hosted by the [https://www.ivdnt.org Instituut voor de Nederlandse Taal] (Dutch Language Institute), which is also a [https://www.clarin.eu/content/certified-centres CLARIN-B centre] and host of many resources for Dutch, which are, in general, freely available for research purposes. K-Dutch is an initiative of [https://www.clarin.eu CLARIN-ERIC] and [https://clarin-be.ivdnt.org CLARIN-BE]. | ||
The status of Dutch with respect to language technologies is described in: | The status of Dutch with respect to language technologies is described in: | ||
* Short version: [https://link.springer.com/chapter/10.1007/978-3-031-28819-7_12 Steurs, Vandeghinste and Daelemans (2023).] Language Report Dutch. In: Rehm, G., Way, A. (eds) ''European Language Equality''. Cognitive Technologies. Springer, Cham. <nowiki>https://doi.org/10.1007/978-3-031-28819-7_12</nowiki> | * Short version: [https://link.springer.com/chapter/10.1007/978-3-031-28819-7_12 Steurs, Vandeghinste and Daelemans (2023).] Language Report Dutch. In: Rehm, G., Way, A. (eds) ''European Language Equality''. Cognitive Technologies. Springer, Cham. <nowiki>https://doi.org/10.1007/978-3-031-28819-7_12</nowiki> | ||
* Longer version: [https://european-language-equality.eu/wp-content/uploads/2022/03/ELE___Deliverable_D1_10__Language_Report_Dutch_.pdf Steurs, Vandeghinste and Daelemans (2022). Report on Dutch.] Project deliverable. European Language Equality. | * Longer version: [https://european-language-equality.eu/wp-content/uploads/2022/03/ELE___Deliverable_D1_10__Language_Report_Dutch_.pdf Steurs, Vandeghinste and Daelemans (2022). Report on Dutch.] Project deliverable. European Language Equality. | ||
You are most welcome to contribute to these pages. Please contact [mailto:servicedesk@ivdnt.org servicedesk@ivdnt.org] with the subject line "K-Dutch", and we will be in touch. | You are most welcome to contribute to these pages. Please contact [mailto:servicedesk@ivdnt.org servicedesk@ivdnt.org] with the subject line "K-Dutch", and we will be in touch. | ||
==Linguistic topics== | ==Linguistic topics== | ||
===[[Grammar]]=== | ===[[Grammar]]=== | ||
* [[Grammar#Phonology,_Morphology_and_Syntax:_Taalportaal|Phonology, Morphology and Syntax: Taalportaal]] | * [[Grammar#Phonology,_Morphology_and_Syntax:_Taalportaal|Phonology, Morphology and Syntax: Taalportaal]] | ||
* [[Grammar#Morphosyntax|Morphosyntax]] | * [[Grammar#Morphosyntax|Morphosyntax]] | ||
Line 31: | Line 22: | ||
* [[Grammar#Grambank|Grambank]] | * [[Grammar#Grambank|Grambank]] | ||
===[[Lexicography]]=== | ===[[Lexicography]]=== | ||
* [[Lexicography#Dutch_dictionaries|Dutch dictionaries]] | * [[Lexicography#Dutch_dictionaries|Dutch dictionaries]] | ||
* [[Lexicography#Elexis|The Elexis Project]] | * [[Lexicography#Elexis|The Elexis Project]] | ||
* [https://ivdnt.org/wp-content/uploads/2021/02/The-Future-of-Academic-Lexicography-A-White-Paper.pdf White paper]: The Future of Academic Lexicography | * [https://ivdnt.org/wp-content/uploads/2021/02/The-Future-of-Academic-Lexicography-A-White-Paper.pdf White paper]: The Future of Academic Lexicography | ||
===[[Terminology]]=== | ===[[Terminology]]=== | ||
*[[Terminology#Centre_of_Expertise_for_Dutch_Terminology|The Centre of Expertise for Dutch Terminology]] | *[[Terminology#Centre_of_Expertise_for_Dutch_Terminology|The Centre of Expertise for Dutch Terminology]] | ||
Line 46: | Line 35: | ||
*[[Terminology#Legal_Terminology|Legal Terminology]] | *[[Terminology#Legal_Terminology|Legal Terminology]] | ||
===[[Spelling]]=== | ===[[Spelling]]=== | ||
*[[Spelling#Woordenlijst.org_(Official_Dutch_Word_List)|Woordenlijst.org (Official Dutch Word List)]] | *[[Spelling#Woordenlijst.org_(Official_Dutch_Word_List)|Woordenlijst.org (Official Dutch Word List)]] | ||
*[[Spelling#Spelling_Certification_Mark|Spelling Certification Mark]] | *[[Spelling#Spelling_Certification_Mark|Spelling Certification Mark]] | ||
==Linguistic resources: datasets== | ==Linguistic resources: datasets== | ||
===[[Corpora]]=== | ===[[Corpora]]=== | ||
===[[ | * [[Newspaper corpora]]: corpora exclusively consisting of newspaper text | ||
* [[Parliamentary corpora]] | |||
* [[Computer-mediated communication corpora]] | |||
* [[Corpora of academic texts]] | |||
* [[Historical corpora]] | |||
* [[L2 learner corpora]] | |||
* [[Manually annotated corpora]] | |||
* [[Multimodal corpora]] | |||
* [[Parallel corpora]] | |||
* [[Reference corpora]] | |||
* [[Social media corpora]] | |||
* [[Spoken corpora]] | |||
* [[Sign Language corpora]] | |||
* [[Propbanks]]: contains semantic role labels | |||
* [[Treebanks]] | |||
* [[Other corpora]] | |||
===Lexical Resources=== | |||
* [[Lexica]] | |||
* [[Dictionaries]] | |||
* [[Conceptual Resources]] | |||
* [[Wordlists]] | |||
* [[Embeddings]] | |||
* [[Lexica of terminology]] | |||
* [[Ontologies]] | |||
===N-grams=== | ===N-grams=== | ||
* [[Character N-grams]] | * [[Character N-grams]] | ||
==Tools for Dutch== | ==Tools for Dutch== | ||
===Normalisation=== | ===Normalisation=== | ||
Line 66: | Line 77: | ||
*[https://lt3.ugent.be/normalisation-demo/ Normalisation Demo] | *[https://lt3.ugent.be/normalisation-demo/ Normalisation Demo] | ||
===Language Learning=== | ===Language Learning=== | ||
*[https://schrijfassistent.be Schrijfassistent] | *[https://schrijfassistent.be Schrijfassistent] | ||
Line 76: | Line 86: | ||
*[https://www.taalwinkel.nl/ Taalwinkel]: Language Advice | *[https://www.taalwinkel.nl/ Taalwinkel]: Language Advice | ||
===Automatic linguistic annotation=== | ===Automatic linguistic annotation=== | ||
* [[Basic language processing]] | * [[Basic language processing]] | ||
Line 84: | Line 93: | ||
<!--* Text mining!--> | <!--* Text mining!--> | ||
===Speech processing=== | ===Speech processing=== | ||
* [[Spoken Language Recognition]] | * [[Spoken Language Recognition]] | ||
Line 90: | Line 98: | ||
* Speech synthesis | * Speech synthesis | ||
===Natural Language Processing=== | ===Natural Language Processing=== | ||
* [[Language Modeling]] | * [[Language Modeling]] | ||
Line 102: | Line 109: | ||
* [[Clinical NLP]] | * [[Clinical NLP]] | ||
===Resource querying=== | ===Resource querying=== | ||
* [[Corpus querying]] | * [[Corpus querying]] | ||
* [[Treebank querying]] | * [[Treebank querying]] | ||
===Machine translation=== | ===Machine translation=== | ||
====Translation Engines==== | ====Translation Engines==== | ||
Line 118: | Line 123: | ||
*[https://mateo.ivdnt.org/Translate MATEO No Language Left Behind] | *[https://mateo.ivdnt.org/Translate MATEO No Language Left Behind] | ||
====MT Evaluation==== | ====MT Evaluation==== | ||
*[https://mateo.ivdnt.org/Evaluate MATEO Machine Translation Evaluation Online] | *[https://mateo.ivdnt.org/Evaluate MATEO Machine Translation Evaluation Online] | ||
===Terminology extraction=== | ===Terminology extraction=== | ||
* [https://termtreffer.org/ Termtreffer]. Ask for login at [mailto:terminologie@ivdnt.org terminologie@ivdnt.org]. | * [https://termtreffer.org/ Termtreffer]. Ask for login at [mailto:terminologie@ivdnt.org terminologie@ivdnt.org]. | ||
* [https://lt3.ugent.be/dterminer D-Terminer demo]. Terminology extraction for Dutch, English, French and German. (Rigouts Terryn, A. (2021). D-TERMINE: Data-driven Term Extraction Methodologies Investigated [Doctoral thesis]. Ghent University.) | * [https://lt3.ugent.be/dterminer D-Terminer demo]. Terminology extraction for Dutch, English, French and German. (Rigouts Terryn, A. (2021). D-TERMINE: Data-driven Term Extraction Methodologies Investigated [Doctoral thesis]. Ghent University.) | ||
===Terminology management=== | ===Terminology management=== | ||
* [https://iate.europa.eu/home IATE] (Interactive Terminology for Europe) is the shared terminology management system of the institutions of the European Union. It contains more than 7 million terms in 26 languages, covering more than 100 domains of EU legislation. | * [https://iate.europa.eu/home IATE] (Interactive Terminology for Europe) is the shared terminology management system of the institutions of the European Union. It contains more than 7 million terms in 26 languages, covering more than 100 domains of EU legislation. | ||
===Other=== | ===Other=== | ||
* Previously unmentioned [[CLARIN projects]] at INT | * Previously unmentioned [[CLARIN projects]] at INT | ||
Line 142: | Line 143: | ||
* [https://www.audacityteam.org/ Audacity] is an audio recording and editing software application that is open source. | * [https://www.audacityteam.org/ Audacity] is an audio recording and editing software application that is open source. | ||
==Helpdesk== | ==Helpdesk== | ||
For information about Dutch: if you cannot find the answers to your questions on this wiki, you can send your question to [mailto:servicedesk@ivdnt.org servicedesk@ivdnt.org]. Your questions will be forwarded as soon as possible to the appropriate experts, and you should receive an answer within two working days. | For information about Dutch: if you cannot find the answers to your questions on this wiki, you can send your question to [mailto:servicedesk@ivdnt.org servicedesk@ivdnt.org]. Your questions will be forwarded as soon as possible to the appropriate experts, and you should receive an answer within two working days. | ||
You can also ask us for information and assistance with the use of data and tools. | You can also ask us for information and assistance with the use of data and tools. | ||
==Other Services== | ==Other Services== | ||
* [[Best practice documents and guidelines]] | * [[Best practice documents and guidelines]] | ||
Line 156: | Line 154: | ||
* [[CLARIN]] for Dutch | * [[CLARIN]] for Dutch | ||
==Questions and Answers== | ==Questions and Answers== | ||
On the [[Q&A|Questions and Answers page]] we keep track of all questions we receive concerning Dutch. This will grow into a repository of K-Dutch answers to your questions. | On the [[Q&A|Questions and Answers page]] we keep track of all questions we receive concerning Dutch. This will grow into a repository of K-Dutch answers to your questions. |
Revision as of 13:05, 11 March 2024
Welcome to K-Dutch, the place for anyone who wants to know anything about the Dutch language: linguistic properties, language advice, available tools and resources, etymology, dialects...
K-Dutch is a CLARIN Knowledge Centre. It is hosted by the Instituut voor de Nederlandse Taal (Dutch Language Institute), which is also a CLARIN-B centre and host of many resources for Dutch, which are, in general, freely available for research purposes. K-Dutch is an initiative of CLARIN-ERIC and CLARIN-BE.
The status of Dutch with respect to language technologies is described in:
- Short version: Steurs, Vandeghinste and Daelemans (2023). Language Report Dutch. In: Rehm, G., Way, A. (eds) European Language Equality. Cognitive Technologies. Springer, Cham. https://doi.org/10.1007/978-3-031-28819-7_12
- Longer version: Steurs, Vandeghinste and Daelemans (2022). Report on Dutch. Project deliverable. European Language Equality.
You are most welcome to contribute to these pages. Please contact servicedesk@ivdnt.org with the subject line "K-Dutch", and we will be in touch.
Linguistic topics
Grammar
- Phonology, Morphology and Syntax: Taalportaal
- Morphosyntax
- Syntactic Atlas of the Dutch dialects (SAND)
- Dutch descriptive grammar: e-ANS (in Dutch)
- Grambank
Lexicography
- Dutch dictionaries
- The Elexis Project
- White paper: The Future of Academic Lexicography
Terminology
- The Centre of Expertise for Dutch Terminology
- Academic Language
- Medical Terminology
- Dutch as a scientific language
- Legal Terminology
Spelling
Linguistic resources: datasets
Corpora
- Newspaper corpora: corpora exclusively consisting of newspaper text
- Parliamentary corpora
- Computer-mediated communication corpora
- Corpora of academic texts
- Historical corpora
- L2 learner corpora
- Manually annotated corpora
- Multimodal corpora
- Parallel corpora
- Reference corpora
- Social media corpora
- Spoken corpora
- Sign Language corpora
- Propbanks: contains semantic role labels
- Treebanks
- Other corpora
Lexical Resources
N-grams
Tools for Dutch
Normalisation
- Format conversion
- Spell checking
- TiCCLops: Text-Induced Corpus Clean-up online processing system: no longer available
- Normalisation Demo
Language Learning
- Schrijfassistent
- Schrijfassistent at De Standaard
- NedBox: Online exercises to learn Dutch
- Oefenen.nl: Online exercises to learn Dutch
- Woordcombinaties: Verbs and their combination patterns
- Orient+: A serious game to enhance academic vocabulary
- Taalwinkel: Language Advice
Automatic linguistic annotation
Speech processing
- Spoken Language Recognition
- Speech recognition
- Speech synthesis
Natural Language Processing
- Language Modeling
- Machine translation
- Coreference resolution
- Compound splitting
- Word Sense Disambiguation
- Text classification
- Sentiment analysis
- Readability
- Clinical NLP
Resource querying
Machine translation
Translation Engines
Publicly available machine translation engines from or to Dutch:
- DeepL
- Google Translate
- Bing Microsoft translator
- Reverso
- eTranslation from the European Union
- MATEO No Language Left Behind
MT Evaluation
Terminology extraction
- Termtreffer. Ask for login at terminologie@ivdnt.org.
- D-Terminer demo. Terminology extraction for Dutch, English, French and German. (Rigouts Terryn, A. (2021). D-TERMINE: Data-driven Term Extraction Methodologies Investigated [Doctoral thesis]. Ghent University.)
Terminology management
- IATE (Interactive Terminology for Europe) is the shared terminology management system of the institutions of the European Union. It contains more than 7 million terms in 26 languages, covering more than 100 domains of EU legislation.
Other
- Previously unmentioned CLARIN projects at INT
- Language and Speech Tools at Radboud Nijmegen, e.g. T-scan, an analysis tool for Dutch texts that assesses the complexity of a text.
- OpeNER is a language analysis toolchain helping (academic) researchers and companies make sense out of natural language analysis. It consists of components that are easy to install, improve and configure, e.g. to detect the language of a text, determine the polarisation of texts (sentiment analysis), or detect which topics a text covers. The supported languages are currently English, Spanish, Italian, German and Dutch.
- GATE (General Architecture for Text Engineering) is a Java suite of tools originally developed at the University of Sheffield and it is used for many natural language processing tasks, including information extraction. (Dutch services in GATE Cloud).
- Speech Repository is an online e-learning tool. It contains video recordings of real-life speeches and tailor-made pedagogical speeches, which give interpreters and interpreting students an opportunity to practise and improve their interpretation skills.
- Subtitle Workshop is a free application for creating, editing, and converting text-based subtitle files.
- YouDescribe is a free, web-based platform for adding audio description to YouTube content.
- Audacity is an audio recording and editing software application that is open source.
Helpdesk
For information about Dutch: if you cannot find the answers to your questions on this wiki, you can send your question to servicedesk@ivdnt.org. Your questions will be forwarded as soon as possible to the appropriate experts, and you should receive an answer within two working days.
You can also ask us for information and assistance with the use of data and tools.
Other Services
Questions and Answers
On the Questions and Answers page we keep track of all questions we receive concerning Dutch. This will grow into a repository of K-Dutch answers to your questions.