Reference corpora: Difference between revisions
Revision as of 15:09, 2 March 2021
Corpus Hedendaags Nederlands
A collection of more than 800,000 texts taken from newspapers, magazines, news broadcasts and legal writings (1814-2013).
The corpus is a combination of the 5, 27 and 38 Million Words Corpora and the PAROLE Corpus, supplemented with newspaper texts from NRC and De Standaard (until 2013).
Online search: http://chn.ivdnt.org/
Lassy Large
The Lassy Large Corpus is a collection of written texts consisting of approximately 700 million words with automatically generated annotations. The lemmas and POS tags were generated with Tadpole (now Frog), and the syntactic dependency structures were generated with Alpino.
Download: https://taalmaterialen.ivdnt.org/download/tstc-lassy-groot-corpus
Online treebank search: https://paqu.let.rug.nl:8068/