Translations:Other corpora/39/en: Difference between revisions

From Clarin K-Centre

Latest revision as of 11:28, 21 March 2024

Message definition (Other corpora)
== Dutch Gigacorpus ==
With 234 GB of varied plain text and no fewer than 40 billion tokens, this is one of the largest Dutch corpora. The corpus is freely available, and its quality is relatively high for its size: care has been taken to keep the data as clean as possible. It also contains 400 million forum posts in 10 million threads, with their timestamps intact for linguistic research.
