
Translations:Q&A/94/en

From Clarin K-Centre
Revision as of 14:19, 5 July 2024 by FuzzyBot (talk | contribs) (Importing a new version from external source)

The download files of the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN) do not include plain-text versions of the transcriptions. The ort files contain orthographic transcriptions with timestamps, and the plk files contain part-of-speech and lemma information. The following Perl script takes a list of plk files as input and prints their text. Run it from the command line in your terminal and redirect its output to a file to create text files.
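As a rough illustration of what this kind of extraction involves, and separate from the Perl script itself, here is a minimal Python sketch. It rests on assumptions about the plk layout that this page does not confirm: that markup lines start with `<` and open a new utterance, that every other non-empty line is tab-separated with the word form in the first column, and that the files are uncompressed and readable as Latin-1. The function name `plk_to_text` is hypothetical.

```python
import sys


def plk_to_text(lines):
    """Return one string per utterance from the lines of a plk file.

    Assumptions (not confirmed by the source): markup lines start with "<"
    and open a new utterance; every other non-empty line is tab-separated
    with the word form in the first column.
    """
    utterances = []
    words = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("<"):
            # A markup line closes the utterance collected so far.
            if words:
                utterances.append(" ".join(words))
                words = []
            continue
        # Keep only the first column (assumed to be the word form).
        words.append(line.split("\t")[0])
    if words:
        utterances.append(" ".join(words))
    return utterances


if __name__ == "__main__":
    for path in sys.argv[1:]:
        # The encoding is an assumption; adjust it if your files differ.
        with open(path, encoding="latin-1") as handle:
            for utterance in plk_to_text(handle):
                print(utterance)
```

Saved under a hypothetical name such as `plk2text.py`, the sketch could be run as `python plk2text.py *.plk > cgn.txt`, which writes one utterance per line to `cgn.txt`. Check a few plk files by hand first, since the column layout assumed here may not match your copy of the corpus.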