Translations:Q&A/94/en

From Clarin K-Centre
Revision as of 14:19, 5 July 2024 by FuzzyBot (talk | contribs) (Importing a new version from external source)

The download files of the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN) do not contain plain text only. The ort files contain orthographic transcriptions and timestamps, and the plk files contain part-of-speech and lemma information. The following Perl script takes a list of plk files as input and prints the text. If you run the script from the command line in your terminal, you can redirect its output to create text files.
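For readers who prefer Python, the same idea can be sketched roughly as below. This is not the Perl script referred to above, only an illustration under stated assumptions: that a plk file has one token per line with tab-separated columns and the word form in the first column, that lines starting with `<` are markup marking an utterance boundary, that files may be gzip-compressed, and that the encoding is Latin-1. Check these against your own copy of the corpus. The names `open_plk`, `plk_lines_to_text` and `main` are hypothetical.

```python
import gzip
import sys


def open_plk(path):
    """Open a plk file as text; a '.gz' suffix is assumed to mean gzip."""
    # Assumption: the files are Latin-1 encoded.
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="latin-1")
    return open(path, "r", encoding="latin-1")


def plk_lines_to_text(lines):
    """Yield one line of plain text per utterance.

    Assumed layout: lines starting with '<' are markup and close the
    current utterance; every other non-empty line is tab-separated
    with the word form in the first column.
    """
    tokens = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip empty lines
        if line.startswith("<"):
            if tokens:
                yield " ".join(tokens)
                tokens = []
            continue
        tokens.append(line.split("\t")[0])  # keep only the word form
    if tokens:
        yield " ".join(tokens)


def main(paths):
    """Print the text of every plk file given on the command line."""
    for path in paths:
        with open_plk(path) as handle:
            for utterance in plk_lines_to_text(handle):
                print(utterance)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a hypothetical name such as `plk2text.py`, the sketch could be run as `python plk2text.py file1.plk file2.plk > text.txt` to write the extracted text to a file.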