Translations:Q&A/94/en
The download files of the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN) contain more than just the text. The ort files contain orthographic transcriptions and timestamps, and the plk files contain part-of-speech and lemma information. The following Perl script takes a list of plk files as input and prints the text. Run it from the command line in your terminal and redirect the output to create text files.
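For readers who prefer Python, the same idea can be sketched roughly as follows. This is not the Perl script referred to above, only a minimal illustration under assumptions that should be checked against your own copy of the corpus: that the plk files are uncompressed, Latin-1 encoded, have one token per line with the word form in the first tab-separated column, and mark the start of each utterance with a line beginning with `<`. The function name `plk_to_text` and the file name `plk2text.py` are hypothetical.

```python
import sys


def plk_to_text(lines):
    """Yield one line of plain text per utterance from plk-style lines."""
    tokens = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore empty lines
        if line.startswith("<"):
            # Assumption: a line starting with "<" marks a new utterance,
            # so emit the tokens collected for the previous one.
            if tokens:
                yield " ".join(tokens)
                tokens = []
            continue
        # Assumption: the word form is the first tab-separated column;
        # the remaining columns (tag, lemma, ...) are dropped.
        tokens.append(line.split("\t")[0])
    if tokens:
        yield " ".join(tokens)


if __name__ == "__main__":
    # Every command-line argument is treated as the path of a plk file.
    for path in sys.argv[1:]:
        # Assumption: the files are Latin-1 encoded.
        with open(path, encoding="latin-1") as handle:
            for sentence in plk_to_text(handle):
                print(sentence)
```

Saved as `plk2text.py`, it could be run as `python plk2text.py file1.plk file2.plk > text.txt` to write the text of both files to `text.txt`.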