==I am looking for a parallel corpus of Dutch-Turkish texts.==
We are comparing the Dutch and Turkish translations of the Linguistic Inquiry and Word Count (LIWC) dictionaries. Do you know of any corpora that would be suitable?
I found several candidates on OPUS (https://opus.nlpl.eu/) and downloaded the TED2020 talks. However, these are .xml files with paragraph/line IDs, and I need .txt files. Would you have a script, or some other way, to convert them automatically and remove the unnecessary tags?
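To make the request concrete, here is a minimal sketch of the kind of conversion meant, using only Python's standard library. It assumes that each sentence sits in an `<s id="...">` element, possibly with nested token tags such as `<w>`; that layout is an assumption and should be checked against the downloaded files. The function names `sentences` and `convert_file` and the file names in the usage comment are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET


def sentences(root, sentence_tag="s"):
    """Return one plain-text string per sentence element, with all tags removed.

    Assumption: sentences are stored in <s> elements; pass another tag name
    via sentence_tag if the files use a different one.
    """
    lines = []
    for sent in root.iter(sentence_tag):
        # itertext() yields the text of the element and of any nested tags.
        text = "".join(sent.itertext())
        # Collapse line breaks and repeated spaces into single spaces.
        text = re.sub(r"\s+", " ", text).strip()
        if text:  # skip empty sentence elements
            lines.append(text)
    return lines


def convert_file(xml_path, txt_path, sentence_tag="s"):
    """Write the sentences of one XML file to a UTF-8 text file, one per line."""
    root = ET.parse(xml_path).getroot()
    with open(txt_path, "w", encoding="utf-8") as out:
        for line in sentences(root, sentence_tag):
            out.write(line + "\n")


# Hypothetical usage; the file names are placeholders:
# convert_file("TED2020.nl.xml", "TED2020.nl.txt")
# convert_file("TED2020.tr.xml", "TED2020.tr.txt")
```

Note that this only strips the tags file by file. It does not read any alignment information, so line N of the Dutch output is not guaranteed to correspond to line N of the Turkish output.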
Latest revision as of 14:19, 5 July 2024