Translations:Parallel Monolingual Corpora/19/en

From Clarin K-Centre

3) The third dataset is the comparable corpus created by Nick Vanackere. It consists of 12,687 Wablieft articles from 2012 to 2017 and 206,466 De Standaard articles from 2013 to 2017. To ensure comparability, only articles from 08/01/2013 till 16/11/2017 were considered, resulting in 8,744 Wablieft articles and 202,284 De Standaard articles. The difference in the number of articles is due to publication frequency: Wablieft is published weekly and De Standaard daily.
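The date restriction described above can be illustrated with a minimal sketch. It assumes each article is a record with a `published` date and that both ends of the window are inclusive; the record layout, the function names and the toy data are hypothetical and do not come from the corpus or its tooling.

```python
from datetime import date

# Window stated in the text (DD/MM/YYYY): 08/01/2013 to 16/11/2017.
# Treating both bounds as inclusive is an assumption.
WINDOW_START = date(2013, 1, 8)
WINDOW_END = date(2017, 11, 16)


def in_window(published, start=WINDOW_START, end=WINDOW_END):
    """Return True if the publication date lies inside the window."""
    return start <= published <= end


def filter_articles(articles):
    """Keep only the articles whose 'published' date lies in the window."""
    return [a for a in articles if in_window(a["published"])]


# Hypothetical toy records, not taken from the corpus.
sample = [
    {"source": "Wablieft", "published": date(2012, 5, 2)},
    {"source": "Wablieft", "published": date(2013, 1, 8)},
    {"source": "De Standaard", "published": date(2017, 11, 16)},
    {"source": "De Standaard", "published": date(2017, 11, 17)},
]
kept = filter_articles(sample)
```

Applying the same window to both sources keeps the two sub-corpora aligned in time, so any remaining difference in size reflects publication frequency rather than coverage period.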