Parliamentary corpora
Latest revision as of 17:49, 13 November 2025
We currently have no dedicated Dutch parliamentary corpora available, but we have worked on this topic within ParlaMint (https://www.clarin.eu/content/parlamint-towards-comparable-parliamentary-corpora), a project that aims to bring together parliamentary corpora for as many European languages as possible.
To this end, the different datasets must be converted to a uniform format and enriched with linguistic information. The INT has done this for the bilingual Belgian Federal Parliament (French and Dutch). The aim of the project is to provide research data suitable for targeted observation of trends, opinions and decision-making. The suitability of the data will be tested in a case study of the debate on the COVID-19 epidemic.
European Parliament data
Europarl data on the OPUS website (https://opus.nlpl.eu/Europarl/corpus/version/Europarl): a parallel corpus extracted from the European Parliament website by Philipp Koehn (University of Edinburgh). Its main intended use is to support statistical machine translation research.