Parliamentary corpora: Difference between revisions

From Clarin K-Centre
<translate>
We currently have no specific Dutch parliamentary corpora available, but have worked on this topic in the framework of [https://www.clarin.eu/content/parlamint-towards-comparable-parliamentary-corpora ParlaMint], a project that aims to bring together as many parliamentary corpora of different European languages as possible.  




[https://opus.nlpl.eu/Europarl.php Europarl data] on the OPUS website: a parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
</translate>

Revision as of 12:42, 13 March 2024

We currently have no specific Dutch parliamentary corpora available, but have worked on this topic in the framework of ParlaMint, a project that aims to bring together as many parliamentary corpora of different European languages as possible.

To this end, the different datasets must be converted to a uniform format and enriched with linguistic annotation. The Dutch Language Institute (INT) has done this for the bilingual corpus of the Belgian Federal Parliament (French and Dutch). The project aims to provide research data suitable for targeted observation of trends, opinions and decision-making. Whether the data serve that purpose will be tested in a case study of the debate on the COVID-19 epidemic.
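
To give an idea of what a uniform, linguistically annotated format can look like in practice, the sketch below reads a small TEI-style XML fragment and lists each token with its lemma and part-of-speech information. The fragment is invented for illustration: the speaker ID, the sentence, and the exact choice of elements and attributes (`u`, `seg`, `s`, `w`, `lemma`, `msd`) are assumptions modelled loosely on TEI-based parliamentary encodings, not a copy of the actual ParlaMint files, whose structure may differ.

```python
import xml.etree.ElementTree as ET

# TEI namespace; assumed here because TEI-based corpora normally declare it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, simplified utterance. Speaker ID, tokens and annotation
# values are made up for this example and do not come from a real corpus.
SAMPLE = """\
<u xmlns="http://www.tei-c.org/ns/1.0" who="#SpeakerA" xml:lang="nl">
  <seg>
    <s>
      <w lemma="de" msd="UPosTag=DET">De</w>
      <w lemma="vergadering" msd="UPosTag=NOUN">vergadering</w>
      <w lemma="zijn" msd="UPosTag=AUX">is</w>
      <w lemma="openen" msd="UPosTag=VERB">geopend</w>
    </s>
  </seg>
</u>
"""


def extract_tokens(xml_text):
    """Return the speaker reference and a list of (form, lemma, msd) tuples."""
    root = ET.fromstring(xml_text)
    speaker = root.get("who")
    tokens = [
        (w.text, w.get("lemma"), w.get("msd"))
        for w in root.iterfind(".//tei:w", TEI_NS)
    ]
    return speaker, tokens


if __name__ == "__main__":
    speaker, tokens = extract_tokens(SAMPLE)
    print("speaker:", speaker)
    for form, lemma, msd in tokens:
        print(form, lemma, msd, sep="\t")
```

Because every corpus in such a collection would share the same element names, a script of this kind could in principle be applied to the French and Dutch parts of a bilingual corpus without changes.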

European Parliament data

Europarl data on the OPUS website: a parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.