Parliamentary corpora: Difference between revisions

From Clarin K-Centre
<languages/>
<translate>
<!--T:1-->
We currently have no specific Dutch parliamentary corpora available, but have worked on this topic in the framework of [https://www.clarin.eu/content/parlamint-towards-comparable-parliamentary-corpora ParlaMint], a project that aims to bring together as many parliamentary corpora of different European languages as possible.  


<!--T:2-->
To this end, the different datasets must be converted to a uniform format and enriched with linguistic annotation. The INT has done this for the bilingual [https://www.dekamer.be/kvvcr/index.cfm Belgian Federal Parliament (French & Dutch)]. The aim of the project is to provide research data suitable for targeted observation of trends, opinions and decision-making. This will be tested in a case study of the debate on the COVID-19 epidemic.


<!--T:3-->
* [https://www.clarin.si/repository/xmlui/handle/11356/1432 Multilingual comparable data set available]


== European Parliament data == <!--T:4-->


<!--T:5-->
[https://opus.nlpl.eu/Europarl.php Europarl data] on the OPUS website: a parallel corpus extracted from the European Parliament website by Philipp Koehn (University of Edinburgh). Its main intended use is to support research on statistical machine translation.
</translate>
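
A parallel corpus such as Europarl is commonly distributed as two plain-text files in which line ''n'' of one file is the translation of line ''n'' of the other. The following is a minimal sketch of reading such a pair of files in Python, assuming that line-aligned layout. The function name <code>read_parallel</code> and the file names in the usage note are hypothetical and only serve as illustration.

```python
from itertools import zip_longest
from pathlib import Path
from typing import Iterator, Tuple


def read_parallel(src_path: Path, tgt_path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) sentence pairs from two line-aligned text files.

    Assumption: both files are UTF-8 and line n in one file corresponds
    to line n in the other. Pairs with an empty side are skipped.
    """
    with open(src_path, encoding="utf-8") as src, \
         open(tgt_path, encoding="utf-8") as tgt:
        for s, t in zip_longest(src, tgt):
            # zip_longest pads the shorter file with None, which means
            # the two files are not aligned line by line.
            if s is None or t is None:
                raise ValueError("files differ in number of lines")
            s, t = s.strip(), t.strip()
            if s and t:
                yield s, t
```

With a downloaded English-Dutch pair stored under hypothetical names such as <code>europarl.en</code> and <code>europarl.nl</code>, the pairs can be iterated with <code>for en, nl in read_parallel(Path("europarl.en"), Path("europarl.nl")): ...</code>.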

Latest revision as of 13:15, 13 March 2024
