Parliamentary corpora: Difference between revisions

From Clarin K-Centre
(6 intermediate revisions by 3 users not shown)
Line 1: Line 1:
We currently have no specific parliamentary corpora available, but are working on this topic in the framework of [https://www.clarin.eu/content/parlamint-towards-comparable-parliamentary-corpora ParlaMint], a project that aims to bring together as many parliamentary corpora of different European languages as possible.  
<languages/>
<translate>
<!--T:1-->
We currently have no specific Dutch parliamentary corpora available, but have worked on this topic in the framework of [https://www.clarin.eu/content/parlamint-towards-comparable-parliamentary-corpora ParlaMint], a project that aims to bring together as many parliamentary corpora of different European languages as possible.  


To this end, the different datasets must be converted to a uniform format and provided with linguistic information. The INT will implement this for the bilingual [https://www.dekamer.be/kvvcr/index.cfm Belgian Federal Parliament (French & Dutch)]. The aim of the project is to provide suitable research data for targeted observations of trends, opinions and decision-making. This will be tested by conducting a case study of the debate on the COVID-19 epidemic.
<!--T:2-->
To this end, the different datasets must be converted to a uniform format and provided with linguistic information. The INT has implemented this for the bilingual [https://www.dekamer.be/kvvcr/index.cfm Belgian Federal Parliament (French & Dutch)]. The aim of the project is to provide suitable research data for targeted observations of trends, opinions and decision-making. This will be tested by conducting a case study of the debate on the COVID-19 epidemic.


* Multilingual comparable data set available at [https://www.clarin.si/repository/xmlui/handle/11356/1432]
<!--T:3-->
* [https://www.clarin.si/repository/xmlui/handle/11356/1432 Multilingual comparable data set available]


== European Parliament data==
== European Parliament data== <!--T:4-->


[https://opus.nlpl.eu/Europarl.php Europarl data] on the OPUS website
<!--T:5-->
[https://opus.nlpl.eu/Europarl.php Europarl data] on the OPUS website: a parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
</translate>

Latest revision as of 13:15, 13 March 2024

We currently have no specific Dutch parliamentary corpora available, but have worked on this topic in the framework of ParlaMint, a project that aims to bring together as many parliamentary corpora of different European languages as possible.

To this end, the different datasets must be converted to a uniform format and enriched with linguistic information. The Dutch Language Institute (INT) has implemented this for the bilingual (French and Dutch) Belgian Federal Parliament. The aim of the project is to provide suitable research data for targeted observations of trends, opinions and decision-making. This will be tested by conducting a case study of the debate on the COVID-19 epidemic.

* Multilingual comparable data set available at https://www.clarin.si/repository/xmlui/handle/11356/1432

European Parliament data

Europarl data on the OPUS website: a parallel corpus extracted from the European Parliament web site by Philipp Koehn (University of Edinburgh). The main intended use is to aid statistical machine translation research.
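
A parallel corpus like this is commonly used as a list of aligned sentence pairs. The following is a minimal sketch of iterating over such pairs. It assumes you have downloaded a plain-text, line-aligned export in which line n of one file is the translation of line n of the other. The function name `read_parallel` and the file names in the usage comment are hypothetical and not part of any OPUS tooling.

```python
from itertools import zip_longest


def read_parallel(src_path, tgt_path):
    """Yield (source, target) sentence pairs from two line-aligned text files.

    Assumption: line n of src_path corresponds to line n of tgt_path.
    """
    with open(src_path, encoding="utf-8") as src, \
         open(tgt_path, encoding="utf-8") as tgt:
        for line_no, (s, t) in enumerate(zip_longest(src, tgt), start=1):
            # zip_longest pads the shorter file with None, which signals
            # that the two files are not aligned line by line.
            if s is None or t is None:
                raise ValueError(f"files differ in length at line {line_no}")
            s, t = s.strip(), t.strip()
            # Skip pairs where either side is empty.
            if s and t:
                yield s, t


# Usage (hypothetical file names):
# for en, nl in read_parallel("europarl.en", "europarl.nl"):
#     print(en, "|||", nl)
```

Reading lazily with a generator keeps memory use low, which matters because parallel corpora of this kind can contain millions of sentence pairs.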