Parallel corpora: Difference between revisions

From Clarin K-Centre
(23 intermediate revisions by 5 users not shown)
==The Dutch Parallel Corpus==
<languages/>
<translate>
<!--T:1-->
Parallel corpora are central to translation studies and contrastive linguistics. Many parallel corpora are accessible through easy-to-use concordancers, which considerably facilitates the study of interlinguistic phenomena. Such corpora are also a rich source of material for language teaching. Furthermore, parallel corpora serve as training data for machine translation systems.


The Dutch Parallel Corpus (DPC) is a 10-million-word, sentence-aligned parallel corpus for the language pairs Dutch-English and Dutch-French, with Dutch as the central language.
<!--T:2-->
Monolingual parallel corpora also exist; they contain, for example, sentences paired with their paraphrases or with their simplified forms.


The corpus contains five different text types and is balanced with respect to text type and translation direction. The entire corpus has been aligned at sentence level and further enriched with linguistic information (lemmas and PoS tags). A small subset of the Dutch-English part has also been manually aligned at the sub-sentential level.
<!--T:3-->
*[[Parallel Multilingual Corpora]]


*[http://dpc.inl.nl/indexd.php Online search]
<!--T:4-->
*[http://hdl.handle.net/10032/tm-a2-h3 Download page]
*[[Parallel Monolingual Corpora]]
*[https://www.kuleuven-kulak.be/dpc/en/ Project website]
</translate>

Latest revision as of 15:50, 19 March 2024

Parallel corpora are central to translation studies and contrastive linguistics. Many parallel corpora are accessible through easy-to-use concordancers, which considerably facilitates the study of interlinguistic phenomena. Such corpora are also a rich source of material for language teaching. Furthermore, parallel corpora serve as training data for machine translation systems.

Monolingual parallel corpora also exist; they contain, for example, sentences paired with their paraphrases or with their simplified forms.