Newspaper corpora

From Clarin K-Centre
Revision as of 09:57, 2 March 2021 by Vincent (talk | contribs)

Newspaper corpora are corpora that consist exclusively of newspaper material.

SumNL: summary-corpus

The SumNL summary corpus is based on 30 clusters. Each cluster consists of a topic and 5-25 newspaper articles relevant to that topic. For each cluster, two summaries of different sizes were made, as well as extracts consisting of ten sentences taken from the texts.
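
The cluster layout described above can be pictured as a small data structure. The following is only a minimal sketch: the class name `Cluster`, its field names, and the `validate` helper are hypothetical and do not reflect the actual file format or distribution of SumNL. Summary size is measured here in words, which is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Hypothetical in-memory view of one cluster (not the real SumNL format)."""
    topic: str
    articles: List[str]                                       # 5-25 newspaper articles
    summaries: List[str] = field(default_factory=list)        # two summaries of different sizes
    extracts: List[List[str]] = field(default_factory=list)   # each extract holds ten sentences

    def validate(self) -> List[str]:
        """Return a list of mismatches with the description; empty if none."""
        problems = []
        if not 5 <= len(self.articles) <= 25:
            problems.append(f"expected 5-25 articles, got {len(self.articles)}")
        if len(self.summaries) != 2:
            problems.append(f"expected 2 summaries, got {len(self.summaries)}")
        else:
            # Assumption: "size" is compared as a whitespace-separated word count.
            sizes = [len(s.split()) for s in self.summaries]
            if sizes[0] == sizes[1]:
                problems.append("the two summaries should differ in size")
        for i, extract in enumerate(self.extracts):
            if len(extract) != 10:
                problems.append(f"extract {i} has {len(extract)} sentences, expected 10")
        return problems
```

A full corpus would then be a list of 30 such clusters, and calling `validate()` on each one is a quick way to check that loaded data matches the description.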