Translations:Manually annotated corpora/13/en: Difference between revisions

From Clarin K-Centre
Latest revision as of 14:34, 14 March 2024

==Dutch Archaeology NER Training Dataset==
A manually annotated NER dataset, consisting of Dutch archaeological excavation reports. The following entity types are labelled: Artefacts, Time periods, Materials, Places (geographical locations), Archaeological contexts and Species.
The dataset is provided in the BIO format, with one token per line and empty lines marking sentence boundaries. Each line contains the token, its PoS tag, its morphological segmentation and finally the label, separated by spaces. The PoS tags and morphological segmentations are assigned by Frog.
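
The layout described above could be read with a short script along the following lines. This is a minimal sketch, not part of the dataset's own tooling: the function names, the sample rows, and the label names (`MAT`, `ART`, `PER`) are hypothetical placeholders, and it assumes that none of the four fields contains a space. The actual tag inventory and Frog output should be checked against the dataset files.

```python
from typing import Iterable, List, Tuple

# One row per token: (token, pos_tag, morphological_segmentation, label)
Row = Tuple[str, str, str, str]


def read_sentences(lines: Iterable[str]) -> List[List[Row]]:
    """Split space-separated 4-column lines into sentences at empty lines."""
    sentences: List[List[Row]] = []
    current: List[Row] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # An empty line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split(" ")
        if len(parts) != 4:
            # Assumption: exactly four fields with no internal spaces.
            raise ValueError(f"expected 4 fields, got {len(parts)}: {line!r}")
        current.append((parts[0], parts[1], parts[2], parts[3]))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence: List[Row]) -> List[Tuple[str, str]]:
    """Group B-/I- labels into (entity_type, text) pairs."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    etype = ""
    for token, _pos, _morph, label in sentence:
        if label.startswith("B-"):
            # A B- label always starts a new entity.
            if tokens:
                entities.append((etype, " ".join(tokens)))
            tokens, etype = [token], label[2:]
        elif label.startswith("I-") and tokens and label[2:] == etype:
            # An I- label continues the open entity of the same type.
            tokens.append(token)
        else:
            # "O" (or an I- label with no matching open entity) ends any open entity.
            if tokens:
                entities.append((etype, " ".join(tokens)))
            tokens, etype = [], ""
    if tokens:
        entities.append((etype, " ".join(tokens)))
    return entities


# Hypothetical sample rows; PoS tags, segmentations and labels are invented
# for illustration and do not come from the dataset.
SAMPLE = """\
Een LID [een] O
bronzen ADJ [brons][en] B-MAT
bijl N [bijl] B-ART
uit VZ [uit] O
de LID [de] O
late ADJ [laat][e] B-PER
bronstijd N [brons][tijd] I-PER

Vondst N [vondst] O
"""

if __name__ == "__main__":
    for sentence in read_sentences(SAMPLE.splitlines()):
        print(extract_entities(sentence))
```

To read a real file, pass an open file handle to `read_sentences`, since it accepts any iterable of lines.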
