Translations:Reference corpora/8/en: Difference between revisions
Latest revision as of 15:57, 19 March 2024
SoNaR-500 contains more than 500 million words of text from various domains and genres. All texts were tokenized, part-of-speech (POS) tagged, and lemmatized, and named entities were labeled. All SoNaR-500 annotations were generated automatically.