Jump to content

Translations:Embeddings/7/en

From Clarin K-Centre
Revision as of 12:33, 26 March 2024 by FuzzyBot (talk | contribs) (Importing a new version from external source)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

GeenStijl.nl embeddings

GeenStijl.nl embeddings contains over 8M messages from the controversial Dutch websites GeenStijl and Dumpert to train a word embedding model that captures the toxic language representations contained in the dataset. The trained word embeddings (±150MB) are released for free and may be useful for further study on toxic online discourse.