Embeddings: Difference between revisions

Revision as of 09:23, 4 March 2022

Word2Vec embeddings

Repository for the word embeddings described in Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource, presented at LREC 2016.

Download page: https://github.com/clips/dutchembeddings
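
As a rough illustration of how word embeddings of this kind are typically queried, the sketch below parses vectors in the plain-text word2vec format (a "count dimension" header line, then one word and its vector per line) and ranks neighbours by cosine similarity. It assumes the downloaded files use that text format, which this page does not confirm; the sample vectors and the helper names `load_word2vec_text`, `cosine` and `most_similar` are invented for the example and are not part of the repository.

```python
import io
import math


def load_word2vec_text(handle):
    """Parse word2vec text format: a 'count dim' header, then 'word v1 ... vN' per line."""
    header = handle.readline().split()
    dim = int(header[1])
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(vectors, word, topn=3):
    """Return the topn (word, similarity) pairs closest to the given word."""
    query = vectors[word]
    scored = [(other, cosine(query, vec)) for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Toy vectors, invented for illustration only (not taken from the real embeddings).
SAMPLE = """4 3
koning 0.9 0.1 0.0
koningin 0.8 0.2 0.1
fiets 0.0 0.1 0.9
auto 0.1 0.0 0.8
"""

vectors = load_word2vec_text(io.StringIO(SAMPLE))
print(most_similar(vectors, "koning", topn=2))
```

For the real files, replacing the `io.StringIO(SAMPLE)` handle with an opened file should be enough if the format assumption holds. A linear scan like this is only practical for small vocabularies; a dedicated library would be the more usual choice for full-size models.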

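BERT embeddings

BERTje: https://arxiv.org/abs/1912.09582
RobBERT: https://people.cs.kuleuven.be/~pieter.delobelle/robbert/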

GeenStijl.nl embeddings

The GeenStijl.nl embeddings were trained on over 8M messages from the controversial Dutch websites GeenStijl and Dumpert, yielding a word embedding model that captures the toxic language representations contained in the dataset. The trained word embeddings (±150MB) are released for free and may be useful for further study of toxic online discourse.

Project page: https://www.textgain.com/portfolio/geenstijl-embeddings/
Report: https://www.textgain.com/wp-content/uploads/2021/06/TGTR4-geenstijl.pdf

@@ Line 1: / Line 1: @@
-* Word2Vec embeddings: https://github.com/clips/dutchembeddings
-* BERT embeddings
-** [https://arxiv.org/abs/1912.09582 BERTje]
-** [https://people.cs.kuleuven.be/~pieter.delobelle/robbert/ RobBERT]
+== Word2Vec embeddings==
+Repository for the word embeddings described in Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource, presented at LREC 2016.
+* [https://github.com/clips/dutchembeddings Download page]
+==BERT embeddings==
+*[https://arxiv.org/abs/1912.09582 BERTje]
+*[https://people.cs.kuleuven.be/~pieter.delobelle/robbert/ RobBERT]
+==GeenStijl.nl embeddings ==
+The GeenStijl.nl embeddings were trained on over 8M messages from the controversial Dutch websites GeenStijl and Dumpert, yielding a word embedding model that captures the toxic language representations contained in the dataset. The trained word embeddings (±150MB) are released for free and may be useful for further study of toxic online discourse.
+*[https://www.textgain.com/portfolio/geenstijl-embeddings/ Project page]
+*[https://www.textgain.com/wp-content/uploads/2021/06/TGTR4-geenstijl.pdf Report]
