Compound splitting: Difference between revisions

From Clarin K-Centre
*[https://lt3.ugent.be/compound-splitter-demo/ Demo]


<!--T:3-->
==CharSplit - An ngram-based compound splitter==
Python module that splits a compound into its body and head. So far German and Dutch are supported.


<!--T:4-->
*[https://pypi.org/project/compound-split/ Webpage]


<!--T:5-->
==Wordbuilder==
*[https://www.aclweb.org/anthology/L02-1004/ Vincent Vandeghinste (2002). Lexicon Optimization: Maximizing Lexical Coverage in Speech Recognition through Automated Compounding.] Proceedings of the Third International Conference on Language Resources and Evaluation (LREC2002). ELRA. Paris.
</translate>
</translate>

Latest revision as of 17:20, 3 April 2025

Compound splitter demo

A compound splitter splits compounds into their component parts, e.g. liefde+s+drank or [post+zegel]+verzamelaar. This demo accepts Dutch input of up to 500 characters, either as running text or as single words (one word per line). If you are interested in using the compound splitter for other purposes, contact Lieve.Macken@UGent.be.
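
The notation in the two examples above (parts joined by "+", a linking element such as "s" shown as its own part, and brackets around a part that is itself a compound) can be illustrated with a small Python sketch. This is a toy lexicon lookup written for illustration only; it is not the method used by the demo, by CharSplit, or by Wordbuilder. The names `LEXICON`, `LINKERS`, `split_body_head` and `bracket` are hypothetical, and the sketch assumes "s" is the only linking element.

```python
# Toy word list (assumption): a real splitter would rely on a large lexicon
# or a trained model rather than a handful of hand-picked entries.
LEXICON = {"liefde", "drank", "post", "zegel", "postzegel", "verzamelaar"}

# Linking elements tried between body and head (assumption: none, or "s").
LINKERS = ("", "s")


def split_body_head(word):
    """Return (body, linker, head) for the first split whose body and head
    are both in LEXICON, or None if no such split exists."""
    word = word.lower()
    for i in range(1, len(word)):
        body, head = word[:i], word[i:]
        if head not in LEXICON:
            continue
        for linker in LINKERS:
            if not body.endswith(linker):
                continue
            # Strip the linking element from the end of the body, if any.
            stem = body[: len(body) - len(linker)]
            if stem in LEXICON:
                return stem, linker, head
    return None


def bracket(word):
    """Render a word in the 'a+s+b' / '[a+b]+c' notation by splitting
    recursively; words that cannot be split are returned unchanged."""
    parts = split_body_head(word)
    if parts is None:
        return word.lower()
    body, linker, head = parts
    rendered = []
    for part in (body, head):
        sub = bracket(part)
        # Bracket a part only when it was itself split further.
        rendered.append(f"[{sub}]" if "+" in sub else sub)
    pieces = [rendered[0]] + ([linker] if linker else []) + [rendered[1]]
    return "+".join(pieces)


if __name__ == "__main__":
    for compound in ("liefdesdrank", "postzegelverzamelaar"):
        print(compound, "->", bracket(compound))
```

With this toy lexicon, the two example words from the text come out in the same notation as shown above. Words whose parts are missing from the lexicon are returned unsplit, which is the main limitation that n-gram or corpus-based splitters are meant to address.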

CharSplit - An ngram-based compound splitter

Python module that splits a compound into its body and head. So far German and Dutch are supported.

Wordbuilder