Translations:Simplification Data/30/en: Difference between revisions

Latest revision as of 10:29, 3 December 2024

Message definition (Simplification Data)

The Synthetic Simplification Dataset was compiled within the Duidelijke Taal project and is based on the WR-P-E-I component (websites) of the SoNaR corpus. The dataset consists of three parts: 6,986 sentences from the SoNaR corpus, synthetic simplifications of those sentences generated by GPT-4, and sentence pairs that each combine one SoNaR sentence with its simplified version.
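
To illustrate how the third part relates to the first two, here is a minimal sketch in Python. It makes assumptions the description above does not confirm: the actual file format of the dataset is not stated, the sentences and their simplifications are assumed to be aligned by position, and the names `SentencePair` and `build_pairs` are hypothetical. The strings are placeholders, not sentences from the dataset.

```python
from typing import NamedTuple


class SentencePair(NamedTuple):
    """Hypothetical record for one entry of the sentence-pair part."""
    original: str    # sentence taken from the SoNaR part
    simplified: str  # its synthetic simplification


def build_pairs(originals: list[str], simplifications: list[str]) -> list[SentencePair]:
    # Assumption: item i of `simplifications` belongs to item i of `originals`.
    if len(originals) != len(simplifications):
        raise ValueError("each original sentence needs exactly one simplification")
    return [SentencePair(o, s) for o, s in zip(originals, simplifications)]


# Placeholder strings only; these are not taken from the dataset.
originals = ["<original sentence 1>", "<original sentence 2>"]
simplifications = ["<simplified sentence 1>", "<simplified sentence 2>"]

pairs = build_pairs(originals, simplifications)
```

Under these assumptions, the sentence-pair part adds no new text of its own: it only makes the one-to-one link between the other two parts explicit.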