Children's language

From Clarin K-Centre
Revision as of 14:04, 11 December 2024 by Vincent

Jasmin Speech corpus

See spoken corpora

BasiLex-corpus

The BasiLex corpus is an annotated collection of texts written for children aged four to twelve.

BasiScript-corpus

The BasiScript corpus is an annotated collection of texts written by children aged four to twelve.

CHILDES

CHILDES contains a large collection of corpora: datasets of transcripts of child-adult interactions, typically annotated and searchable. These include conversations, storytelling, and other linguistic exchanges, gathered from children of various ages, in various contexts, speaking a range of languages. A login is required.


Subcorpora:

  • The Asymmetries Project collection contains Dutch-language productions gathered in Groningen and neighboring towns in the northern Netherlands between 2007 and 2012.
  • Aarssen/Bos: This database contains 1021 transcripts collected in the Netherlands, Turkey, and Morocco by Jeroen Aarssen and Petra Bos at Tilburg University. Bilingual data (either Turkish-Dutch or Moroccan Arabic-Dutch) were collected within the framework of a longitudinal study into the development of bilingualism among Turkish and Moroccan children in the Netherlands.