Children's language: Difference between revisions

Revision as of 14:05, 11 December 2024

Jasmin Speech corpus

See spoken corpora

BasiLex-corpus

The BasiLex corpus is an annotated collection of texts written for children aged four to twelve.

BasiScript-corpus

The BasiScript corpus is an annotated collection of texts written by children aged four to twelve.

version 1.0 (2015)
Project page
Download page

CHILDES

CHILDES is a large collection of corpora: datasets of transcripts of child-adult interactions, typically annotated and searchable. The transcripts cover conversations, storytelling, and other linguistic exchanges, gathered from children speaking various languages, at various ages and in various contexts. A login is required.

index to CHILDES data from Dutch and Afrikaans (https://childes.talkbank.org/access/DutchAfrikaans/).
browse the Dutch database online

Subcorpora:

Dutch-English De Houwer Corpus: the study focuses on dialect features unique to the Antwerp area

The Asymmetries Project collection contains Dutch language productions gathered in Groningen and neighboring towns in the northern Netherlands between 2007 and 2012.

Aarssen/Bos: this database contains 1021 transcripts collected in the Netherlands, Turkey, and Morocco by Jeroen Aarssen and Petra Bos at Tilburg University. The bilingual data (either Turkish-Dutch or Moroccan Arabic-Dutch) were collected as part of a longitudinal study of the development of bilingualism among Turkish and Moroccan children in the Netherlands.

Revision as of 14:04, 11 December 2024 by Vincent (section: CHILDES), compared with revision as of 14:05, 11 December 2024 by Vincent (section: CHILDES)