
Children's language: Difference between revisions

From Clarin K-Centre
No edit summary
 
(2 intermediate revisions by one other user not shown)
Line 1: Line 1:
<translate>
<translate>
==Jasmin Speech corpus==
==JASMIN Speech corpus==


[https://kdutch.ivdnt.org/wiki/Spoken_corpora#JASMIN-spraakcorpus See spoken corpora]
[https://kdutch.ivdnt.org/wiki/Spoken_corpora#JASMIN-spraakcorpus See spoken corpora]
Line 21: Line 21:
==CHILDES==
==CHILDES==


CHILDES contains a large collection of corpora, which are datasets of transcripts of child-adult interactions, typically annotated and searchable. These include conversations, storytelling, and other linguistic exchanges, gathered from children of various languages, ages, and contexts.
CHILDES contains a large collection of corpora, which are datasets of transcripts of child-adult interactions, typically annotated and searchable. These include conversations, storytelling, and other linguistic exchanges, gathered from children of various languages, ages, and contexts. A login is required.


*[https://childes.talkbank.org/access/DutchAfrikaans/ index to CHILDES data] from Dutch and Afrikaans.  
*[https://childes.talkbank.org/access/DutchAfrikaans/ index to CHILDES data] from Dutch and Afrikaans.  
Line 27: Line 27:


Subcorpora:  
Subcorpora:  
*[https://childes.talkbank.org/access/Biling/DeHouwer.html Dutch-English De Houwer Corpus]
*[https://childes.talkbank.org/access/Biling/DeHouwer.html Dutch-English De Houwer Corpus]: the study focuses on dialect features unique to the Antwerp area


*[https://childes.talkbank.org/access/DutchAfrikaans/Asymmetries.html The Asymmetries Project] collection contains Dutch language productions gathered in Groningen and neighboring towns in the northern Netherlands, between 2007 and 2012


*[https://childes.talkbank.org/access/Frogs/Dutch-AarssenBos.html Aarssen/Bos] This database contains 1021 transcripts collected in the Netherlands, Turkey, and Morocco by Jeroen Aarssen and Petra Bos, at Tilburg University. Bilingual data (either Turkish-Dutch or Moroccan Arabic-Dutch) were collected within the framework of a longitudinal study into development of bilingualism among Turkish and Moroccan children in the Netherlands.
</translate>
</translate>

Latest revision as of 16:13, 6 February 2025

JASMIN Speech corpus

See spoken corpora

BasiLex-corpus

The BasiLex corpus is an annotated collection of texts written for children aged four to twelve.

BasiScript-corpus

The BasiScript corpus is an annotated collection of texts written by children aged four to twelve.

CHILDES

CHILDES is a large collection of corpora: datasets of transcripts of child-adult interactions, typically annotated and searchable. They include conversations, storytelling, and other linguistic exchanges, gathered from children across a range of languages, ages, and contexts. A login is required.

Subcorpora:

  • The Asymmetries Project: this collection contains Dutch-language productions gathered in Groningen and neighboring towns in the northern Netherlands between 2007 and 2012.
  • Aarssen/Bos: this database contains 1021 transcripts collected in the Netherlands, Turkey, and Morocco by Jeroen Aarssen and Petra Bos at Tilburg University. The bilingual data (either Turkish-Dutch or Moroccan Arabic-Dutch) were collected within the framework of a longitudinal study into the development of bilingualism among Turkish and Moroccan children in the Netherlands.