On Parsing CHILDES

Laakso, Aarre (2005) On Parsing CHILDES. [Preprint] (Unpublished)

Research on child language acquisition would benefit from the availability of a large body of syntactically parsed utterances between parents and children. We consider the problem of generating such a "treebank" from the CHILDES corpus, which currently contains primarily orthographically transcribed speech tagged for lexical category.

Item Type: Preprint
Additional Information: Submitted to Midwest Computational Linguistics Colloquium (MCLC) 4/10/2005.
Keywords: language acquisition, syntactic parsing, CHILDES database
Subjects: Linguistics > Computational Linguistics
ID Code: 4204
Deposited By: Laakso, Aarre
Deposited On: 12 Apr 2005
Last Modified: 11 Mar 2011 08:55

