This paper studies various aspects of child vocalization as captured in a newly established parallel corpus of sixteen US and Shanghainese toddlers aged 18–31 months. The recordings were acquired in 16-hour sessions during an 'ordinary' day in the child's natural environment and were manually labeled. The vocalization characteristics are studied by means of phonotactic and prosodic analysis, with emphasis on automatic processing. In the phonotactic domain, a Gaussian mixture model (GMM) tokenizer, a bank of phone recognizers, and formant tracking are used to analyze movements in the acoustic-phonetic space. In the prosodic domain, pitch patterns, duration, and rhythm are analyzed. Besides strong individual-specific characteristics of the subjects in some of the domains considered, the two language groups show differences in the occupation of the F1–F2 formant space, the choice of pitch pattern durations, and consistency in producing complex phonetic patterns.
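As an illustration of the GMM tokenizer idea mentioned above, the following is a minimal sketch in which each acoustic frame is mapped to the index of its highest-scoring mixture component. The abstract does not state the features, the number of mixtures, or any post-processing, so the toy parameters (`WEIGHTS`, `MEANS`, `VARIANCES`, `FRAMES`), the diagonal covariances, and the run-collapsing step are assumptions made for illustration only and are not taken from the paper.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def tokenize(frames, weights, means, variances):
    """Map each frame to the index of its highest-scoring mixture component."""
    tokens = []
    for x in frames:
        scores = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        tokens.append(max(range(len(scores)), key=scores.__getitem__))
    return tokens


def collapse_runs(tokens):
    """Merge consecutive repeats so only component changes remain.

    This step is an assumption of the sketch, not a detail from the paper.
    """
    out = []
    for t in tokens:
        if not out or out[-1] != t:
            out.append(t)
    return out


# Hypothetical toy model: 3 components over 2-D features (illustration only).
WEIGHTS = [0.5, 0.3, 0.2]
MEANS = [[0.0, 0.0], [3.0, 3.0], [-3.0, 3.0]]
VARIANCES = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

# Hypothetical frames; in practice these would be per-frame acoustic features.
FRAMES = [[0.1, -0.2], [0.3, 0.1], [2.8, 3.1], [3.2, 2.9], [-2.9, 3.0]]
TOKENS = tokenize(FRAMES, WEIGHTS, MEANS, VARIANCES)
```

In a sketch like this, the resulting token sequence could be counted as unigrams or bigrams to compare how speakers move through the acoustic space; a real system would first train the mixture on speech features rather than use hand-set parameters.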
Bibliographic reference. Bořil, Hynek / Zhang, Qian / Angkititrakul, Pongtep / Hansen, John H. L. / Xu, Dongxin / Gilkerson, Jill / Richards, Jeffrey A. (2013): "A preliminary study of child vocalization on a parallel corpus of US and Shanghainese toddlers", in INTERSPEECH-2013, 2405-2409.