In infant-directed speech (IDS), adults exaggerate speech features (extended pitch range, longer vowels, hyperarticulated vowels, etc.) relative to adult-directed speech (ADS) (Burnham et al., 2002). While reports that such exaggerations occur across languages - Swedish, Russian, and English (Kuhl et al., 1997), Thai and English (Kitamura et al., 2002), and Japanese and English (Andruski et al., 1999) - argue for the universality of this phenomenon, other results suggest cross-language differences: less IDS vs. ADS pitch elevation in British than in American English (Shute & Wheldall, 1989), and no IDS vs. ADS vowel space difference in Norwegian (Englund & Behne, 2005).
This study was motivated by the lack of universality in auditory-visual speech perception between Japanese and English adults. While visual information pervasively influences speech perception whenever it is available (McGurk & MacDonald, 1977), this influence is stronger in English than in Japanese adults (Sekiyama & Tohkura, 1993). This cross-language difference emerges between 6 and 8 years of age, when English, but not Japanese, children increase their use of visual speech information (Sekiyama & Burnham, 2004). Is it possible that this effect has its origin in parents' speech to their infants?
A Japanese mother and an Australian English mother were video-recorded speaking to their 5-month-old infants and to an adult, using four dolls named Boobaa, Baabaa, Biibaa, and Buubaa. Acoustic and phonetic analyses of these words showed that the mothers produced longer vowels in IDS than in ADS, with the difference most pronounced for /o:/ in English and for /a:/ and /u:/ in Japanese. Both the mean and the range of fundamental frequency (pitch) were greater in IDS than in ADS in English, but in Japanese only the mean was higher. Quadrilateral vowel areas calculated from the first and second formant values of the /o:/, /i:/, /a:/, and /u:/ vowels revealed vowel hyperarticulation in IDS in both Australian English and Japanese, though more so in Japanese. Not surprisingly, given the relative number of vowels (5 in Japanese, 14 in English), the Japanese vowel space was generally larger and, interestingly, Japanese IDS showed less overlap between individual vowel ellipses than did English IDS. It is possible that the differences between Japanese and English in the use of visual information in speech perception are due to the greater ease with which Japanese vowels can be perceived from auditory-only information, which would reduce the need for visual information. Further data collection and further analysis of the degree of visual hyperarticulation in speech are in progress.
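As an illustration of the vowel-area measure mentioned above, the following is a minimal sketch of how a quadrilateral area could be computed from mean (F2, F1) values using the shoelace formula. The function name, the vertex ordering, and all formant values are assumptions made for this example; the numbers are arbitrary placeholders, not measurements from this study, and the abstract does not specify the exact procedure used.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (F2 in Hz, F1 in Hz), one mean point per vowel


def polygon_area(vertices: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula.

    Vertices must be listed in order around the perimeter.
    """
    n = len(vertices)
    twice_area = 0.0
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Placeholder formant means (NOT data from the study), ordered around the
# perimeter of the vowel space: /i:/ -> /a:/ -> /o:/ -> /u:/.
ads_vowels = [(2300.0, 300.0), (1300.0, 800.0), (900.0, 500.0), (1000.0, 300.0)]
ids_vowels = [(2500.0, 280.0), (1350.0, 900.0), (850.0, 520.0), (950.0, 290.0)]

ads_area = polygon_area(ads_vowels)
ids_area = polygon_area(ids_vowels)

# A ratio above 1 would indicate a larger (hyperarticulated) vowel space in IDS.
print(f"ADS area: {ads_area:.0f} Hz^2")
print(f"IDS area: {ids_area:.0f} Hz^2")
print(f"IDS/ADS ratio: {ids_area / ads_area:.2f}")
```

Under this sketch, comparing the IDS and ADS areas for each speaker gives a single index of hyperarticulation; the vowels must be supplied in perimeter order, since a crossed ordering would yield a self-intersecting shape and a misleading area.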