A couple of years ago, there was a minor to-do in popular media over a study of speech rates. Scientific American, Time, The Economist, and many other outfits reported on a paper in the journal Language that seemed to indicate basic differences in speaking rates between languages. Apparently, Mandarin is the slowest and Japanese is the fastest, and the authors decided that this is probably due to differences in sound inventories. It makes intuitive sense, and fits in nicely with the basic concepts in Information Theory. Listeners probably have roughly the same capacity for understanding a certain amount of information in a given time frame, so if a single syllable carries less information, you’d want to increase the speaking rate in order to come up close to that capacity limit.
Yeah, but the devil’s in the details. There are some serious flaws in this study, so whether the hypothesis is correct or not, I don’t think the Language study provides nearly enough evidence for the rather strong claims that came out of popular media. What else is new, right?
First off, everything was done from a translation of a single, short English text. Five sentences is a damned small sample to serve as the basis for a theory of cross-linguistic information density.
And then consider that it’s been translated from just one language, rather than, say, multiple passages written in multiple languages translated into each other. Why not have a passage each, one translated from Mandarin, one from Japanese, and so on? Japanese people don’t talk about the same things that English speakers do, they don’t structure their speech acts in the same ways, they don’t tell stories with the same structure. So right out of the gate, you’re not comparing natural, spoken Japanese with English, but Japanese that’s been structured along English lines.
More troubling for me, though, was the overall quality of the translation. Here’s the original English passage:
Last night I opened the front door to let the cat out. It was such a beautiful evening that I wandered down the garden for a breath of fresh air. Then I heard a click as the door closed behind me. I realised I’d locked myself out. To cap it all, I was arrested while I was trying to force the door open!
Now, I’m not qualified to comment on the other languages, but here’s the text that they used for the Japanese sample:
昨夜、私は猫Weにだしてやるために玄関を開けてみると、あまりに気持のいい夜だったので、新鮮な空気をす吸おうと、ついふらっと庭へ降りたのです。する と後ろでドアが閉まって、カチャと言う音が聞こえ、自分自身を締め出してしまったことに気が付いたのです。挙句の果てに、私は無理矢理ドアをこじ開けよう としているところを逮捕されてしまったのです。
All errors presented as in the published paper. There are a few annoyances here, but the biggest for me is the unnecessary inclusion of words and morphemes that pad the length. They’ve put 私は at the beginning of two sentences, adding four grammatically unnecessary and unnatural syllables each time. There’s also a lot of use of してしまった, which I’ll agree is probably appropriate to the intended tone, but it’s also a nuance that isn’t really present in the English version and again adds four moras (three syllables). ‘To cap it all’ is translated as 挙句の果てに (ageku no hate ni, seven syllables) where 更に (sara ni, three syllables) would be completely adequate. All three sentences end with のです, adding three syllables, but you could drop that ending entirely and maintain the same meaning with a less explanatory/polite tone, which I think is absent from the English original anyway. I’ll just assume the ‘We’ is a typo, or 文字化け or something.
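For anyone who wants to check my arithmetic, here's a rough tally of that padding as a short Python sketch. The per-item syllable figures are the ones I gave above. The occurrence counts come from a plain substring search over the passage exactly as quoted, and matching on てしまった (so that it catches both 締め出してしまった and 逮捕されてしまった) is my own assumption about what ought to count. The names `PASSAGE`, `PADDING`, and `padding_syllables` are just made up for this illustration.

```python
# The Japanese sample, copied as quoted above (errors and stray spaces included).
PASSAGE = (
    "昨夜、私は猫Weにだしてやるために玄関を開けてみると、あまりに気持のいい夜だったので、"
    "新鮮な空気をす吸おうと、ついふらっと庭へ降りたのです。する と後ろでドアが閉まって、"
    "カチャと言う音が聞こえ、自分自身を締め出してしまったことに気が付いたのです。"
    "挙句の果てに、私は無理矢理ドアをこじ開けよう としているところを逮捕されてしまったのです。"
)

# (substring to search for, extra syllables per occurrence).
# Syllable figures are the ones stated in the post; the choice of
# substrings is an assumption made for this sketch.
PADDING = [
    ("私は", 4),            # wa-ta-shi-wa, droppable
    ("てしまった", 3),      # extra syllables compared with plain した
    ("挙句の果てに", 7 - 3),  # seven syllables where 更に (three) would do
    ("のです", 3),          # no-de-su, droppable
]


def padding_syllables(passage):
    """Sum the extra syllables contributed by each padding item in the passage."""
    total = 0
    for needle, extra in PADDING:
        count = passage.count(needle)
        print(f"{needle}: {extra} x {count} = {extra * count}")
        total += extra * count
    return total


print("total padding syllables:", padding_syllables(PASSAGE))
```

This is only a tally of the items I flagged, not a proper syllable count of the whole passage, so it says nothing exact about what fraction of the sample is padding.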
I’ll grant that it’s an attractive idea, and there’s some intuitive merit to the notion that languages would differ in syllable rate in order to maintain a more or less constant rate of information transfer. That seems logical. I just don’t think this paper demonstrates that sufficiently. If you want to make some kind of claim about Japanese speech rate, you need more than three poorly translated sentences.