Japanese Speech Rate: How Fast Do Native Speakers Actually Talk?

Japanese speech rate sits near the top of the cross-linguistic speed scale. In controlled reading, Japanese reaches roughly 7.84 syllables per second, statistically tied with Spanish for the lead in the seven-language sample of Pellegrino, Coupé, and Marsico.¹ In spontaneous speech, the Corpus of Spontaneous Japanese measures about 8 morae per second.² What feels overwhelming is not information density but the mora-timed pulse. The gap between JLPT listening audio and an izakaya conversation is real and measurable.

Overview

What "speech rate" actually measures

Speech rate counts speech units over time. The unit has to be specified: syllables per second, words per minute, phonemes per second, or, for Japanese, morae per second (拍/秒 or モーラ/秒). Cross-language studies usually report syllables per second because the syllable is a robust unit that can be counted in every language.¹

"Information rate" is separate: information density (bits per syllable) multiplied by syllabic rate (syllables per second) yields bits per second.¹ A fast syllabic rate paired with thin syllables can still produce a moderate information rate.

For Japanese, the meaningful counting unit is the mora, not the syllable. A mora is roughly one kana character. Small ゃ・ゅ・ょ form part of the preceding mora, while long-vowel second halves, the moraic nasal ん, and the geminate first half っ each count as one mora on their own.³

東京（とうきょう）³
"Tokyo." (four morae: と・う・きょ・う; only two syllables)

Kubozono treats the mora and syllable as distinct phonological units that co-exist in Japanese, with the mora serving as the timing unit.³ Vance describes each mora as occupying roughly one beat in spoken Japanese.⁴ The companion article on the mora-vs-syllable distinction explains this foundation in more detail.
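The counting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a corpus-grade tool: the function names are hypothetical, the input is assumed to be written entirely in kana (kanji are skipped, so they must be converted to their readings first), and the small-kana set is an assumption that covers the common cases.

```python
# Small kana that merge into the preceding mora. Assumption: this set covers
# the common cases. Small っ/ッ is deliberately absent because it counts.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def is_kana(ch: str) -> bool:
    """True for hiragana, katakana, and the long-vowel mark ー."""
    return "ぁ" <= ch <= "ゖ" or "ァ" <= ch <= "ヺ" or ch == "ー"


def count_morae(kana: str) -> int:
    """Count morae in a kana string, skipping punctuation and non-kana."""
    return sum(1 for ch in kana if is_kana(ch) and ch not in SMALL_KANA)


for word in ["とうきょう", "いきませんでした", "ありがとうございます"]:
    print(word, count_morae(word))
```

With these inputs the sketch reports 4, 8, and 10 morae, matching a hand count: ん, っ, and the second half of a long vowel each add one, while small ゃ・ゅ・ょ add nothing.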

Why this article uses morae per second

Otake, Hatano, Cutler, and Mehler demonstrated that Japanese listeners parse the speech stream in mora-sized units rather than syllable-sized units.⁵ Counting in morae aligns the measurement to the unit the native ear actually uses.

The three-tier scale this article will defend

Casual native conversation sits at roughly 7–8 morae per second, anchored to the Corpus of Spontaneous Japanese mean of 8.01 morae per second.² NHK news anchors read at a deliberately slower broadcast pace. This pace comes from the in-house target of 300 字/分 and corresponds to roughly 5–6 morae per second for typical news prose.⁶⁷ JLPT listening audio sits at or below NHK news pace at every level: the Japan Foundation describes N5 audio as "slowly and clearly spoken" and N1 audio as delivered at "natural speed," but no numerical target is published.⁸⁹

The three tiers are a learner number line, not a ranking. The point is to locate the current bottleneck on it.

Why this matters for the learner

The JLPT listening register sits below NHK news, and NHK news sits below ordinary conversation. The shadowing-pedagogy literature places the adult-learner bottleneck specifically in real-time bottom-up parsing at native rate, not in lexical or grammatical knowledge.¹⁰¹¹ Tamai's foundational 1992 study comparing shadowing and dictation reported significantly larger listening-proficiency gains from shadowing in lower-proficiency learners. That is the group most prone to the "wall of sound" experience.¹⁰

How fast is Japanese, really

The 7.84 syllables-per-second finding

Pellegrino, Coupé, and Marsico measured the syllabic rate of seven languages using the MULTEXT multilingual corpus, with Vietnamese added as an eighth, external reference language. The data included 20 short texts, 59 adult speakers (29 male, 30 female), 585 recordings, and roughly 150 minutes of speech read at "normal" rates.¹

Language	Syllabic rate (syl/sec)	95% CI
Japanese	7.84	±0.09
Spanish	7.82	±0.16
French	7.18	±0.12
Italian	6.99	±0.23
English	6.19	±0.16
German	5.97	±0.19
Vietnamese	5.22	±0.08
Mandarin	5.18	±0.15

Source: Pellegrino, Coupé, and Marsico (2011), Table 2.¹

Japanese and Spanish form a statistically indistinguishable cluster at the top. Mandarin and Vietnamese sit at the bottom.¹
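One rough way to see the clustering is to check whether the 95% confidence intervals in the table overlap. The sketch below is only a heuristic on the published means and half-widths: overlapping intervals are not the statistical test used in the paper, and the helper names are hypothetical.

```python
# (mean syllables/sec, 95% CI half-width), copied from the table above.
RATES = {
    "Japanese": (7.84, 0.09),
    "Spanish": (7.82, 0.16),
    "French": (7.18, 0.12),
    "Italian": (6.99, 0.23),
    "English": (6.19, 0.16),
    "German": (5.97, 0.19),
    "Vietnamese": (5.22, 0.08),
    "Mandarin": (5.18, 0.15),
}


def interval(lang: str) -> tuple[float, float]:
    """Lower and upper bound of the 95% CI for one language."""
    mean, half_width = RATES[lang]
    return mean - half_width, mean + half_width


def intervals_overlap(a: str, b: str) -> bool:
    """True if the two languages' confidence intervals share any value."""
    lo_a, hi_a = interval(a)
    lo_b, hi_b = interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


print(intervals_overlap("Japanese", "Spanish"))     # top cluster
print(intervals_overlap("Spanish", "French"))       # gap below the top cluster
print(intervals_overlap("Vietnamese", "Mandarin"))  # bottom cluster
```

On these figures the Japanese and Spanish intervals overlap, as do the Vietnamese and Mandarin intervals, while Spanish and French do not.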

Read speech, not casual conversation

The 7.84 figure is from a controlled reading task. The authors note that read speech "most likely underestimated the natural variation encountered in social interactions."¹ For casual conversation, the Corpus of Spontaneous Japanese mean of 8.01 morae per second is the right anchor.²

Morae vs. syllables: the same data through a Japanese-native lens

The Corpus of Spontaneous Japanese (CSJ), maintained by NINJAL, is the standard quantitative reference for mora-per-second measurements in Japanese.¹²¹³ Maekawa reports a mean speaking rate of 8.01 morae per second across CSJ. The older ATR read-speech database has a mean of 7.11 morae per second.² By this measurement, spontaneous Japanese is faster than read Japanese.

Marushima reports independently that "ordinary" speech in healthy adult Japanese speakers lands at roughly 6–7 morae per second, "fast" speech at 9–10 morae per second, and "slow" speech below 5 morae per second.¹⁴ The two reference points do not agree perfectly. CSJ is dominated by academic-presentation and simulated-public-speaking material rather than free dialog, while Marushima's "ordinary" band reflects a more controlled register.

The ~7–8 morae per second tier this article uses for casual native speech brackets the gap between those two anchors.

A note on the conversion check

Japanese syllables in the Pellegrino corpus tend to be light: a single consonant plus vowel, or a bare vowel. In news-style prose, a Japanese syllable typically maps to about one mora because heavy syllables (those containing the moraic nasal ん, the geminate first half っ, or a long-vowel second half) are minorities in everyday vocabulary. The 7.84 syl/sec figure from controlled reading therefore aligns plausibly with the 8.01 mora/sec spontaneous figure once the register difference is taken into account.¹²

NHK news and other broadcast registers

NHK news anchors read at a standard pace of about 300–350 字/分 (characters per minute), with 300 字/分 widely treated as the in-house target shared among announcers, scriptwriters, and directors.⁶⁷

The conversion from 字/分 to morae per second is not direct. In NHK broadcast prose, each kana counts as one mora and each kanji corresponds to one to three morae depending on its reading (政府 = せ・い・ふ; 事故 = じ・こ). Kanji are the minority of characters in typical news copy, which is heavily padded with kana particles, suffixes, and function words. A weighted average of roughly 1.1 morae per character across mixed news prose yields about 330–385 morae per minute for 300–350 字/分, or roughly 5.5–6.4 morae per second, which this article rounds to 5–6.

That chain runs from the in-house 字/分 target through the character-to-mora ratio to the morae-per-second band used throughout this article. The 字/分 figure is sourced; the conversion is a defensible estimate, not a directly measured corpus value.⁶⁷

Commercial broadcasters routinely read faster, with figures exceeding 500 字/分 reported in industry summaries, which by the same conversion corresponds to roughly 9 morae per second and already overlaps the casual-conversation band.⁶⁷
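The conversion used in this section is simple enough to write out. The sketch below assumes the article's estimated ratio of 1.1 morae per character; that ratio is an estimate, not a measured value, and the function name is hypothetical.

```python
def morae_per_second(chars_per_minute: float, morae_per_char: float = 1.1) -> float:
    """Convert a 字/分 reading pace to morae per second.

    morae_per_char defaults to the article's estimated ratio for mixed
    kana-kanji news prose; change it to test other assumptions.
    """
    return chars_per_minute * morae_per_char / 60.0


for cpm in (300, 350, 500):
    print(f"{cpm} 字/分 ≈ {morae_per_second(cpm):.1f} morae/sec")
```

Under that assumption, 300 字/分 comes out at about 5.5 morae per second, 350 字/分 at about 6.4, and 500 字/分 at about 9.2. Raising or lowering the ratio shifts every band proportionally, which is why the morae-per-second figures for broadcast speech should be read as estimates.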

JLPT listening: where the test sits on the scale

The Japan Foundation describes the listening register at each JLPT level only in qualitative terms:⁸⁹

Level	Official description of listening audio
N5	"slowly and clearly spoken short conversations and expressions"
N4	"slowly spoken everyday conversations"
N3	comprehension at "near-natural speed" (やや自然に近いスピード)
N2	"spoken at nearly natural speed"
N1	"spoken at natural speed"

No numerical mora-per-second or 字/分 target is published for any level.⁹ Popular pedagogical writing often treats N5 audio as a low-single-digit morae-per-second range and N1 audio as roughly broadcast-news pace. Those are observed conventions inferred from the official sample audio, not Japan Foundation specifications. This reading is consistent with the Japan Foundation's own language: "natural speed" rather than "casual conversational speed" places even N1 audio in the NHK-news register, not in the spontaneous-conversation register.⁸

JLPT listening is not casual-conversation listening

Even at N1, JLPT audio is engineered prose read with broadcast-grade articulation. Phrase boundaries are marked by silent pauses. Passing N1 listening confirms broadcast-pace comprehension; it does not confirm casual-conversation comprehension. The gap is structural, not personal.⁸

The information-rate equilibrium

Pellegrino, Coupé, and Marsico report an information density for Japanese of 0.49 (Vietnamese = 1.00 reference), the lowest in the seven-language sample. Combined with the high syllabic rate and expressed on the same Vietnamese-relative scale, the Japanese information rate is 0.74, also the lowest in the sample. The English information rate is 1.08.¹
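The reported figures can be sanity-checked with a back-of-envelope calculation. The sketch below assumes the relative information rate can be approximated as information density times syllabic rate, divided by the Vietnamese syllabic rate so that Vietnamese comes out at 1.00. That normalization is an assumption made here for illustration; the paper's own procedure may differ in detail. The English density of 0.91 is the value quoted later in this article.

```python
VIETNAMESE_RATE = 5.22  # syl/sec, the reference language in the table


def relative_information_rate(density: float, syllabic_rate: float,
                              reference_rate: float = VIETNAMESE_RATE) -> float:
    """Approximate information rate relative to Vietnamese (= 1.00)."""
    return density * syllabic_rate / reference_rate


japanese = relative_information_rate(density=0.49, syllabic_rate=7.84)
english = relative_information_rate(density=0.91, syllabic_rate=6.19)

print(f"Japanese: {japanese:.2f}")
print(f"English:  {english:.2f}")
```

Under this approximation the results round to 0.74 for Japanese and 1.08 for English, in line with the reported values: a fast syllabic rate does not offset a low per-syllable density.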

The widely repeated reformulation is: "languages with thin syllables speed up; languages with dense syllables slow down; the product is approximately constant." The 2011 paper supports this with a statistically significant negative correlation between density and rate. However, the paper explicitly rejects the stricter claim of constant information rate across the seven languages. Japanese sits significantly below the other six on information rate.¹

The Coupé, Oh, Dediu, and Pellegrino 2019 follow-up extends the design to seventeen languages and 170 speakers. It concludes that information rates cluster near 39 bits per second across this broader sample. That supports an "average information rate" convergence at the scale of the human linguistic niche, although individual languages still differ from the central value.¹⁵

The 39 bits/sec figure is from 2019, not 2011

The ~39 bits per second invariant is the Coupé et al. 2019 result on seventeen languages, not the Pellegrino et al. 2011 result on seven.¹¹⁵ The 2011 paper documents the syllabic-rate ranking that puts Japanese on top; the 2019 paper is the one that supports the cross-linguistic information-rate convergence.

Why Japanese feels faster than its information rate

Mora-timing as a metronome

Japanese has been described typologically as mora-timed since Bloch (1950), who treated the mora as the temporal unit of speech.¹⁶ Pike's earlier dichotomy of stress-timed versus syllable-timed languages was later extended to a third category, mora-timing, with Japanese as the standard example. Han (1962) provided the foundational empirical work on durational compensation within the Japanese mora.¹⁷¹⁸ The companion article on stress versus pitch in Japanese covers the typological contrast in more detail.

The strong-isochrony version of mora-timing (every mora has exactly equal duration) is not supported by modern measurements. Warner and Arai's review concludes that the perception of mora-isochrony in Japanese arises from phonological structure and listener categorization rather than from strict durational regularity in production.¹⁹

The weaker, well-supported claim is the one that matters for the learner ear: duration varies much less across Japanese morae than across English unstressed and stressed syllables. To an English-trained ear, Japanese speech arrives as a steady, fast pulse with no rest points.⁴¹⁹

Open syllables and the absence of consonant clusters

The Japanese mora inventory is dominated by open CV morae, meaning consonant + vowel, plus a small set of "deficient" morae (the moraic nasal ん, the geminate first half っ, and long-vowel second halves). Kubozono characterizes the prototypical Japanese mora as CV.³

Vance describes Japanese morae as built around open syllables with effectively no native consonant-cluster onsets. That contrasts with English syllables like "strengths" /strɛŋθs/ (one syllable, seven segments) or "sixths" /sɪksθs/.⁴ The articulatory load that an English syllable can carry as a cluster has to spread across multiple morae in Japanese. Morae pile up faster than English syllables not because the speaker is rushing, but because each one is mechanically simpler and shorter.

No reduced vowels, but devoiced vowels still pass

Japanese does not have schwa-style vowel reduction in unstressed positions. Its five-vowel system is roughly stable in duration across positions. This contrasts with English, where unstressed syllables shorten and centralize toward schwa.⁴

High vowels /i/ and /u/ between voiceless consonants undergo regular devoicing in standard Tokyo Japanese. The devoiced segment retains its mora slot in timing but loses voicing, trimming acoustic mass without removing the unit. The mechanism is covered in the article on Japanese vowel devoicing.⁴

今日（きょう）は寒（さむ）いです。⁴
"It is cold today."

The final です routinely devoices to [des]. The last "syllable" of the polite predicate can vanish acoustically while keeping its mora slot.

Sentence-final compression and the "wall of sound" effect

Japanese sentences cluster predicate inflection at the right edge: です・ます・ました・たい・みたい・ようだ・だろう and so on. These predicate suffixes are mora-heavy strings that ride a falling pitch contour at the end of the sentence.⁴ The article on Japanese sentence intonation covers that falling contour.

Devoicing applies especially often in the right-edge predicate cluster (です → [des], -ました → [maɕi̥ta]). As a result, the rightmost stretch of a Japanese sentence is both the fastest and the least acoustically prominent. The densest, lowest-prominence portion of the prosodic contour is precisely where the grammatical information lives.

行（い）きませんでした。⁴
"I did not go."

The eight-mora chain from verb stem through final inflection (い・き・ま・せ・ん・で・し・た) clusters on a falling pitch. The ません-でした tail is routinely the densest stretch of the sentence.

食（た）べたいみたいですね。⁴
"It seems they want to eat, doesn't it?"

Eight of the utterance's ten morae are stacked sentence-final modality (desiderative + evidential + copula + final particle) attached to a two-mora verb stem. This is a textbook example of the predicate clusters Vance treats as the locus of perceived sentence-final speed-up.

雨（あめ）が降（ふ）ったようです。⁴
"It seems to have rained."

The ようです tail is the right-edge cluster the learner ear has to parse last, after carrying the lexical content of 雨 and 降った through the rest of the sentence.

Pitch-flat speech without rest points

Japanese pitch accent is a high-versus-low pattern over morae, not a stress system with stressed syllables that get extra duration and intensity.⁴³ English uses long stressed beats as audible landmarks in the speech stream. Japanese has no analogous beat-anchored landmark. The beginner's guide to Japanese pitch accent covers the full mechanics.

For the learner ear, the consequence is clear: a Japanese phrase delivered at 7–8 morae per second arrives as a stream of comparably weighted units with a falling tail, not as a sequence of clearly punctuated stress beats.⁴¹⁹

What slows speech down (and what does not)

Register: keigo and customer-service speech

Honorific predicates and set service phrases are mora-heavy strings: いらっしゃいませ is seven morae for one greeting; ありがとうございます is ten; ご利用ください is eight. Their length forces measurable slowdown relative to casual equivalents (どうも, ありがと, 使って).⁴

いらっしゃいませ。⁴
"Welcome."

ありがとうございます。⁴
"Thank you."

少々（しょうしょう）お待（ま）ちください。⁴
"Please wait a moment."

Marushima's "ordinary" 6–7 morae per second figure reflects controlled, formal-register reading; the CSJ 8.01 morae per second for spontaneous speech is roughly one to two morae per second faster.¹⁴² The drop between casual and formal is consistent across both reference points.

Genre: news vs. drama vs. comedy vs. variety

News anchors are the slow end of broadcast speech at roughly 300–350 字/分 (~5–6 morae per second under the conversion above).⁶⁷ Commercial-station fast-news exceeds 500 字/分 (~9 morae per second) and already overlaps conversational pace.⁶⁷

The CSJ corpus documents internal rate differences between academic-presentation speech (APS, 838 speakers, ~299.5 hours), simulated-public-speaking (SPS, 580 speakers, ~324.1 hours), and free dialog. Within APS, the engineering subset is measurably faster than humanities. Maekawa attributes the difference to time pressure under fixed presentation slots.²²⁰

Variety-show banter and stand-up comedy are the practical ceiling for sustained spoken Japanese. Marushima's 9–10 morae per second "fast speech" band corresponds to this register.¹⁴

Speaker variables: age, region, individual range

Iwamoto, Kikuchi, and Mazuka document that Japanese-speaking children reach adult-like word duration before age 11, with rhythmic-timing control continuing to develop until approximately age 13. Speech rate therefore settles at adult values during adolescence.²¹

Marushima's measurements treat 6–7 morae per second as the population "ordinary" band and 9–10 morae per second as the "fast" band. Within-speaker variation across emotional and pragmatic contexts can span much of that range.¹⁴

Kansai-as-faster is folk perception, not measured fact

The perception of Kansai dialect speech as faster than Tokyo standard circulates widely. However, no controlled mora-per-second comparison appears in the sources collected for this article. The perception is documented; the acoustic measurement is scarce. Treat the claim as folk perception unless a controlled regional-dialect rate study replaces it.

What does not slow speech down: speaking to a foreigner

The foreigner-directed-speech literature describes the changes native speakers make when addressing non-native listeners. Uther, Knoll, and Burnham report that foreigner-directed speech involves vowel-space expansion, higher pitch, slower rate, and lexical simplification compared to native-to-native speech.²²

The "slower rate" component is real but small in studies that control for setting. The larger changes are pitch elevation and lexical simplification. Expecting Japanese natives to slow down dramatically for a learner outside structured contexts overstates the rate-modification effect documented in the literature.²²

Closing the gap: training to native rate

The shadowing premise

Shadowing is real-time vocal repetition of an audio source with a short or zero lag. Tamai introduced shadowing to the Japanese-language pedagogy literature by comparing it with dictation. He reported statistically significant gains in listening proficiency from shadowing, especially among lower-proficiency learners.¹⁰ The companion article on daily Japanese pronunciation drills packages short-passage shadowing into a 5-minute routine.

Kadota framed shadowing as a workout for the phonological loop, the short-term auditory buffer that holds incoming sound material long enough to parse it. Simultaneous listening and articulation force the perceptual-motor system to operate at native timing, rather than at the slower comprehension-first pace of dictation or translation.²³

Hamada's procedural work packages shadowing into staged classroom protocols, with warm-up listening, mumbling, and full-voice synchronization stages.²⁴¹¹

Pick a tier, not a clip

The published shadowing procedures imply selection by tier rather than by topic interest. Both Tamai and Kadota stress that the audio must match the learner's current proficiency, not a level above it, because excess rate overloads the phonological loop.¹⁰²³

A practical tier ladder for Japanese learners starts with JLPT-listening audio, where publisher-provided practice audio is the standard source. It then moves through NHK News Web Easy, ordinary NHK news, drama or podcast dialog, and finally variety-show banter. The Kurosio "Shadowing: Let's Master Conversational Japanese" series sequences its audio by JLPT level (Beginner-to-Intermediate, then Intermediate-to-Advanced), which is one publisher implementation of the principle.¹⁰¹¹²³

The ladder is a sequence, not a menu. Skipping tiers tends to break the phonological loop the protocol relies on.

The three shadowing modes

Kadota and Hamada describe successive stages applied to the same clip, with terminology varying slightly between authors:²⁴¹¹²³

Mumbling or murmur shadowing. Subvocal, low volume. The learner focuses on tracking timing without comprehension pressure.
Synchronized or overlap shadowing. Full voice, in sync with the source. The learner matches rhythm and pitch.
Prosody or content shadowing with delay. One to two morae behind the source. The learner now holds material in the phonological loop long enough to test chunking.

In the shadowing-pedagogy literature, standard session length is roughly 15–20 minutes daily for measurable change. Kadota's three-to-five-hours-per-week recommendation corresponds to six to ten 30-minute sessions.¹⁰²³
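For the delayed stage described above, a lag of "one to two morae" can be translated into milliseconds at a given speech rate. The sketch below assumes every mora takes the same time, which is a simplification (measured mora durations are not strictly equal), and the function name is hypothetical.

```python
def lag_ms(morae_behind: float, rate_morae_per_sec: float) -> float:
    """Approximate delay in milliseconds for a lag counted in morae."""
    if rate_morae_per_sec <= 0:
        raise ValueError("rate must be positive")
    return morae_behind / rate_morae_per_sec * 1000.0


# 5.5 is the broadcast-pace estimate and 8.0 the spontaneous-speech anchor
# used in this article.
for rate in (5.5, 8.0):
    print(f"{rate} morae/sec: {lag_ms(1, rate):.0f}-{lag_ms(2, rate):.0f} ms")
```

At about 8 morae per second the lag is roughly 125 to 250 ms, and at broadcast pace it is somewhat longer. The window the learner has to hold material therefore shrinks as the tier gets faster.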

Track morae per second, not minutes per day

For rate-side listening training, the natural progress metric is morae per second on a known clip, not minutes per day of practice. The arithmetic is the same as the corpus method: time a 30-second clip, count morae (one per kana, including ん, っ, and long-vowel second halves, but with small ゃ・ゅ・ょ folded into the preceding kana), and divide the count by the duration in seconds.²¹⁴
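The timing arithmetic can be kept in a small script. This is a minimal sketch with hypothetical names and made-up log entries; it assumes the morae have already been counted by hand or by another tool, and it divides by the full clip length, so pauses are included in the rate.

```python
def clip_rate(mora_count: int, seconds: float) -> float:
    """Morae per second for one clip, pauses included."""
    if seconds <= 0:
        raise ValueError("clip duration must be positive")
    return mora_count / seconds


# Hypothetical log entries: (clip label, morae counted, clip length in seconds).
practice_log = [
    ("news clip", 165, 30.0),
    ("drama clip", 210, 30.0),
    ("banter clip", 240, 30.0),
]

for label, morae, seconds in practice_log:
    print(f"{label}: {clip_rate(morae, seconds):.2f} morae/sec")
```

Logging the rate of each clip that can be shadowed cleanly makes progress visible as movement along the rate axis rather than as accumulated practice time.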

Mastery of a clip means clean overlap at native rate

"Mastery" in the shadowing literature usually means overlap-shadowing the full clip cleanly at native rate. Hamada's 2022 procedure ends each cycle with a recorded full-rate overlap attempt as the assessment.²⁴ When you can clear that bar, move up a tier.

Listening (vs. shadowing) at the next tier up

The separation between a passive-listening tier and an active-shadowing tier is conventional in the shadowing literature. Passive listening is set higher to expand exposure; shadowing is set lower to ensure articulatory precision.¹¹²³

Specific comprehension thresholds for tier promotion are pedagogical heuristics rather than measured results. A common framing is "moderate comprehension at the listen-only tier, near-complete comprehension at the shadow tier." Hamada and Kadota do not specify exact percentages.¹¹²³

The plateau and the variety-show test

The 9–10 morae per second "fast speech" band documented in Marushima corresponds to variety-show, comedy, and informal-monologue registers.¹⁴ Reliable comprehension at this tier is not a JLPT criterion. The JLPT tops out at the "natural-speed" register at N1, which sits comfortably below this ceiling.⁸⁹

The shadowing literature does not promise that any specific protocol bridges the gap from N1 audio to variety-show audio. Its standard claim is narrower: systematic rate-tier ladder work moves the learner up the rate axis, with the ceiling determined by sustained practice time and individual perceptual-motor adaptation.¹⁰²⁴²³

Good to know

Counting morae as if they were syllables

The most common counting mistake reduces 東京 to two morae by treating it as two syllables (とう・きょう). The true mora count is four: と・う・きょ・う. Long-vowel second halves, the moraic nasal ん, and the geminate っ are independently moraic but not independently syllabic, while small ゃ・ゅ・ょ belong to the preceding mora. If the convention is wrong, the rate measurement can change by as much as a factor of two.⁴³ Otake, Hatano, Cutler, and Mehler show that Japanese listeners themselves perceptually segment into morae rather than syllables, so the mora is the unit aligned to the native ear.⁵

東京（とうきょう）³
"Tokyo." (four morae: と・う・きょ・う)

Confusing "fastest language" with "highest information rate"

The inference "Pellegrino 2011 shows Japanese speakers transmit information faster than English speakers" reverses the actual finding. Pellegrino 2011 shows that Japanese speakers produce more syllables per second than English speakers, with each syllable carrying less information. The Japanese information rate is the lowest in the seven-language sample, while English is the highest: SR_JA = 7.84 syl/sec, ID_JA = 0.49, IR_JA = 0.74 versus SR_EN = 6.19, ID_EN = 0.91, IR_EN = 1.08, all relative to Vietnamese = 1.00.¹ The 2019 follow-up across seventeen languages finds information rates converging near ~39 bits per second, supporting the "thin packets, normal information" picture.¹⁵

Treating NHK news as a casual-speech reference

NHK reads at roughly 300 字/分 (~5–6 morae per second under the conversion), deliberately slow and clearly articulated, with phrase-boundary silent pauses.⁶⁷ Casual conversation in CSJ runs at 8.01 morae per second.² A learner who can shadow NHK news cleanly is at broadcast pace, not at casual-conversation pace. The gap is real, structural, and consistent with the JLPT's own framing of "natural speed" as a broadcast-pace register.⁸

"Japanese is the fastest language" comes in two strengths

The weak version (Japanese has the highest mean syllabic rate among the seven languages in the Pellegrino-Coupé-Marsico 2011 sample) is well-sourced.¹ The strong version (Japanese speakers literally talk faster than anyone else, anywhere) is not what the data says. The 2011 sample covered Mandarin, English, French, German, Italian, Spanish, and Japanese, with Vietnamese included only as a reference language. The 2019 follow-up kept those eight and added Basque, Cantonese, Catalan, Finnish, Hungarian, Korean, Serbian, Thai, and Turkish, and Japanese is no longer reported as the singular outlier at the top in that broader sample.¹⁵ Headlines like "Japanese is the world's fastest language" overstate the 2011 finding.

The phonological-loop metaphor for shadowing

Kadota frames shadowing as exercising the short-term auditory buffer the learner uses for parsing. Shadowing forces that buffer to operate at native rate while the learner produces output simultaneously.²³ "Shadowing is a phonological-loop workout" is academically grounded. It also gives the learner a single sentence to anchor the practice rationale, which is useful for sustaining the protocol through plateaus.

発話速度 is the technical term; 話す速さ is colloquial

Marushima and Maekawa use 発話速度 (hatsuwa sokudo) as the Japanese-linguistics term for speech rate. Measurements are conventionally reported in モーラ毎秒 (morae per second) or 拍/秒.²¹⁴ Japanese pedagogical writing for learners more often uses 話す速さ (hanasu hayasa). The variant 発話テンポ (hatsuwa tenpo) occurs in some prosody papers as a near-synonym. If a Japanese linguistics paper reports "X morae per second," the measurement almost always traces back to CSJ-derived methodology.

References

1. Pellegrino, François, Christophe Coupé, and Egidio Marsico. "A Cross-Language Perspective on Speech Information Rate." Language, vol. 87, no. 3, 2011, pp. 539–558. Open-access archive: https://gwern.net/doc/cs/algorithm/information/2011-pellegrino.pdf
2. Maekawa, Kikuo. "Corpus of Spontaneous Japanese: Its Design and Evaluation." Proceedings of the ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR 2003), Tokyo, 2003, pp. 7–12. ISCA Archive: https://www.isca-archive.org/sspr_2003/maekawa03_sspr.html
3. Kubozono, Haruo. "Mora and Syllable." In Natsuko Tsujimura (ed.), The Handbook of Japanese Linguistics, Blackwell, 1999, pp. 31–61. https://www.blackwellpublishing.com/content/BPL_Images/Content_Store/WWW_Content/9780631234944/002.pdf
4. Vance, Timothy J. The Sounds of Japanese. Cambridge University Press, 2008.
5. Otake, Takashi, Giyoo Hatano, Anne Cutler, and Jacques Mehler. "Mora or syllable? Speech segmentation in Japanese." Journal of Memory and Language, vol. 32, no. 2, 1993, pp. 258–278.
6. 矢野香. 『【NHK式＋心理学】一分で一生の信頼を勝ち取る法 ― NHK式7つのルール』. ダイヤモンド社, 2014.
7. 三本松シゲル. 「なぜＮＨＫのニュースのほうが聞きやすいのか」 Signia Blog, シバントス. https://www.signia.net/ja-jp/blog/local/ja-jp/reasons-why-nhk-news-is-easy-to-hear/ (limitation: industry summary citing [^14], used only to attribute the 300字/分 figure to NHK common practice)
8. 国際交流基金 / 日本国際教育支援協会. 「N1〜N5：認定の目安」 日本語能力試験 JLPT. https://www.jlpt.jp/about/levelsummary.html
9. 国際交流基金 / 日本国際教育支援協会. 『新しい「日本語能力試験」ガイドブック』. 2009. https://www.jlpt.jp/e/reference/pdf/guidebook1.pdf
10. Tamai, Ken'ichi (玉井健). "シャドーイングはリスニング能力を高めるか." Time Studies (時事英語学研究), no. 31, 1992. Cited in Hamada 2018.
11. Hamada, Yo. Teaching EFL Learners Shadowing for Listening: Developing Learners' Bottom-up Skills. Routledge, 2018.
12. Maekawa, Kikuo, Hanae Koiso, Sadaoki Furui, and Hitoshi Isahara. "Spontaneous Speech Corpus of Japanese." Proceedings of the Second International Conference on Language Resources and Evaluation (LREC 2000), Athens, 2000, pp. 947–952.
13. 国立国語研究所. 『日本語話し言葉コーパス』(Corpus of Spontaneous Japanese, CSJ). Database overview page. https://clrd.ninjal.ac.jp/csj/
14. 丸島歩. 「発話速度の認知に関する一考察 ― 基本周波数変動との関連性に注目して ―」 言語学論叢オンライン版 創刊号 (通巻 27 号), 筑波大学, 2008, pp. 70–84. http://www.lingua.tsukuba.ac.jp/ippan/TWPL0/TWPL01_27/7_marushima.pdf
15. Coupé, Christophe, Yoon Mi Oh, Dan Dediu, and François Pellegrino. "Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche." Science Advances, vol. 5, no. 9, 2019, eaaw2594. https://www.science.org/doi/10.1126/sciadv.aaw2594
16. Bloch, Bernard. "Studies in Colloquial Japanese IV: Phonemics." Language, vol. 26, no. 1, 1950, pp. 86–125.
17. Han, Mieko Shimizu. "The feature of duration in Japanese." Onsei no Kenkyū (音声の研究), vol. 10, 1962, pp. 65–80.
18. Pike, Kenneth L. The Intonation of American English. University of Michigan Press, 1945.
19. Warner, Natasha, and Takayuki Arai. "Japanese Mora-Timing: A Review." Phonetica, vol. 58, no. 1–2, 2001, pp. 1–25.
20. 前川喜久雄. 『日本語話し言葉コーパス』の概観 Version 2.0. 国立国語研究所. https://clrd.ninjal.ac.jp/csj/manu-f/overview.pdf
21. Iwamoto, Kanae, Hideaki Kikuchi, and Reiko Mazuka. "Speech rate development in Japanese-speaking children and proficiency in mora-timed rhythm." Journal of Experimental Child Psychology, vol. 220, 2022, 105411. https://pubmed.ncbi.nlm.nih.gov/35349950/
22. Uther, Maria, Monja A. Knoll, and Denis Burnham. "Do you speak E-NG-L-I-SH? A comparison of foreigner- and infant-directed speech." Speech Communication, vol. 49, no. 1, 2007, pp. 2–7.
23. Kadota, Shuhei (門田修平). Shadowing, Oral Reading, and Acquisition of English (『シャドーイング・音読と英語習得の科学』). Cosmopier Publishing, 2007.
24. Hamada, Yo. "Developing a New Shadowing Procedure for Japanese EFL Learners." RELC Journal, vol. 53, no. 1, 2022, pp. 137–151. https://journals.sagepub.com/doi/abs/10.1177/0033688220937628
