Why "Tokyo" Is Two Syllables in English and Four Morae in Japanese: Loanwords as a Timing Drill

Why Tokyo is four morae in Japanese but two syllables in English comes down to a unit mismatch. Japanese counts beats by mora, while English counts beats by syllable. When words travel between the two languages, they often arrive with a beat count an English ear has never heard.12 This article counts the morae in Tokyo and a handful of other words English speakers already say, then turns the gap into a daily timing drill.

Overview

The promise in one sentence

Words you already say in English carry a hidden Japanese mora count. Tokyo has four morae in Japanese (と・う・きょ・う) and two syllables in standard English; that four-versus-two gap is the worked headline of this article.23 Karaoke, Pokémon, judo, and other Japanese words that English has borrowed show related timing gaps.

Why this gap matters for a beginner

A mora is a timing unit, not a syllable. Japanese is mora-timed: each mora fills its own slot in time. English is stress-timed: unstressed material compresses. Treating Japanese words with English timing is the single most common source of foreign-accented pronunciation for English first-language learners. The loanwords you already know are the easiest place to feel the difference.14

Who this is for

This article is for an absolute beginner, pre-N5 through N5. You have met hiragana and heard the word mora, but you cannot yet hear long vowels or the moraic nasal ん in real time.5 Measured data backs the difficulty: elementary British learners of Japanese produce two-mora words at 86.5% of the duration of three-mora words, while native speakers produce them at 73.2%. In other words, English first-language learners compress the moraic gap.5 The gap is a pattern, not a stereotype.
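To see how far each group sits from strict mora timing, the two percentages can be set against a baseline. The sketch below assumes an idealized baseline in which a two-mora word lasts exactly 2/3 as long as a three-mora word. That baseline and the names `ISOCHRONY_BASELINE` and `excess_over_baseline` are illustrative assumptions, not figures or terms from the cited study.

```python
# Duration of two-mora words as a fraction of three-mora words.
# The two measured values are the figures quoted above; the baseline
# is an assumption: perfectly equal morae would give exactly 2/3.
ISOCHRONY_BASELINE = 2 / 3

measured = {
    "native speakers": 0.732,
    "elementary British learners": 0.865,
}

def excess_over_baseline(ratio: float) -> float:
    """How far a measured ratio sits above the idealized 2/3 baseline."""
    return ratio - ISOCHRONY_BASELINE

for group, ratio in measured.items():
    points = excess_over_baseline(ratio) * 100
    print(f"{group}: ratio {ratio:.3f}, {points:.1f} points above 2/3")
```

Under that assumption, the learners sit roughly three times as far from the baseline as the native speakers do, which is one way to read "compressing the moraic gap".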

What a mora is (one-minute version)

A mora is a sub-syllabic unit of timing. The duration of a Japanese utterance scales linearly with how many morae it contains; this is the foundational mora-timing finding.4 Mora isochrony, or near-equal mora timing, is approximate rather than strict. Even so, the trend is strong enough that Japanese speakers themselves count by mora when they compose haiku in 5-7-5.2

One kana, one beat, with three additions

In Japanese, each non-yōon kana is one mora.12 On top of that base rule, three "special morae" (特殊拍 tokushuhaku) each count as a full mora, even though they do not carry their own syllable vowel:1267

| Special mora | Symbol | Counts as | Example |
| --- | --- | --- | --- |
| Long vowel (chōon) | ー / repeated vowel kana | 1 mora | とう = と + う (2 morae); カー = カ + ー (2 morae) |
| Moraic nasal | ん / ン | 1 mora | ほん = ほ + ん (2 morae) |
| Geminate (sokuon) | っ / ッ | 1 mora | きって = き + っ + て (3 morae) |

Yōon (拗音), or contracted sounds, are the one place where two kana fuse into a single mora. An i-row kana (き, し, ち, and so on) plus small ゃ/ゅ/ょ, as in きゃ, しゅ, or きょ, is one mora, not two.78 In 東京, きょ is a single yōon mora.

One kana, one tap, unless the kana is small

Each full-size kana is one tap. Small ゃ/ゅ/ょ attach to the previous kana to make one yōon mora. Small っ is its own silent tap. Chōon (the う in とう, the second い in いい, the ー in カー) is its own tap.79
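The tap rule is mechanical enough to write down as code. The following is a minimal sketch, not a full kana tokenizer: the function name `split_morae` is made up for illustration, and the sketch handles only small ゃ/ゅ/ょ (and their katakana forms). It does not handle the small ァィゥェォ found in newer loanword spellings such as ファ or ティ.

```python
# Small kana that fuse with the preceding kana to form one yōon mora.
SMALL_YOON = set("ゃゅょャュョ")

def split_morae(kana: str) -> list[str]:
    """Split a kana string into morae using the tap rule above."""
    morae: list[str] = []
    for ch in kana:
        if ch in SMALL_YOON and morae:
            morae[-1] += ch    # きょ, しゅ: two kana, one mora
        else:
            morae.append(ch)   # full-size kana, っ, ん, and ー: one mora each
    return morae

for word in ["とうきょう", "きって", "ほん", "カー"]:
    beats = split_morae(word)
    print(word, "・".join(beats), len(beats))
```

Run on the four sample words, the sketch gives 4, 3, 2, and 2 morae, matching the counts in the table of special morae.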

At normal speech rate, a Japanese long vowel measures roughly 138 ms and a short vowel roughly 86 ms. That is a long-to-short ratio near 1.6.10 Absolute values shift with speaking speed, but the ratio holds.

Why English speakers undercount

English is stress-timed: spacing falls between stressed syllables, and unstressed material compresses to fit. Japanese is mora-timed: each mora gets its own slot. An English ear trained to compress unstressed material treats long vowels, ん, and っ as "nothing" because English has no equivalent contrast.14

Measured data confirms the bias: British learners whose first language is English have smaller relative spacing between two-mora and three-mora words than native speakers do. That is exactly what syllable-counting carry-over predicts.5

Counting Tokyo: the worked example

English "Tokyo" as two syllables

The standard English pronunciation /ˈtoʊ.kjoʊ/ (commonly respelled TOH-kyoh) has two syllables for most speakers. Some speakers break /kjoʊ/ into /ki.oʊ/, which gives three syllables (/ˈtoʊ.ki.oʊ/).11 The two-syllable form is the usual one.

The written form Tokyo, the passport spelling, strips the macrons that Hepburn writes over both long vowels.1213 The spelling hides the two beats that the Japanese form carries. An English reader has no spelling cue telling them where those long vowels are.

Japanese 東京 (とうきょう) as four morae

The kana breakdown of 東京 is と・う・きょ・う, four morae. きょ is one yōon mora because the small ょ fuses with き. Each う is a separate chōon mora lengthening the preceding /o/.378

東京(とうきょう)3
"Tokyo."

A learner-friendly way to see the mismatch is to lay the two counts side by side.

English: TOH · kyoh (2 syllables)
Japanese: と · う · きょ · う (4 morae)

The comparison shows the article's main point in miniature: same word, two units, two different counts.

Standard Tokyo Japanese gives 東京 a heiban (type-0) pitch contour: flat, with no drop. The first mora is low, and the remaining morae plus any following particle are high.14153 Pitch is a separate layer from mora timing. The four equal-time beats sit underneath whichever contour the dialect assigns.
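The heiban rule as stated here (first mora low, everything after it high, including a following particle) can be sketched in a few lines. The two-level L/H labels follow the simplified description in the paragraph above, and the function name `heiban_contour` is an illustrative choice, not a term from the accent dictionaries cited.

```python
def heiban_contour(morae: list[str], particle: str = "") -> list[tuple[str, str]]:
    """Pair each mora with L or H for a heiban (type-0) word."""
    units = list(morae) + ([particle] if particle else [])
    # First mora low; every later unit, including the particle, stays high.
    return [(unit, "L" if i == 0 else "H") for i, unit in enumerate(units)]

for unit, level in heiban_contour(["と", "う", "きょ", "う"], particle="が"):
    print(unit, level)
```

The mora list is passed in already split, so the pitch labels stay a separate layer from the beat count, as the paragraph describes.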

東京駅(とうきょうえき)で会(あ)いましょう。14
"Let's meet at Tokyo Station."

東京(とうきょう)は日本(にほん)の首都(しゅと)です。14
"Tokyo is the capital of Japan."

Why the English form lost two beats

There are three romanizations of the same word. Each one tells the learner something different about the morae:1213

| Notation | Spelling | What it shows |
| --- | --- | --- |
| Modified Hepburn | Tōkyō | Both long vowels marked with macrons |
| Wāpuro | Toukyou | Both long vowels written out as extra vowel letters |
| Passport / road sign | Tokyo | Both long vowels invisible |

All three encode the same とうきょう. None of them is wrong. The bare Tokyo spelling is the only one that hides the morae. It is also the spelling almost every English speaker meets first.
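The relationship between the three spellings can be sketched as two small string transformations. This is an illustration under a loud assumption: it treats every "ou", "oo", and "uu" in the wāpuro input as a long vowel, which holds for the words in this article but not for Japanese in general (a vowel sequence can also span a morpheme boundary). The function names are hypothetical.

```python
import unicodedata

# Wāpuro vowel pairs that Modified Hepburn writes with a macron.
# Assumption: each pair in the input really is a long vowel.
LONG_VOWELS = {"ou": "ō", "oo": "ō", "uu": "ū",
               "Ou": "Ō", "Oo": "Ō", "Uu": "Ū"}

def wapuro_to_hepburn(word: str) -> str:
    """Replace doubled-vowel spellings with macron vowels."""
    for pair, macron in LONG_VOWELS.items():
        word = word.replace(pair, macron)
    return word

def strip_macrons(word: str) -> str:
    """Drop macrons, as the passport and road-sign spelling does."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch != "\u0304")  # combining macron
    return unicodedata.normalize("NFC", kept)

for wapuro in ["Toukyou", "juudou", "Oosaka"]:
    hepburn = wapuro_to_hepburn(wapuro)
    print(wapuro, hepburn, strip_macrons(hepburn))
```

Each step discards information: the macron form no longer shows which kana spelled the long vowel, and the stripped form no longer shows that a long vowel was there at all.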

Macron-stripped spellings are not incorrect English

ALA-LC romanization and standard English usage drop macrons for naturalized words and famous place names. Tokyo, judo, and sumo are the conventional English forms. Tōkyō, jūdō, and sumō are the formally correct Modified Hepburn variants. Both spellings coexist for different audiences.1213

More worked loanwords

Karaoke: 4 English syllables vs 4 Japanese morae

カラオケ has four morae: カ・ラ・オ・ケ. There are no long vowels, no ん, and no っ, so the beat count survives the trip into English; the mismatch lies in vowel quality and stress timing rather than in the count.16 Standard Tokyo pitch is heiban (type 0).1416

カラオケに行(い)きませんか。14
"Want to go to karaoke?"

The word is wasei-eigo, or Japanese-made English: a Japanese compound of kara 空 ("empty") + oke (clipped from オーケストラ, "orchestra"). It was coined in 1970s Japan and borrowed back into English from around 1979.1617 The common English pronunciation /ˌkæriˈoʊki/ turns /ka/ + /ra/ into "carry" and puts the main stress on the third syllable. The result still has four syllables (ka-ri-OH-ki), but the stressed syllable stretches and the unstressed ones compress, so the four Japanese morae no longer get equal time.11

Pokémon: 3 English syllables vs 4 Japanese morae

ポケモン has four morae: ポ・ケ・モ・ン. The final ん is its own mora: a full beat, not a coda riding on /mo/.1812

ポケモンが好(す)きです。14
"I like Pokémon."

The etymology is wasei-eigo again: a clipped compound of ポケット (poketto, "pocket") + モンスター (monsutā, "monster"). The franchise debuted in Japan in 1996 and in English in 1998.18 English /ˈpoʊ.keɪ.mɒn/ assigns three syllables (POH-kay-mon). ん is absorbed into the final English syllable instead of getting its own beat.11

ん is not the English "n" stuck onto the previous vowel

The moraic nasal ん has several allophones (uvular, alveolar, bilabial, velar) depending on what follows it. Before a pause, as in ポケモン at the end of a sentence, ん is typically realized with a uvular or velar closure made toward the back of the mouth, not the alveolar [n] English speakers default to. It is also a full mora with its own beat in time.1

Judo: 2 English syllables vs 4 Japanese morae

じゅうどう (柔道) has four morae: じゅ・う・ど・う. じゅ is one yōon mora; each う is a chōon mora lengthening the preceding vowel.12717 The English spelling judo collapses both long vowels and renders the word as two syllables.

柔道(じゅうどう)を習(なら)っています。14
"I'm learning judo."

The etymology is 柔 ("soft, gentle") + 道 ("way"). The word naturalized into English without macrons by the early twentieth century.17 The same shape recurs across Japanese vocabulary: a two-kanji compound where every kanji carries a long-vowel mora. English routinely flattens it.

Bonus passes: sake, sumo, anime, Honshu, Shinjuku

Five more words show where the mora count matches and where it does not:

| Word | Kana | Morae | English syllables | What English loses |
| --- | --- | --- | --- | --- |
| 酒 sake | さ・け | 2 | 2 ("sake") | Final vowel quality (/e/ to /i/), not mora count1417 |
| 相撲 sumō | す・も・う | 3 | 2 ("sumo") | The chōon う lengthening /o/12 |
| アニメ anime | ア・ニ・メ | 3 | 3 ("anime") | Vowel qualities only; mora count matches19 |
| 本州 Honshū | ほ・ん・しゅ・う | 4 | 2 ("Honshu") | Both ん and the chōon う127 |
| 新宿 Shinjuku | し・ん・じゅ・く | 4 | 3 ("Shinjuku") | The ん rides on /ʃɪn/127 |

本州(ほんしゅう)は日本(にほん)の島(しま)です。14
"Honshu is an island of Japan."

The pattern across the table is the same one Tokyo, Pokémon, and judo show. When the Japanese form carries a chōon, an ん, or both, the English form drops the special-mora beats and keeps only the obvious vowel syllables.
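The gap for each of the five bonus words is a simple subtraction. The sketch below copies the mora and syllable counts from the table and reports how many beats the English form is missing; the names `WORDS` and `dropped_beats` are illustrative only.

```python
# (word, Japanese morae, English syllables), copied from the table above.
WORDS = [
    ("sake", 2, 2),
    ("sumo", 3, 2),
    ("anime", 3, 3),
    ("Honshu", 4, 2),
    ("Shinjuku", 4, 3),
]

def dropped_beats(morae: int, syllables: int) -> int:
    """Beats present in the Japanese form but absent from the English one."""
    return morae - syllables

for word, morae, syllables in WORDS:
    gap = dropped_beats(morae, syllables)
    note = "count matches" if gap == 0 else f"{gap} beat(s) missing in English"
    print(f"{word:<9} {morae} morae, {syllables} syllables: {note}")
```

Words with a gap of zero (sake, anime) differ from the Japanese form only in vowel quality; the rest are candidates for the timing drills.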

How to retrain the timing

Mora timing is slow to install because the English stress-timed pattern is overlearned. The shadowing literature reports measurable pronunciation gains within 2–4 weeks of daily 10–20 minute practice. Even advanced English first-language learners narrow the gap with native durations but do not fully close it.520 The three drills below come from established pronunciation-training practice and move from the most mechanical to the most generative.

Drill 1: count beats by tapping

Tap one finger per mora while reading a printed kana word at a steady pace, around 100–150 ms per tap. Equal time per tap matters more than the exact tempo. Tanaka and Kubozono's pronunciation teaching text recommends explicit mora counting and beat tapping as the foundational step for learners whose first language is not mora-timed.4109

For とうきょう, that means four taps in equal time. If two of the taps want to merge, that is the English habit speaking. Start the word over.
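A tap schedule makes "equal time" concrete. The following sketch lists when each tap should land for a word whose morae have already been split by hand. The 150 ms default is the upper end of the range mentioned in this drill; the function name `tap_schedule` is an assumption for illustration.

```python
def tap_schedule(morae: list[str], ms_per_tap: int = 150) -> list[tuple[int, str]]:
    """Return (onset in ms, mora) pairs with equal spacing between taps."""
    return [(i * ms_per_tap, mora) for i, mora in enumerate(morae)]

for onset, mora in tap_schedule(["と", "う", "きょ", "う"]):
    print(f"{onset:>4} ms  {mora}")
```

Slowing the drill down only changes `ms_per_tap`; the spacing between taps stays equal, which is the part that matters.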

Drill 2: minimal pairs that hurt to miss

A small set of minimal pairs brings the contrast into focus because the wrong duration produces the wrong word.

| Short form | Long form | What changes |
| --- | --- | --- |
| おばさん "aunt" | おばあさん "grandmother" | One chōon mora (3 vs 4 morae)10 |
| おじさん "uncle" | おじいさん "grandfather" | One chōon mora (3 vs 4 morae)9 |
| ここ "here" | こうこう "high school" | Two chōon morae (2 vs 4 morae)9 |
| きて "come (te-form)" | きって "stamp" | One sokuon mora (2 vs 3 morae)17 |

Record yourself reading each pair. Compare it against a native recording, and keep going until the duration ratio between the two columns matches.
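One way to make "the duration ratio matches" checkable is to divide the short form's duration by the long form's and compare your ratio with the native one. The sketch below uses hypothetical millisecond values, not measurements from any study, and the 0.05 tolerance is an arbitrary assumption. How you obtain the durations (for example, from an audio editor) is outside its scope.

```python
def duration_ratio(short_ms: float, long_ms: float) -> float:
    """Short-form duration as a fraction of long-form duration."""
    return short_ms / long_ms

def matches_native(learner_ratio: float, native_ratio: float,
                   tolerance: float = 0.05) -> bool:
    """True when the learner's ratio is within the tolerance of the native one."""
    return abs(learner_ratio - native_ratio) <= tolerance

# Hypothetical durations for おばさん vs おばあさん, in milliseconds.
native = duration_ratio(short_ms=450, long_ms=600)
learner = duration_ratio(short_ms=520, long_ms=580)

print(f"native {native:.2f}, learner {learner:.2f}, "
      f"match: {matches_native(learner, native)}")
```

In this made-up example the learner's two words are too close in length, so the check fails and the pair goes back on the practice list.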

Drill 3: re-romanize your loanword list

Take every English word you know that comes from Japanese, write its kana spelling beside it, and re-romanize the kana into wāpuro form: Tokyo → Toukyou, judo → juudou, Honshu → Honshuu, sumo → sumou.1213 If the wāpuro form has more letters than the passport form, the extra letters mark hidden morae you have been dropping. Add each such word to the daily tap-and-shadow list until the longer form feels natural.
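The letter-count comparison in this drill is easy to automate once the two spellings are written side by side. This is a minimal sketch using the four pairs named above. The helper name `hidden_morae` is hypothetical, and the letter difference only catches long vowels, since ん and っ already show up as letters in both spellings.

```python
# Passport spelling paired with its wāpuro re-romanization.
LOANWORDS = {
    "Tokyo": "Toukyou",
    "judo": "juudou",
    "Honshu": "Honshuu",
    "sumo": "sumou",
}

def hidden_morae(passport: str, wapuro: str) -> int:
    """Extra letters in the wāpuro form, read as dropped long-vowel morae."""
    return len(wapuro) - len(passport)

# Words whose wāpuro form is longer go onto the daily drill list.
drill_list = [word for word, wapuro in LOANWORDS.items()
              if hidden_morae(word, wapuro) > 0]

for word in drill_list:
    print(word, LOANWORDS[word], hidden_morae(word, LOANWORDS[word]))
```

A word such as anime, whose re-romanization is the same length as its English spelling, would score zero and stay off the list.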

How long before this sticks

Two to four weeks of daily 10–20 minute drilling brings noticeable timing change on familiar words. Unfamiliar vocabulary takes longer because the first-language stress-timed pattern reasserts itself whenever the brain is busy retrieving meaning.520 The goal is not to count beats forever. It is to build enough timing reflex that the count happens without thought.

Good to know

Tokyo, Tōkyō, and Toukyou are the same word

The three forms are three notations for one Japanese form. Modified Hepburn writes the macron (Tōkyō). Wāpuro romanization spells out each kana, so each long vowel shows up as an extra letter (Toukyou). The road-sign and passport variant strips both diacritics and the doubled vowel (Tokyo).

All three encode the same とうきょう, and none of them is wrong in its own context.1213 Only the bare Tokyo hides the morae from an English reader.

English stress is not Japanese timing

The English form Tokyo takes lexical stress on the first syllable (TOH-kyoh). The Japanese form 東京 gives each mora equal time and does not stack English-style stress.1415 Importing the English stress pattern produces a foreign-accented Japanese pronunciation. The first step is to give every mora its own slot, with no compression on the unstressed beats.

The silent っ is also a mora

The sokuon (っ) is a hold mora: a full mora-length silence or closure that English routinely ignores. kite (きて, 2 morae, "come") and kitte (きって, 3 morae, "stamp") are different words. The second mora in kitte is the silent っ. Learners undertime it because English has no comparable contrast.17 The correct form holds the closure on /t/ for one mora before release:

切手(きって)を買(か)いました。14
"I bought a stamp."

Some "Japanese" English words hide morae

Tycoon (1857 in English) is from 大君 taikun ("great lord"), a title used for the Tokugawa shogun in foreign correspondence. たいくん has 4 morae (た・い・く・ん); English folds them into 2 syllables, with い absorbed into a diphthong and ん into the syllable coda.2117

Honcho (1947, U.S. military English) is from 班長 hanchō ("squad leader"). はんちょう has 4 morae (は・ん・ちょ・う). English collapses it to 2 syllables, dropping both the moraic ん and the chōon う.2217

Umami was coined in 1908 by chemist Ikeda Kikunae at Tokyo Imperial University. うまみ has 3 morae. English borrowed it with the mora count preserved, then flattened the vowel qualities in actual pronunciation.23

Kyoto and Osaka hide long vowels too

Kyoto (京都, Kyōto) has 3 morae (きょ・う・と) but only 2 English syllables. The chōon is dropped. Osaka (大阪, Ōsaka) has 4 morae (お・お・さ・か) but only 3 English syllables. The initial chōon is dropped.1213 The pattern is identical to Tokyo: the passport spelling strips the long-vowel marker, and the English pronunciation drops the mora.

Tokyo means "eastern capital" and dates from 1868

東 (east) + 京 (capital). The name marks the 1868 imperial proclamation that renamed Edo and moved the capital from Kyōto to the new eastern capital.324 The four-mora pronunciation has been stable since then.

References

Footnotes

  1. Vance, Timothy J. The Sounds of Japanese. Cambridge University Press, 2008.

  2. Kubozono, Haruo. "Mora and Syllable." The Handbook of Japanese Linguistics, edited by Natsuko Tsujimura, Blackwell, 1999, pp. 31–61. https://onlinelibrary.wiley.com/doi/10.1002/9781405166225.ch2

  3. Wiktionary, s.v. "東京." https://en.wiktionary.org/wiki/%E6%9D%B1%E4%BA%AC

  4. Port, Robert F., Jonathan Dalby, and Michael O'Dell. "Evidence for mora timing in Japanese." Journal of the Acoustical Society of America, vol. 81, no. 5, 1987, pp. 1574–1585.

  5. Nagai, Katsumi. "Mora Timing by British Learners of Japanese." Kagawa University. https://www.ed.kagawa-u.ac.jp/~nagai/papers/kn5/kn5.htm

  6. Kubozono, Haruo, editor. Handbook of Japanese Phonetics and Phonology. De Gruyter Mouton, 2015. https://archive.org/details/handbooks-of-japanese-language-and-linguistics-hjll

  7. 大辞林, 第三版. 三省堂. (entries: 拗音, 撥音, 促音, 長音)

  8. Tsurutani, Chiharu. "Acquisition of yō-on (Japanese contracted sounds) in L1 and L2 phonology." Second Language Research, vol. 23, no. 4, 2007, pp. 397–415.

  9. Tanaka, Shin'ichi, and Haruo Kubozono. Introduction to Japanese Pronunciation: Theory and Practice. Kurosio Publishers, 1999.

  10. Hirata, Yukari. "Effects of speaking rate on the vowel length distinction in Japanese." Journal of Phonetics, vol. 32, no. 4, 2004, pp. 565–589.

  11. "Words of Japanese origin." Oxford English Dictionary blog. https://www.oed.com/discover/words-of-japanese-origin/

  12. Hepburn, James Curtis. A Japanese and English Dictionary; with an English and Japanese Index, 3rd edition. Z.P. Maruya & Co., 1886. (Modified Hepburn as adopted in Kenkyūsha's New Japanese-English Dictionary, 3rd ed., 1954.)

  13. Library of Congress. ALA-LC Romanization Tables: Japanese. https://www.loc.gov/catdir/cpso/romanization/japanese.pdf

  14. NHK 放送文化研究所, editor. 『NHK 日本語発音アクセント新辞典』. NHK出版, 2016.

  15. 東京大学大学院工学系研究科 峯松・齋藤研究室. Online Japanese Accent Dictionary (OJAD). https://www.gavo.t.u-tokyo.ac.jp/ojad/eng/pages/home

  16. Wiktionary, s.v. "カラオケ." https://en.wiktionary.org/wiki/%E3%82%AB%E3%83%A9%E3%82%AA%E3%82%B1

  17. Online Etymology Dictionary, s.v. "tycoon," "honcho," "judo," "Pokemon." https://www.etymonline.com

  18. Wiktionary, s.v. "ポケモン." https://en.wiktionary.org/wiki/%E3%83%9D%E3%82%B1%E3%83%A2%E3%83%B3

  19. Wiktionary, s.v. "アニメ." https://en.wiktionary.org/wiki/%E3%82%A2%E3%83%8B%E3%83%A1

  20. Tsutsui Billins, Marie. Japanese Pronunciation Practice Through Shadowing. Nipponrama, 2020.

  21. Oxford English Dictionary, "tycoon, n." Oxford University Press. https://www.oed.com

  22. Oxford English Dictionary, "honcho, n." Oxford University Press. https://www.oed.com

  23. Oxford English Dictionary, "umami, n." Oxford University Press. https://www.oed.com

  24. National Diet Library Japan. 国立国会図書館. Reference for Edo→Tokyo renaming, 1868 imperial proclamation. https://www.ndl.go.jp