Long vs. Short Vowels in Japanese: The Distinction Beginners Miss

Japanese long vs short vowels are a phonemic contrast. Holding a vowel for one extra beat selects a different dictionary word, not a more emphatic version of the same word.12 For an N5 learner who has already met おばさん and おばあさん in a textbook list, that one extra mora is the difference between calling someone an aunt and calling them a grandmother.34

Overview

Japanese has five vowel phonemes, /a, i, u, e, o/. Every one contrasts short against long.12 The long member is not a stretched-out short. It is a separate phoneme that occupies two morae, the rhythmic units that drive Japanese timing.2

In connected speech, a long vowel lasts roughly 2.5 to 3 times as long as its short partner. That ratio is preserved when speakers talk faster or slower.5 The contrast survives speech-rate changes because both members scale together.

Four moving parts to keep separate

Vowel length involves four separate pieces: (a) the phoneme (the sound), (b) the mora count (the timing), (c) the orthography (which kana spell the long form: おう vs おお, ええ vs えい, the katakana chōonpu ー), and (d) perception (whether an English-trained ear catches the duration). This article covers the phoneme, the mora count, and perception. Orthography is covered in a sibling article.

Vowel length is phonemic: changing length changes the word

What "phonemic" means in plain terms

A contrast is phonemic when swapping one feature for another changes which word is being said. Japanese vowel length passes that test. The consonants are identical, the vowel quality is identical, only the duration differs, and the dictionary entry changes.12

Attested minimal pairs, word pairs that differ in only one sound feature, span all five vowels: obasan (aunt) / obāsan (grandmother),34 hiru (leech) / hīru (heel), kegen (dubious) / keigen (reduction), tokai (city) / tōkai (destruction), and ku (district) / kū (void).2

おばさんは元気(げんき)です。4
"(My) aunt is well."

おばあさんは元気(げんき)です。3
"(My) grandmother is well."

The two sentences differ by a single mora. The dictionary entries they point at are two generations apart.34

おじさんはアメリカじんです。6
"(My) uncle is American."

おじいさんはアメリカじんです。7
"(My) grandfather is American."

Infants acquiring Japanese begin to notice vowel length between 4 and 9.5 months. By around 18 months, they treat duration as a phonological cue, not just a raw acoustic one.8 That developmental pattern is what a phonemic, language-specific contrast looks like, as opposed to a universal acoustic difference any baby could hear.

Why this is not "emphasis" or "drawing out a word"

English duration is a prosodic resource: a longer vowel signals stress, emphasis, or affect ("I'm sooo tired"), not a different word. Japanese duration is lexical: longer means a different word.18

Japanese still has expressive lengthening for affect, written in informal kana with the chōonpu ー. But it is optional and sits above the word's basic sound pattern, so the underlying word is unchanged.1 The contrast below is the same adjective in two emotional registers.

さむい!9
"(It's) cold!"

さむーい!1
"(It's) cooold!"

Lexical length is not the same channel as expressive length

The さむーい above stretches the vowel of む for emphasis and does not change the word. The おばさん / おばあさん contrast above works differently: the long vowel is part of the word's stored form, and shortening it picks a different lexical entry. English speakers tend to assume any vowel stretch is the first kind. In Japanese, both exist, and they use separate channels.

How to count: a long vowel is two morae

One kana, one beat: counting おばさん vs おばあさん

Japanese is mora-timed: the mora is the rhythmic unit. Kana spelling is a near-perfect mora alphabet, where each full-size kana corresponds to one mora.10 A mora is a beat of timing, not a syllable.

Counting kana gives the mora count. お-ば-さ-ん is 4 morae. お-ば-あ-さ-ん is 5 morae.102 The extra あ is the second mora of the long vowel /aː/. It is not a silent or decorative letter; it pays for itself in timing.
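The counting rule can be sketched in a few lines of Python. This is a minimal illustration, not a full tokenizer: it assumes the input is written entirely in kana, and the names `SMALL_COMBINING`, `split_morae`, and `count_morae` are hypothetical helpers introduced here. Small kana such as ょ attach to the kana before them, while っ, ん, and the chōonpu ー each count as a mora of their own.

```python
# Small kana that merge with the preceding kana into a single mora.
# Assumption: this set covers the common combinations, not every rare glyph.
SMALL_COMBINING = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def split_morae(word: str) -> list[str]:
    """Split a kana-only word into morae.

    Full-size kana, small っ/ッ, ん/ン, and the chōonpu ー each start a new
    mora; a small combining kana is appended to the mora before it.
    """
    morae: list[str] = []
    for ch in word:
        if ch in SMALL_COMBINING and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae


def count_morae(word: str) -> int:
    """Number of morae in a kana-only word."""
    return len(split_morae(word))


for w in ["おばさん", "おばあさん", "とうきょう", "きって", "さむーい"]:
    print(w, "-".join(split_morae(w)), count_morae(w))
```

Run on the words above, the sketch splits おばさん into 4 morae and おばあさん into 5, and it keeps きょ together as one mora in とうきょう. Kanji input is outside its scope: each kanji would be miscounted as a single mora.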

これはおばさんです。4
"This is (my) aunt." (4 morae: お-ば-さ-ん)

これはおばあさんです。3
"This is (my) grandmother." (5 morae: お-ば-あ-さ-ん)

Tap one finger per kana

A physical mora tap, one finger on the table for each kana, including the long-vowel second mora, makes the count external. It lets the 4-beat versus 5-beat difference register in the hand before it has to register in the ear.510 The technique uses the same mora-isochrony principle that defines mora-timed languages.

Elementary learners whose first language is British English systematically under-produce the durational difference. Experimental data put them at an 86.5% short-to-long ratio against native speakers' roughly 75%. In other words, learners shrink the long form toward the short one even when they think they have produced the contrast.10

The five long-vowel slots: aa, ii, uu, ee, oo

Each of the five vowels has a long counterpart that contributes a second mora. The table below maps each vowel to a typical hiragana spelling and, where one exists at this level, a minimal-pair anchor.

| Long vowel | Hiragana realization (typical) | N5/N4 example | Mora count | Short partner | Source |
|---|---|---|---|---|---|
| /aː/ | あ + あ | おばあさん "grandmother" | 5 | おばさん (4) | 3, 4 |
| /iː/ | い + い | おじいさん "grandfather" | 5 | おじさん (4) | 7, 6 |
| /uː/ | う + う | くうき "air" | 3 | くき "stem" (2) | 11, 12 |
| /eː/ | え + い (most) / え + え (some native) | せんせい "teacher" | 4 | (no minimal pair at N5) | 1 |
| /oː/ | お + う (most) / お + お (some native) | こうこう "high school" | 4 | ここ "here" (2) | 1, 13, 14 |

For /eː/ and /oː/, the most common spelling uses い and う as the second mora (せんせい, こうこう, とうきょう). A closed list of native words instead uses ええ and おお (おねえさん, とおい, おおきい).1 The spelling difference does not change the sound; both spell a single long vowel.

Orthography is a separate problem

This article handles the sound and the count. Spelling follows orthographic rules covered in dedicated articles: see Long Vowels in Hiragana for おう vs おお and ええ vs えい, and Long Vowels in Katakana for the chōonpu ー.1 You can recognize a long vowel by ear and by mora count without yet knowing why /koː/ is written こう in one word and こお in another.

The most-confused pairs beginners must hear

Family-vocab pairs: おばさん / おばあさん, おじさん / おじいさん

The two kinship pairs are the standard textbook introduction to phonemic vowel length in English-language Japanese pedagogy.9 They are core N5 vocabulary,3476 keep every other segment identical, and carry a meaning gap (aunt vs grandmother, uncle vs grandfather) that students immediately recognize as socially costly to miss.

A single missed mora moves the person you are referring to by two kinship generations: short = parent's sibling, long = grandparent.3476

おばさんとおばあさんはちがいます。43
"Aunt and grandmother are different (people)."

おじさんとおじいさんはちがいます。67
"Uncle and grandfather are different (people)."

Near-identical content words: ゆき / ゆうき, くき / くうき, ここ / こうこう

Beyond kinship terms, length-only contrasts appear across the common vocabulary. Each entry below is a real dictionary headword, not a contrived example: ゆき "snow" (N5) versus ゆうき "courage" (N3),1516 くき "stalk" (N1) versus くうき "air" (N4),1112 and ここ "here" (N5) versus こうこう "high school" (N4).1314 The first two pairs differ by a single kana. ここ / こうこう lengthens both vowels, so the count goes from 2 morae to 4.

ゆきがふっています。15
"It is snowing."

勇気(ゆうき)があります。16
"(He/She) has courage."

ここでまってください。13
"Please wait here."

高校(こうこう)で日本語(にほんご)をならいました。14
"I learned Japanese in high school."

Loanword and proper-noun traps: 時計 vs 統計, 東京 in romaji

時計 (とけい, "clock") and 統計 (とうけい, "statistics") are a real-word minimal pair: the consonants are the same, and only the first vowel's length differs.1718 とけい is N5 vocabulary; とうけい is N2 vocabulary.

時計(とけい)は十時(じゅうじ)です。17
"The clock says ten o'clock."

統計(とうけい)を勉強(べんきょう)します。18
"(I) study statistics."

東京 (とうきょう, "Tokyo") has four morae (と-う-きょ-う), not the two ("toh-kyo") that the unmarked English romaji "Tokyo" suggests.19 Its standard spoken form is とうきょう /toːkjoː/, with a long /oː/ in both the first and second halves.

Unmarked English-language romaji, the form most beginners meet first, systematically drops macrons from Japanese place names and borrowed words ("Tokyo," "Osaka," "Kyushu," "sumo"). That hides the long vowel and builds the mistake into English speakers' mental representations of the words.1 Modified Hepburn with macrons (Tōkyō, Ōsaka, Kyūshū, sumō) preserves the contrast.1

東京(とうきょう)にすんでいます。19
"(I) live in Tokyo." (4 morae: と-う-きょ-う)

Why English speakers often fail to hear it

English uses duration as a stress cue, not a phoneme

In English, vowel duration varies with stress and with tense/lax vowel quality, but it does not by itself distinguish phonemes.18 English minimal pairs are built on vowel quality (bit / beat, full / fool), not on duration alone.

Because English listeners' phonological grammar treats duration as prosody, they tend to hear a Japanese long vowel as a stressed version of the same word. On first exposure, they miss the lexical contrast.810 The mistake is structural, not laziness. The L1 filter is doing its assigned job and routing duration to the wrong category.

Mora-timing reorders that priority

Japanese rhythm is mora-timed: word duration scales with mora count, not with syllable count.510 A long-vowel mora is rhythmically equivalent to any other mora. That is why a 5-mora word like おばあさん is reliably longer than a 4-mora word like おばさん in connected speech.

The long-vowel mora's extra beat is the same kind of timed unit as the geminate-consonant mora (the silent gap in きって) and the mora-N (the ん in しんぶん). All three are special morae that a mora-timed grammar counts.10

What native-Japanese infants do that adult learners must redo

Infants acquiring Japanese are sensitive to the short/long vowel distinction by about 9.5 months. They treat duration as a phonological cue by about 18 months; the contrast is present in infant-directed speech with duration as the consistent cue.8

Adult L2 learners arrive with their L1 phonological filter already in place. Explicit minimal-pair training is what re-categorizes duration from prosodic to lexical in the learner's mental phonology.10 Without that retraining, English-L1 learners under-produce long vowels in measurable ways: the elementary British learners described above showed an 86.5% short-to-long ratio against native speakers' roughly 75%.10

How to train your ear and mouth

Tap the morae while you say the word

The mora tap from the counting section doubles as a production drill. Tapping one finger per kana while saying the word aloud keeps 4-mora おばさん and 5-mora おばあさん apart in the hand before the ear can reliably separate them.510

Drill with minimal pairs, not isolated words

Contrast pairs (obasan / obāsan, yuki / yūki, koko / kōkō) force the brain to treat duration as the only variable.10 Drilling one word in isolation does not train discrimination, because nothing in the input forces the learner's category boundary to sharpen.

L2 learners who improve on phonemic length contrasts do so after targeted exposure that pairs perception and production with feedback. This is the protocol implicit in the pedagogy literature on Japanese phonemic length acquisition.10
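One way to assemble such a drill list from a vocabulary file is sketched below. It assumes the words are in Modified Hepburn with macrons, and it only detects length marked by a macron (spellings such as keigen or ojiisan are not handled). The helper names `strip_length` and `length_pairs` are hypothetical.

```python
import unicodedata
from collections import defaultdict

MACRON = "\u0304"  # combining macron, the length mark on ā ī ū ē ō


def strip_length(romaji: str) -> str:
    """Return the word with vowel-length macrons removed (ō becomes o)."""
    decomposed = unicodedata.normalize("NFD", romaji)
    return unicodedata.normalize("NFC", decomposed.replace(MACRON, ""))


def length_pairs(words: list[str]) -> list[tuple[str, ...]]:
    """Group words that become identical once vowel length is ignored."""
    groups: dict[str, list[str]] = defaultdict(list)
    for w in words:
        groups[strip_length(w)].append(w)
    # Keep only groups with more than one member: those are drill candidates.
    return [tuple(g) for g in groups.values() if len(g) > 1]


vocab = ["obasan", "obāsan", "yuki", "yūki", "koko", "kōkō", "sensei"]
print(length_pairs(vocab))
```

The same `strip_length` step is what unmarked English romaji does implicitly: once the macrons are gone, obāsan and obasan collapse into one spelling, which is why the drill needs the marked forms.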

Record-and-compare with a native reference

Comparing your own recording to a native model helps you notice the durational gap your L1 filter is quietly closing.10 The cleanest objective check is the 4-mora versus 5-mora total word duration: if the recorded "obāsan" is not noticeably longer than "obasan," the long vowel is being collapsed.
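The duration check can be made numeric with a rough sketch like the one below. It rests on two assumptions that are not from the sources: that each recording is a WAV file trimmed to just the word, and that an idealized target is the plain mora-count ratio (4 / 5 = 0.8 for obasan / obāsan), which real speech only approximates. The function names `wav_duration` and `length_check` are hypothetical.

```python
import wave


def wav_duration(path: str) -> float:
    """Length of a WAV file in seconds (frame count divided by sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def length_check(short_s: float, long_s: float,
                 short_morae: int = 4, long_morae: int = 5) -> dict[str, float]:
    """Compare a measured short/long duration ratio with the mora-count ratio."""
    if short_s <= 0 or long_s <= 0:
        raise ValueError("durations must be positive")
    return {
        "measured": short_s / long_s,
        # Assumption: idealized equal-length morae, not a measured native norm.
        "target": short_morae / long_morae,
    }


# Durations in seconds; in practice they would come from wav_duration(...).
result = length_check(short_s=0.62, long_s=0.66)
print(f"measured {result['measured']:.2f} vs target {result['target']:.2f}")
```

A measured ratio close to 1.0 means the two recordings are nearly the same length, which is the collapsed long vowel described above. A native reference recording run through the same function gives a more realistic target than the idealized 0.8.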

Good to know

Long vowels are not "stressed" in Japanese

Adding a pitch peak or extra loudness to the long mora is an English-L1 reflex that does not exist in standard Japanese. Vowel length and pitch accent are independent contrasts.1 A long vowel may or may not carry the accent kernel, depending on the word's pitch-accent class, and there is no consistent rule that long vowels attract stress.

Long vowels and geminate consonants both add a special mora

The long-vowel mora (the second half of おばあさん's /aː/) and the geminate-consonant mora (the silent gap in きって) are both special morae. Each carries a beat without contributing a full CV syllable.10 They sit in different slots in the mora template (vocalic vs consonantal) but follow the same mora-timing principle. The same family includes the mora-N (the ん in しんぶん), the third special mora.

Devoicing of a short vowel is not the same as a long vowel

High vowels /i/ and /u/ between voiceless consonants, or word-finally after a voiceless consonant, are routinely devoiced in standard Tokyo Japanese (です sounds like des, すき sounds like ski).1 Devoicing reduces a short vowel's audibility. It never turns a long vowel /uː/ or /iː/ into a short one, because the second mora is still timed even when the vowel is whispered. "The う is silent" (devoicing of a short vowel) and "the う lengthens the previous vowel" (chōon) are two different stories that share a kana.

Romaji traps: Tokyo, Osaka, sumo

Standard English spelling of Japanese names and loanwords systematically omits the long-vowel signal: Tokyo for Tōkyō,19 Osaka for Ōsaka, sumo for sumō, judo for jūdō.1 This is a conventional simplification rather than a transliteration error, but it conceals the lexical fact from English-speaking learners. The resulting two-syllable English pronunciation ("toh-kyo") encodes the wrong mora count.

References

Footnotes

  1. Vance, Timothy J. The Sounds of Japanese. Cambridge University Press, 2008. Chapter 2 ("Phonemics") and Chapter 3 ("Vowels"), which cover short vowels, long vowels, vowel sequences, and the phonemic-contrast / minimal-pair framework. ISBN 9780521617543.

  2. "Japanese phonology." Wikipedia. Quotes the long-vs-short minimal pairs obasan / obāsan, kegen / keigen, hiru / hīru, tokai / tōkai, ku / kū and the formulation "All vowels display a length contrast: short vowels are phonemically distinct from long vowels." https://en.wikipedia.org/wiki/Japanese_phonology (limitation: tertiary; used only to triangulate pair lists also attested in Vance 2008, footnote 1)

  3. Jisho.org entry for おばあさん (お祖母さん / お婆さん). Reading おばあさん, meaning "grandmother; old woman," JLPT N5, common word. https://jisho.org/search/%E3%81%8A%E3%81%B0%E3%81%82%E3%81%95%E3%82%93

  4. Jisho.org entry for おばさん (伯母さん / 叔母さん / 小母さん). Reading おばさん, meaning "aunt; middle-aged woman," JLPT N5, common word. https://jisho.org/search/%E3%81%8A%E3%81%B0%E3%81%95%E3%82%93

  5. Hirata, Yukari. "Effects of speaking rate on the vowel length distinction in Japanese." Journal of Phonetics 32(4), 565–589 (2004). https://www.sciencedirect.com/science/article/abs/pii/S0095447004000282

  6. Jisho.org entry for おじさん (伯父さん / 叔父さん / 小父さん). Reading おじさん, meaning "uncle; middle-aged man," common word. https://jisho.org/search/%E3%81%8A%E3%81%98%E3%81%95%E3%82%93

  7. Jisho.org entry for おじいさん (お祖父さん / お爺さん). Reading おじいさん, meaning "grandfather; old man," common word. https://jisho.org/search/%E3%81%8A%E3%81%98%E3%81%84%E3%81%95%E3%82%93

  8. Bion, Ricardo A. H.; Miyazawa, Kouki; Kikuchi, Hideaki; Mazuka, Reiko. "Learning Phonemic Vowel Length from Naturalistic Recordings of Japanese Infant-Directed Speech." PLoS ONE 8(2): e51594. doi:10.1371/journal.pone.0051594. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0051594

  9. Banno, Eri; Ikeda, Yoko; Ohno, Yutaka; Shinagawa, Chikako; Tokashiki, Kyoko. Genki: An Integrated Course in Elementary Japanese, 3rd ed. Vol. 1. The Japan Times, 2020. Greetings and Lesson 1 vocabulary lists, where おばあさん, おじいさん, おばさん, おじさん, and 時計 first appear; pronunciation appendix on long vowels.

  10. Nagai, Katsumi. "Mora Timing by British Learners of Japanese." Osaka University Graduate School of Language and Culture (Kagawa University mirror). https://www.ed.kagawa-u.ac.jp/~nagai/papers/kn5/kn5.htm

  11. Jisho.org entry for 茎 (くき). Reading くき, meaning "stalk; stem," JLPT N1, common word. https://jisho.org/word/%E8%8C%8E

  12. Jisho.org entry for 空気 (くうき). Reading くうき, meaning "air; atmosphere," JLPT N4, common word. https://jisho.org/word/%E7%A9%BA%E6%B0%97

  13. Jisho.org entry for ここ (此処). Reading ここ, meaning "here; this place," JLPT N5, common word. https://jisho.org/search/%E3%81%93%E3%81%93

  14. Jisho.org entry for 高校 (こうこう). Reading こうこう, meaning "senior high school," JLPT N4, common word. https://jisho.org/word/%E9%AB%98%E6%A0%A1

  15. Jisho.org entry for 雪 (ゆき). Reading ゆき, meaning "snow," JLPT N5, common word. https://jisho.org/word/%E9%9B%AA

  16. Jisho.org entry for 勇気 (ゆうき). Reading ゆうき, meaning "courage; bravery," JLPT N3, common word. https://jisho.org/word/%E5%8B%87%E6%B0%97

  17. Jisho.org entry for 時計 (とけい). Reading とけい, meaning "clock; watch; timepiece," JLPT N5, common word. https://jisho.org/word/%E6%99%82%E8%A8%88

  18. Jisho.org entry for 統計 (とうけい). Reading とうけい, meaning "statistics," JLPT N2, common word. https://jisho.org/word/%E7%B5%B1%E8%A8%88

  19. Jisho.org entry for 東京 (とうきょう). Reading とうきょう, meaning "Tokyo (capital city)," common word. https://jisho.org/word/%E6%9D%B1%E4%BA%AC