The Japanese Vowel Inventory: Five Vowels, Done Right
Japanese has a five-phoneme vowel system: /a i ɯ e o/. Each vowel is a pure monophthong, and each has a long counterpart that can change word meaning.123 This page places every vowel on the IPA chart, explains why う is compressed rather than rounded, drills length with attested minimal pairs, and previews vowel devoicing. The goal is to give the rest of the phonology curriculum a clean foundation.
Overview
What "five vowels" means and what it does not
Standard (Tokyo) Japanese contrasts five vowel phonemes: /a, i, ɯ, e, o/.123 A phoneme is an abstract sound category, not a single audible sound. Each of the five has both a short and a long form, and this length contrast is phonemic: it can distinguish words. Each vowel also has slightly different surface realizations depending on the surrounding consonants, including the devoicing of /i/ and /ɯ/ between voiceless consonants.14
"Five vowels" is therefore a statement about the inventory of contrasts, not about every sound a speaker may produce. English contrasts roughly thirteen to twenty distinct vowel qualities, depending on dialect and how diphthongs are counted. Japanese contrasts five, doubled by length.45
Spanish also has a five-vowel system /i, u, e, o, a/ of pure vowels. That is why the "Japanese vowels are like Spanish" shortcut circulates so widely.6 It breaks down on /ɯ/ (Spanish /u/ is rounded with lip protrusion; Tokyo /ɯ/ is unrounded or compressed without protrusion) and on the lowered surface targets of /e/ and /o/.13786
The five-vowel inventory describes the Standard (Tokyo) variety. Some regional varieties differ in degree of rounding on /ɯ/; Kansai /u/ is more rounded than Tokyo /ɯ/.78
How this article sits next to the consonant inventory and the mora article
This page is the vowel half of the two-part phoneme inventory. Its companion is the Japanese consonant inventory. It also fills in the V slot, or vowel slot, of the four mora types defined under mora versus syllable. That article provides the timing framework these vowels sit inside.
JLPT and learner relevance
The vowel inventory is not a graded JLPT topic. The official JLPT level summary describes the test as covering reading and listening, with no phonology or pronunciation component enumerated at any level.9
The five vowels and their length contrasts appear in every Japanese word. That makes accurate articulation on this small set one of the highest-leverage corrections an English-trained learner can make.410 Two related topics have their own dedicated pages, so this article only cross-references them: long-vowel orthography (the おう versus おお hiragana convention and the chōonpu ー in katakana), and vowel devoicing.
The five vowels in IPA
The chart: place and height for /a i ɯ e o/
Standard Tokyo Japanese has five vowel phonemes, conventionally written /a, i, ɯ, e, o/ in IPA broad transcription.123 The narrow phonetic realizations heard in Tokyo are slightly different from those ideal chart targets. /a/ surfaces as central [ä], /i/ as a tense [i] (sometimes transcribed [ɪ̟] to mark a touch of laxing), /ɯ/ as unrounded or compressed near-back [ɯ̟] or [ɯ̹̽], and /e/ and /o/ as lowered close-mid [e̞] and [o̞].13
| Phoneme (broad) | Narrow phonetic | Tongue height | Frontness | Lip shape |
|---|---|---|---|---|
| /a/ | [ä] | Open | Central | Unrounded |
| /i/ | [i] | Close | Front | Unrounded, lightly spread |
| /ɯ/ | [ɯ̟] or [ɯ̹̽] | Close (near-close) | Central-back | Unrounded or compressed; flat lips |
| /e/ | [e̞] | Lowered close-mid | Front | Unrounded |
| /o/ | [o̞] | Lowered close-mid | Back | Lightly rounded, no protrusion |
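The table can also be held as a small lookup for turning a broad transcription into the narrow Tokyo targets. The following is a minimal sketch, not a full transcriber: the name `to_narrow` is hypothetical, the choice of [ɯ̟] over [ɯ̹̽] for /ɯ/ is arbitrary, and consonant allophony is ignored entirely.

```python
# Broad-to-narrow vowel lookup copied from the table above (Tokyo targets).
# Assumption: one character per broad vowel symbol; consonants pass through unchanged.
NARROW = {
    "a": "ä",
    "i": "i",
    "ɯ": "ɯ̟",
    "e": "e̞",
    "o": "o̞",
}

def to_narrow(broad: str) -> str:
    """Swap each broad vowel symbol for its narrow realization."""
    return "".join(NARROW.get(ch, ch) for ch in broad)

print(to_narrow("eki"))  # 駅
print(to_narrow("ɯmi"))  # 海
```

Length marks and consonants are left untouched, so the lookup only ever changes vowel quality symbols.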
The same data sits naturally on an IPA vowel trapezoid.
References disagree on whether to write the high back vowel as /u/ or /ɯ/ at the broad phonemic level. Help:IPA/Japanese uses /ɯ/ phonemically. The main English Wikipedia article on Japanese phonology uses /u/ in some sections and [ɯ] in narrow transcription. Japanese Wikipedia uses /u/.123 This article writes /ɯ/ to keep the broad symbol consistent with the narrow realization, with the rounding caveat spelled out under that vowel.
/a/: a flat central low vowel
/a/ surfaces as [ä] in Tokyo Japanese: central rather than fully back, and not fully open.13 To produce it, open the mouth in a neutral position with the tongue low and central. Add no glide on the way in or out.4
The closest English approximation is the vowel in British "father" (long [ɑː] flattened toward central), or Spanish a. It is not the [æ] of American "cat".2
雨1
"rain"
朝4
"morning"
山4
"mountain"
/i/: a tense close front vowel, no laxing
/i/ surfaces as [i], close and front, with the lips slightly spread.13 The same target serves the "ee" in English "see". In Japanese, hold it steady without the diphthongal off-glide some English speakers add.4
Japanese /i/ is never the lax [ɪ] of English "sit". The first vowel of 聞く (きく, "to listen") is a tense [i], not [ɪ].11
行く4
"to go"
いい4
"good"
聞く11
"to listen, to hear"
/ɯ/: compressed, not rounded, and not English "oo"
Help:IPA/Japanese is explicit on the lip shape: "in Tokyo dialect, [/u/] is either unrounded or compressed."2 In the compressed realization, the side portions of the lips come into contact, but there is no salient protrusion. That lack of forward lip movement separates Tokyo う from the strong forward push of English [u] in "boot" or Spanish [u] in "tu".12786
Ultrasound work by Nogita, Yamane and Bird found that, in six of seven native speakers, the tongue position of Standard Japanese /ɯ/ was closer to the front vowel /e/ than to the back vowel /o/. All seven actively rounded their lips, and at least four showed clear lip protrusion. This led the authors to argue that the more accurate phonemic IPA symbol is /ʉ/, a rounded central vowel.7 The takeaway for a learner is the lip shape and the central tongue position, not which symbol wins.
In front of a mirror, say English "oo" and watch your lips push forward into a tight circle. Then say Japanese う while keeping the corners of your mouth pulled gently in toward each other, with the lips relaxed and almost flat. No visible protrusion means a compressed う. Visible protrusion means English [u] is leaking in.78
Phonology textbooks have used /u/ and /ɯ/ interchangeably for Tokyo Japanese for decades. /ɯ/ is predominant in modern references.12 Recent acoustic and ultrasound work argues that the more accurate narrow symbol is [ɯ̹̽] (compressed, slightly fronted) or [ʉ̜] / [ʉ – ʏ] (weakly rounded, central to front).78 Japanese Wikipedia uses [ɯ̹̽] in narrow transcription.3
Regional caveat: Kansai speakers produce /u/ with stronger lip-rounding than Tokyo speakers, so the compressed-/ɯ/ description is specifically a Tokyo-standard claim.78
海1
"sea"
歌4
"song"
うなぎ2
"eel"
/e/: a lowered close-mid front vowel
/e/ surfaces as [e̞] in Tokyo Japanese: midway between cardinal close-mid [e] and open-mid [ɛ].13 To produce it, use a mid-front tongue with the lips relaxed and slightly spread. Aim a little more open than cardinal [e], so the target lands between the English "ay" in "say" (without the off-glide) and the "e" in "set".4
Japanese /e/ is steady from start to finish. え in 駅 (えき, "(train) station") never has the [eɪ]-style off-glide of English "ache".12
駅12
"(train) station"
絵13
"picture, drawing"
今4
"now"
/o/: a lowered close-mid back vowel, lightly rounded
/o/ surfaces as [o̞] in Tokyo Japanese: midway between cardinal close-mid [o] and open-mid [ɔ].13 The lips are lightly rounded, but the rounding is much weaker than English [oʊ] in "go" and uses lip compression rather than protrusion.1378 Aim between English "oh" (held steady, with no off-glide) and the vowel of "saw" in some American dialects.4
The anti-diphthongization warning matters most here. お in お父さん (おとうさん, "father") is not "oh-w". It stays on [o̞] for its full mora, then transitions to the next vowel as a separate timing slot. The おう sequence in this word is /o/ plus a length mark, not a glide.414
お父さん14
"father, dad"
男4
"man"
おに2
"demon, ogre"
Why /e/ and /o/ are written as lowered close-mid
Older or simpler reference inventories write /e/ and /o/ as cardinal close-mid [e] and [o].1516 Modern acoustic descriptions and the Japanese Wikipedia treatment place the Tokyo targets slightly lower, which is why this article uses the [e̞] and [o̞] narrow notation. Japanese Wikipedia explicitly writes "半狭母音 [e] と半広母音 [ɛ] の中間音 [e̞]" ("[e̞], a sound midway between close-mid [e] and open-mid [ɛ]") for /e/ and "半狭母音 [o] と半広母音 [ɔ] の中間音 [o̞͑]" ("[o̞͑], a sound midway between close-mid [o] and open-mid [ɔ]") for /o/.13
For learners, the takeaway is simple: aim a little more open than cardinal [e] and [o]. Avoid the closed, tense [i]-like or [u]-like targets that learners sometimes drift toward when they read "e" or "o" through English orthographic habits.4
Vowels are pure: the anti-diphthong rule
Monophthongs vs diphthongs in one paragraph
A monophthong is a vowel articulated with a stable, unchanging quality throughout its duration.17 A diphthong is a vowel that glides from one quality to another within the same syllable, as in English "say" [seɪ] or "no" [noʊ].18
Every Japanese vowel phoneme is a monophthong: the same articulatory target is held for the duration of the mora.134 This has an orthographic consequence. Kana spelling can use one symbol per vowel because each vowel is one steady quality. It is also why two written vowels in a row (おう, えい, あい) are two separate morae rather than a glide on a single nucleus.14
Where English speakers leak a glide
Four leak sites recur in English-trained pronunciation. The correction is the same in each case: hold the target quality steady for the full mora, then count morae deliberately when two vowels meet.4
- /e/ said as English "ay" with an [ɪ] off-glide (English [eɪ] for Japanese steady [e̞]).4
- /o/ said as English "oh" with a [ʊ] off-glide (English [oʊ] for Japanese steady [o̞]).4
- /a/ said as English "I" (the [aɪ] diphthong) when the kana is just あ.4
- Two-vowel kana sequences like あい read as a single [aɪ] glide instead of two morae of [ä.i].14
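The leak sites listed above all show up in a transcription as one of three English glide sequences, so a learner's transcribed attempt can be scanned for them mechanically. This is a rough sketch under that assumption: `find_glide_leaks` is a hypothetical helper, and it only sees glides that were actually written into the IPA string.

```python
# English-style glides named in the list above.
GLIDES = ("eɪ", "oʊ", "aɪ")

def find_glide_leaks(transcription: str) -> list[tuple[int, str]]:
    """Return (position, glide) for every English-style glide in the string."""
    hits = []
    for glide in GLIDES:
        start = transcription.find(glide)
        while start != -1:
            hits.append((start, glide))
            start = transcription.find(glide, start + 1)
    return sorted(hits)

print(find_glide_leaks("eɪki"))      # えき said with an English "ay"
print(find_glide_leaks("oʊtoʊsaɴ"))  # おとうさん said with English "oh" glides
```

An empty result means no glide was transcribed. It does not show that the vowels were held steady for the full mora.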
Vowel sequences are not diphthongs
あい in 愛 (ai) is two morae, not [aɪ]. The same pattern holds for おお in 通る (とおる, where the spelling おお marks a long /o/), えい in せい, and あい / おい sequences generally.14202119 Mora counting and pure-vowel articulation reinforce each other: pronouncing each vowel as a steady monophthong forces the timing into discrete morae.45
遠い21
"far, distant"
駅員12
"station attendant"
The spelling conventions for long vowels live in the dedicated articles on long vowels in hiragana (the おう versus おお split) and long vowels in katakana (the chōonpu ー), not on this page.
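The claim that two written vowels are two morae can be checked by counting kana. The sketch below assumes the input is already written in pure kana (no kanji, no romaji) and uses a hypothetical helper name, `count_morae`. It counts every kana as one mora except the small kana that attach to a preceding kana.

```python
# Small kana that attach to the preceding kana and add no mora of their own.
# Assumption: input is pure hiragana or katakana.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")

def count_morae(kana: str) -> int:
    """Count one mora per kana, skipping small glide and vowel kana."""
    return sum(1 for ch in kana if ch not in SMALL_KANA)

for word in ["あい", "とおい", "えきいん", "とる", "とおる"]:
    print(word, count_morae(word))
```

Because っ, ん, and ー are not in the skipped set, each of them counts as a full mora.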
Vowel length is phonemic
Short vowel = one mora, long vowel = two morae
Every Japanese vowel has a short form (one mora) and a long form (two morae). A long vowel is the same vowel quality held for an extra timing slot, not a different vowel.134 Wikipedia's Japanese-phonology article puts it directly: "All vowels display a length contrast: short vowels are phonemically distinct from long vowels."1
In IPA, the long form is written with the length mark ː. Tokyo Japanese has /aː, iː, ɯː, eː, oː/ alongside the five short vowels, for ten contrastive vowel units (five times two).1215 How those long vowels are written in kana is the job of the writing-system articles, not this one.
Minimal pairs you can drill
The contrast is lexical: mishearing length changes the word, not just the prosody or rhythm. "Uncle" and "grandfather", or "snow" and "courage", are not paraphrases of each other.1422232425
| Vowel | Short | Long | What changes |
|---|---|---|---|
| /a/ vs /aː/ | おばさん obasan "aunt" | おばあさん obāsan "grandmother" | /a/ of ば held for two morae12627 |
| /i/ vs /iː/ | おじさん ojisan "uncle" | おじいさん ojīsan "grandfather" | /i/ of じ held for two morae2225 |
| /ɯ/ vs /ɯː/ | ゆき yuki "snow" | ゆうき yūki "courage" (勇気) | /ɯ/ of ゆ held for two morae2324 |
| /e/ vs /eː/ | え e "picture" (絵) | ええ ē "yes" (interjection) | /e/ held for two morae2813 |
| /o/ vs /oː/ | とる toru "to take" (取る) | とおる tōru "to pass through" (通る) | /o/ held for two morae2029 |
Wikipedia's own narrow-transcription examples reinforce the contrast: [obasaɴ] vs [obaːsaɴ], [kɯ] vs [kɯː], and [tokai] vs [toːkai].1
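A pair counts as a length minimal pair when the two transcriptions become identical once the length mark is removed. The sketch below tests exactly that on the three transcription pairs just quoted. The function name `differs_only_in_length` is a hypothetical label, and the check assumes length is always written with ː rather than with a doubled vowel.

```python
LENGTH_MARK = "ː"

def differs_only_in_length(a: str, b: str) -> bool:
    """True when a and b are different words that match once ː is stripped."""
    return a != b and a.replace(LENGTH_MARK, "") == b.replace(LENGTH_MARK, "")

pairs = [("obasaɴ", "obaːsaɴ"), ("kɯ", "kɯː"), ("tokai", "toːkai")]
for short, long_form in pairs:
    print(short, long_form, differs_only_in_length(short, long_form))
```

Vowel quality is never inspected, which mirrors the point of the table: the only thing separating each pair is one extra timing slot.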
勇気23
"courage"
通る20
"to pass through"
Why English ears miss the contrast
English uses vowel length non-contrastively: length alone usually does not distinguish words. The vowels in "bit" and "beat" differ in quality, not just length. "Bid" sounds longer than "bit" because of the voiced coda rather than because length itself carries meaning.4 Learners trained on English often give a long Japanese vowel some quality change, a tenser or laxer color, instead of just holding it longer.4
The corrective is to keep the vowel quality fixed and stretch the duration to a full second mora. This connects directly to the mora-counting procedure introduced under mora versus syllable.
Vowel devoicing in one paragraph (preview)
When /i/ and /ɯ/ go silent between voiceless consonants
In Standard (Tokyo) Japanese, the close vowels /i/ and /ɯ/ are devoiced (the vocal folds stop vibrating) when they sit between two voiceless consonants, or between a voiceless consonant and a pause. The mora is still there; only the voicing of the vowel disappears.124 This is why です sounds like "des", ました sounds like "mash-ta", and 好き sounds like "ski".3031 The detailed rule, the consecutive-devoicing conflict, regional variation in Tohoku and Kansai, and speaker-to-speaker variability all live in the dedicated article on vowel devoicing. This page only flags the phenomenon so a reader who has heard "des" knows where it comes from.
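The one-paragraph rule can be sketched as a scan over phoneme segments. This is an illustration of the environment only, not of the full rule: the voiceless-consonant set is a simplified assumption, "pause" is approximated as the end of the word, and `devoicing_candidates` is a hypothetical helper that takes a pre-segmented list.

```python
CLOSE_VOWELS = {"i", "ɯ"}
# Assumed, simplified set of voiceless consonant segments.
VOICELESS = {"p", "t", "k", "s", "ɕ", "h", "ç", "ɸ", "ts", "tɕ"}

def devoicing_candidates(segments: list[str]) -> list[int]:
    """Indices of /i/ or /ɯ/ flanked by voiceless consonants, or word-final after one."""
    hits = []
    for i, seg in enumerate(segments):
        if seg not in CLOSE_VOWELS:
            continue
        if i == 0 or segments[i - 1] not in VOICELESS:
            continue
        at_pause = i == len(segments) - 1
        if at_pause or segments[i + 1] in VOICELESS:
            hits.append(i)
    return hits

print(devoicing_candidates(["d", "e", "s", "ɯ"]))            # です
print(devoicing_candidates(["m", "a", "ɕ", "i", "t", "a"]))  # ました
print(devoicing_candidates(["s", "ɯ", "k", "i"]))            # 好き
```

For 好き the sketch flags both close vowels, whereas the "ski" pronunciation described above devoices only the first. Resolving cases like that is part of the fuller rule covered in the dedicated article.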
好き30
"(I) like (it)"
Good to know
The Spanish-vowels comparison is half right
A common first-week framing tells learners that "Japanese vowels are just like Spanish vowels." Japanese and Spanish do share a five-vowel inventory of pure (monophthongal) vowels, but the comparison breaks down on /ɯ/ and on the lowered surface targets of /e/ and /o/.13786 Spanish /u/ is a fully rounded back vowel with lip protrusion. Tokyo /ɯ/ is unrounded or compressed without protrusion. Tokyo /e/ and /o/ are lowered to [e̞] and [o̞], slightly more open than the cardinal close-mid targets some descriptions of Spanish use.2786 Borrow Spanish only as a memory hook for "the vowels are pure" and stop there.
Diphthongizing /e/ as English [eɪ]
The wrong form pronounces えき as [eɪki], adding the English "ay" off-glide on え. The right form holds a steady [e̞] for the full mora and then moves to き. English speakers often add a high-front off-glide to mid-front vowels, but the Japanese target is a held monophthong.4
駅12
"(train) station"
Diphthongizing /o/ as English [oʊ]
The wrong form pronounces おとうさん as [oʊtoʊsaɴ], stacking English "oh-w" glides on each お and on the おう sequence. The right form is [o̞to̞ːsaɴ]: a steady short [o̞] on the first お and a steady long [o̞ː] across おう. The おう sequence in this word spells a long /oː/, not a glide. The rule is monophthongal quality with length on the timing tier.1414
お父さん14
"father, dad"
The /ɯ/ symbol controversy is a research-literature issue
Textbooks use /u/ and /ɯ/ interchangeably for Tokyo Japanese, with /ɯ/ predominant in modern references. Recent acoustic and ultrasound work argues for /ɯ̹̽/, /ʉ̜/, or even /ʉ – ʏ/ to mark compression and slight fronting.12378 A learner does not need to pick a side. They need the lip shape, which all of these symbols describe. This article uses /ɯ/ for consistency with the companion piece on the Japanese consonant inventory.
Native term: 母音 (boin)
The Japanese linguistic term for "vowel" is 母音 (boin), literally "mother sound". It pairs with 子音 (shiin, "child sound") for "consonant".3233 Both terms are standard in Japanese-language pedagogy and academic phonology, and they will appear in any Japanese-side reference a learner consults.33233 The "mother / child" pairing is a calque of the Chinese phonological tradition (母 mǔ "mother" and 子 zǐ "child"). The metaphor is that vowels are the syllable nuclei that consonants attach to.3233
Why the inventory feels small
Five vowels sits at the low end of the typical range: most languages have between five and seven vowel qualities, and English has between thirteen and twenty depending on dialect and on how diphthongs are counted.45 The "cost" of fewer quality contrasts is offset by phonemic length doubling the inventory (five short plus five long equals ten contrastive vowel units), and by mora timing forcing every vowel to land cleanly on a discrete beat.145 The system is not "simpler". It is differently distributed, with less quality differentiation and more length and timing differentiation.
See also
- Why "Tokyo" Is Two Syllables in English and Four Morae in Japanese: Loanwords as a Timing Drill
- Japanese Pronunciation Drills: A Daily 5-Minute Protocol with Minimal Pairs, Shadowing, and Record-and-Compare
- Difficult Japanese Sounds by Native Language: An L1-by-L1 Pronunciation Guide
- Stress vs. Pitch: Does Japanese Have Stress?
- The Mora-N (ん) and Its Four Allophones