Japanese Vowel Devoicing: Why です Sounds Like "Des"
Japanese vowel devoicing is a regular pronunciation rule in standard Tokyo speech. The high vowels /i/ and /u/ lose their vocal-fold vibration when squeezed between voiceless consonants, or when they come after one at the end of a phrase. That is why です surfaces as something an English ear hears as "des".1 The rule appears from your first lesson because it affects です, ます, した, すき, and しつれい. Whether you apply it is the difference between textbook-clean Tokyo speech and shadowing that sounds slightly off.
Acoustic studies of Tokyo Japanese show that the tongue and lips still execute the vowel gesture when /u/ is devoiced; only the vocal folds fail to vibrate. The mora still occupies one beat of timing, so です is two morae, never one.2
Overview
What "devoicing" actually means
A devoiced vowel is one where the articulators (tongue, lips, jaw) form the vowel exactly as for the voiced version, but the vocal folds do not vibrate during it.3 The mouth still makes the vowel; the voice simply does not turn on for that mora.
The phonetics literature is precise about this: "sounds that should be voiced change to voiceless," with the oral gesture left intact.3 Listeners still recover the vowel identity because traces of the oral gesture survive in the burst and frication noise of the surrounding consonants.2
The IPA marks a devoiced vowel with the under-ring diacritic U+0325: [i̥] for devoiced /i/, [ɯ̥] (or [u̥]) for devoiced /u/. In standard Tokyo speech, です is therefore transcribed as [desɯ̥], not [des].45
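As a small illustration of the notation, the under-ring is a combining character, so a devoiced vowel in a transcription is just the base vowel symbol followed by U+0325. The sketch below assumes Python's standard `unicodedata` module; the helper name `mark_devoiced` is hypothetical and chosen only for this example.

```python
import unicodedata

RING_BELOW = "\u0325"  # combining under-ring, used in IPA for "voiceless"


def mark_devoiced(vowel: str) -> str:
    """Return the vowel symbol with the under-ring diacritic attached."""
    return vowel + RING_BELOW


# Build the transcription of です from its parts.
desu = "des" + mark_devoiced("\u026f")  # \u026f is the IPA symbol ɯ

print(desu)
print(unicodedata.name(RING_BELOW))      # COMBINING RING BELOW
print(len(mark_devoiced("\u026f")))      # 2: base symbol plus diacritic
```

Because the diacritic is a separate code point, a search for the plain vowel symbol will still match inside a devoiced transcription, which is worth keeping in mind when processing transcribed text.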
Because the articulation is preserved, devoicing is not deletion. The mora still counts as one beat for mora-timing.67
Which vowels devoice (only い and う)
Devoicing in standard Japanese is restricted to the two high (close) vowels /i/ and /u/.46 The low and mid vowels /a/, /e/, /o/ do not devoice under the same conditions in standard Tokyo Japanese.43
The phonetic reason is aerodynamic. High vowels are naturally short and made with a narrow mouth opening. In a /CVC/ sequence flanked by two voiceless consonants, the sound is already close to a fully voiceless string, so the vocal-fold vibration that would carry the vowel simply fails to engage.67
Where it sits on your JLPT timeline
Devoicing is a day-one phenomenon for any learner using NHK-aligned audio or a standard textbook. The copula です, the polite verb ending ます, and high-frequency words like した (past of する), すき (好き, "like"), and しつれい (失礼, "excuse me") all surface with a devoiced vowel in their canonical Tokyo pronunciation.89
The JLPT does not test devoicing as a separate item at any level. It still matters in practice: listening sections at every level use Tokyo-standard speech, where devoicing is the default, and N5–N3 shadowing practice is judged against that standard.8
The devoicing rule
The rule has two triggering environments, a fixed list of consonants that allow it, and a short practical lookup of morae that covers almost every case a learner meets.
The two triggering environments
In standard Tokyo Japanese, a high vowel /i/ or /u/ devoices in exactly two contexts.46
- Between two voiceless consonants. /i/ or /u/ is devoiced when preceded by a voiceless consonant and followed by another voiceless consonant within the same word.910
- After a voiceless consonant, before a pause. /i/ or /u/ is also devoiced when it follows a voiceless consonant and stands at the end of an utterance or at a phrase boundary with a pause.910
Tanner, Sonderegger and Torreira summarise both as one rule: "high vowels /i/ and /u/ are near-obligatorily devoiced between two voiceless consonants or following a voiceless consonant pre-pausally" in Tokyo Japanese.1
The voiceless consonants that trigger it
The voiceless consonants of Japanese that allow devoicing are /p t k s ɕ ts tɕ ɸ h ç/.47 In kana-row terms, these are カ行 (k), サ行 (s, including シ [ɕ]), タ行 (t, including チ [tɕ] and ツ [ts]), ハ行 (h, including ヒ [ç] and フ [ɸ]), and パ行 (p).1110
| IPA | Row | Kana in the row |
|---|---|---|
| /p/ | パ行 | パ ピ プ ペ ポ |
| /t ts tɕ/ | タ行 | タ チ ツ テ ト |
| /k/ | カ行 | カ キ ク ケ コ |
| /s ɕ/ | サ行 | サ シ ス セ ソ |
| /h ç ɸ/ | ハ行 | ハ ヒ フ ヘ ホ |
Any consonant outside these five rows is voiced (ガ行, ザ行, ダ行, バ行, plus the sonorants of ナ行, マ行, ラ行, ヤ行, ワ行), and a high vowel adjacent to those will not devoice.47
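The two environments and the consonant inventory above can be sketched as a small function. This is a minimal illustration, not a full phonological model: it assumes the phrase has already been split into phoneme symbols, that the phrase is followed by a pause, and it ignores tempo, emphasis, and any interaction between neighbouring devoiced morae. The names `VOICELESS`, `HIGH_VOWELS`, and `devoiced_positions` are hypothetical.

```python
# Voiceless consonants listed in the article; "u" is accepted as an
# alternative spelling of the high back vowel.
VOICELESS = {"p", "t", "k", "s", "ɕ", "ts", "tɕ", "ɸ", "h", "ç"}
HIGH_VOWELS = {"i", "u", "ɯ"}


def devoiced_positions(segments: list[str]) -> list[int]:
    """Return indices of high vowels that sit in a devoicing environment.

    `segments` is one phrase split into phonemes and is assumed to be
    followed by a pause.
    """
    hits = []
    for i, seg in enumerate(segments):
        if seg not in HIGH_VOWELS or i == 0:
            continue
        if segments[i - 1] not in VOICELESS:
            continue  # preceding consonant must be voiceless
        is_last = i == len(segments) - 1
        if is_last or segments[i + 1] in VOICELESS:
            hits.append(i)  # pre-pausal, or flanked by voiceless consonants
    return hits


print(devoiced_positions(["a", "ɕ", "i", "t", "a"]))       # [2]
print(devoiced_positions(["d", "e", "s", "ɯ"]))             # [3]
print(devoiced_positions(["k", "ɯ", "s", "ɯ", "ɾ", "i"]))   # [1]
```

The function only reports where the environment is met. Whether a speaker actually devoices in that position still depends on the factors the rest of this article describes.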
The "devoiceable morae" shortcut
A practical shortcut is to memorise the small set of morae that have the right ingredients: a voiceless consonant plus a high vowel. Those morae are キ ク シ ス チ ツ ヒ フ ピ プ, along with the small-kana combinations シュ キュ チュ ヒュ ピュ (yōon) and フィ.
Any mora outside this set either has the wrong vowel (not /i/ or /u/) or the wrong consonant (voiced), so devoicing simply does not apply.
If you commit only one thing to memory, make it the ten-mora list above. The full /CVC/ rule is the explanation; the shortlist is what you reach for when you see a new word and need to decide whether a mora is whispered.
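The shortlist lends itself to a simple lookup. The sketch below assumes the input is written in kana (hiragana is converted to katakana first) and flags every mora that belongs to the shortlist. Membership only means the mora has the right ingredients; it does not check what follows, so treat the result as a list of candidates. The function names are hypothetical.

```python
DEVOICEABLE = set("キクシスチツヒフピプ") | {"シュ", "キュ", "チュ", "ヒュ", "ピュ", "フィ"}
SMALL_KANA = set("ャュョァィゥェォ")  # these attach to the preceding kana


def to_katakana(text: str) -> str:
    """Shift hiragana code points into the katakana block."""
    return "".join(
        chr(ord(ch) + 0x60) if "ぁ" <= ch <= "ゖ" else ch for ch in text
    )


def split_morae(word: str) -> list[str]:
    """Split a kana word into morae, joining small kana to the previous one."""
    morae: list[str] = []
    for ch in to_katakana(word):
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae


def candidate_morae(word: str) -> list[str]:
    """Return the morae of `word` that appear in the devoiceable shortlist."""
    return [m for m in split_morae(word) if m in DEVOICEABLE]


print(candidate_morae("きっぷ"))      # ['キ', 'プ']
print(candidate_morae("しつれい"))    # ['シ', 'ツ']
print(candidate_morae("がくせい"))    # ['ク']
```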
Worked examples from N5 and N4 vocabulary
The examples below all come from core N5/N4 vocabulary and are recorded with the devoiced vowel marked in the NHK accent dictionary.8 In each transcription, the devoiced vowel carries the under-ring diacritic.
Word-internal devoicing (between voiceless consonants)
切符8
"ticket"
The き sits between /k/ and the geminate /p/, so its /i/ devoices: [ki̥ppɯ].
明日8
"tomorrow"
The し sits between /ɕ/ and /t/, so its /i/ devoices: [aɕi̥ta].
近い8
"close; nearby"
The ち sits at the head of the word before /k/, so its /i/ devoices: [tɕi̥kai].
鉛筆8
"pencil"
The ぴ sits between /p/ and /ts/, so its /i/ devoices: [empi̥tsɯ].
薬8
"medicine"
The く sits between /k/ and /s/, so its /u/ devoices: [kɯ̥sɯɾi].
月8
"moon; month"
The つ sits at the head of the word before /k/, so its /u/ devoices: [tsɯ̥ki].
普通8
"ordinary; usual"
The ふ sits between /ɸ/ and /ts/, so its /u/ devoices: [ɸɯ̥tsɯː].
失礼9
"excuse me; rude"
The し sits between /ɕ/ and /ts/, so its /i/ devoices: [ɕi̥tsɯɾeː].
All eight items appear with the marked vowel devoiced in NHK's accent dictionary entries.8 The devoiced mora is still pronounced as one beat for timing: えんぴつ is heard as four beats, not three.67
Word-final devoicing (です, ます, and the polite suffixes)
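The most common cases for learners are the copula です, the polite verb ending ます, and the past form ました. In です and ます, a high vowel sits after a voiceless consonant at the end of the phrase, so it devoices. In ました the devoiced vowel is the /i/ of し, which sits between two voiceless consonants inside the suffix.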
元気です。8
"I'm well."
The final す devoices: [geŋki desɯ̥].
学生です。8
"I'm a student."
Same final す, same devoicing: [gakɯ̥seː desɯ̥]. Note the internal く of 学生 also devoices, between /k/ and /s/.
行きます。9
"I will go."
The final す devoices: [ikimasɯ̥]. In the past form 行きました, the devoicing falls on the し of -ました, whose /i/ sits between /ɕ/ and /t/: [ikimaɕi̥ta].
好きです。10
"I like (it)."
There are two devoiceable positions close together: the す of 好き and the final す of です. In this phrase, the pattern described here is that the second one surfaces as voiceless and the first holds its voicing: [sɯki desɯ̥]. The consecutive-devoicing constraint behind this choice is covered below.
Mixed cases: devoicing across kanji compounds
Compounds give some of the clearest demonstrations of the rule when they put a devoiceable mora next to a geminate (sokuon っ) or a second voiceless cluster. The geminate counts as voiceless on both sides.67
学期8
"academic term"
The き sits between the geminate /k/ and the end of the word; its /i/ devoices: [gakki̥].
拍手8
"applause"
The く sits between /k/ and /ɕ/; its /u/ devoices: [hakɯ̥ɕɯ].
一致8
"agreement; match"
The ち sits between the geminate /t/ and the end of the word; its /i/ devoices: [ittɕi̥].
A geminate (sokuon っ) is itself voiceless: in きっぷ the っ is the unreleased onset of /p/, so the preceding /i/ of き sits between two voiceless consonants and devoices.67 In がっき and いっち the geminate plays the other role: it supplies the voiceless consonant (/kk/, /ttɕ/) before the word-final /i/, which then devoices before a pause.
When devoicing is mandatory vs. variable
The common internet phrase "the u is silent" turns a graded picture into a binary one. More accurately, devoicing in the canonical environment is near-obligatory in Tokyo, but tempo, emphasis, song, and the consecutive-devoicing constraint all introduce real variation.
Effectively obligatory in standard Tokyo speech
In the canonical environment, devoicing in Tokyo Japanese is the default and unmarked realisation. Tanner, Sonderegger and Torreira state the consensus directly: "high vowels /i/ and /u/ are near-obligatorily devoiced between two voiceless consonants or following a voiceless consonant pre-pausally" in Tokyo Japanese.1
The Corpus of Spontaneous Japanese (CSJ) shows the same pattern. In the simple canonical environment, devoicing rates in Tokyo speech are very high. Kilbourn-Ceron and Sonderegger treat the rule as the default, with variability concentrated at boundary positions and in consecutive-devoicing environments.1213
Where speakers vary: tempo, emphasis, and singing
Slower careful speech, contrastive emphasis, and listing intonation routinely restore the vowel. Announcers reading place names slowly, or teachers articulating a syllable for a learner, will often voice the /u/ in です fully.89
Singing typically restores the vowel because the melody requires a sustainable pitch on each mora; in song, even canonically devoiced morae are usually sung with full voicing.7 Beginner-oriented textbook recordings sometimes restore the vowel for pedagogical clarity; this is a teaching artefact, not the natural Tokyo norm.9
Tanner et al.'s durational study adds an important nuance: high-vowel devoicing in Tokyo Japanese is a categorical process, not gradient reduction. Vowels are either devoiced or fully present, rather than progressively shortened along a continuum.1
Consecutive devoicing (the every-other-mora pattern)
When two devoiceable morae sit next to each other in a word, Japanese avoids devoicing both in a row; typically only one of the two surfaces as voiceless, and the other retains its voicing.1410
The default Tokyo pattern is that the first of the two surfaces as voiceless and the second keeps its voicing. Both orders are attested, however, and the pattern is sensitive to the manner of the surrounding consonants and the height of the vowel in the following mora.1412 The CSJ-based study reports an overall rate of consecutive (double) devoicing of about 27%, confirming that devoicing both morae at once is the dispreferred outcome.1213
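The default pattern described above can be sketched as a post-processing step over a list of per-mora flags. This is a deliberately simplified illustration: it hard-codes the "first devoices, second keeps its voicing" default, and for runs longer than two it simply alternates, which is an assumption of this sketch and not a claim from the cited studies. Real speech is variable, as the roughly 27% double-devoicing rate shows. The function name `resolve_consecutive` is hypothetical.

```python
def resolve_consecutive(candidates: list[bool]) -> list[bool]:
    """Thin out runs of adjacent devoiceable morae.

    `candidates[i]` is True when mora i sits in a devoicing environment.
    Within each run of adjacent True values, the 1st, 3rd, ... stay
    devoiced and the 2nd, 4th, ... are restored to full voicing.
    """
    result = []
    run_length = 0
    for is_candidate in candidates:
        if is_candidate:
            result.append(run_length % 2 == 0)
            run_length += 1
        else:
            result.append(False)
            run_length = 0
    return result


# ふくすう: ふ and く are adjacent candidates, す and う are not.
print(resolve_consecutive([True, True, False, False]))
# [True, False, False, False]
```

Non-adjacent candidates pass through unchanged, so the function only intervenes where two devoiceable morae actually touch.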
Teacher-training materials cite these standard alternations:
きくしつ9
"listening room (a teacher-training citation form, not a single-kanji compound)"
The first 〈き〉 is devoiced [ki̥]; the following 〈く〉 (and the し of しつ) keep their voicing rather than chaining the devoicing across adjacent morae.
複数10
"plural"
The first 〈ふ〉 is devoiced [ɸɯ̥]; the following 〈く〉 retains voicing rather than both vowels devoicing in a row.
好きです。12
"I like it."
The second devoiceable position surfaces voiceless: 〈す〉 of 好き holds voicing, and the final 〈す〉 of です devoices with the usual final-/u/ pattern.
Tsuchida's account treats consecutive devoicing as phonetically driven, not strictly phonological. That is why the surface pattern can shift with tempo and emphasis.1412
Regional differences: Kansai keeps the vowels voiced
Devoicing is a feature of standard Tokyo Japanese, not of Japanese as a whole. The biggest regional exception is Kansai-ben, where the same words surface with the vowel fully voiced.
Kansai (Osaka, Kyoto, Kobe)
Standard Kansai-ben does not devoice high vowels in the same environments as Tokyo Japanese. The Japanese-teacher source states the rule directly: "関西人は母音をしっかり発音するので母音の無声化が生じにくいです" ("Kansai speakers articulate vowels solidly, so vowel devoicing rarely occurs").11
The frequency picture from cross-dialect studies is the same: "the frequency of vowel devoicing occurrence is high in dialects of eastern Japan including standard (Tokyo) Japanese and low in dialects of western Japan including Osaka dialect."415
For learners, the practical consequence is that です in Kansai is closer to a fully voiced [desu], and 好き is closer to a fully voiced [suki]. A learner who shadows Osaka-based media (for example, ytv or MBS) will pick up a pattern with much less devoicing. This is correct for Kansai-ben, but it is not the NHK standard.811
Other regional patterns
Tōhoku dialects also devoice, but the conditioning factors differ from Tokyo. In Tōhoku Japanese, devoicing is sensitive to the height of the vowel in the following syllable and to additional positional factors that do not apply in Tokyo.15 Northern dialects can also extend devoicing to environments where Tokyo keeps the vowel voiced, with some Tōhoku speakers devoicing /i/ and /u/ even between voiced consonants.15
The Tokyo-vs-Kansai contrast dominates learner-facing material. The other regional patterns are summarised here for completeness and are out of scope for an N5/N4 article.1115
Which one should a learner imitate?
For most learners, the default answer is: devoice. The standard JLPT listening materials, the NHK accent dictionary, and the audio for the major N5/N4 textbooks all use Tokyo-standard pronunciation, which devoices in the canonical environment.89
If a learner's primary shadowing source is Osaka-based media, the no-devoicing pattern is also internally consistent and locally appropriate. The practical recommendation in the teacher-training literature is simple: do not mix mid-sentence.911
Switching between Tokyo devoicing and Kansai full voicing inside the same sentence sounds neither like standard Japanese nor like Kansai-ben; it sounds like an inconsistent learner. Choose your model based on the audio you actually shadow, and let your pronunciation be internally consistent with that source.911
Good to know
The "u is silent" misconception
Describing です as "des with a silent u" is the most common English-language framing of devoicing. It is misleading in two specific ways. First, the tongue and lips still execute the /u/ gesture; only the vocal folds fail to vibrate. Second, the mora still occupies one beat of timing.623
The acoustic record bears this out. Whang's study found formant-like structures in the burst and frication noise of the surrounding consonants. These show that the oral vowel gesture is retained even when phonation is lost.2 Because the mora is preserved, です remains two morae for rhythm purposes (the same beat count as でて or でわ), not one.67
Romaji can't show devoicing
Hepburn writes desu whether the /u/ is voiced or whispered, because the romanisation system simply does not encode devoicing.6 The precise notation is IPA with the under-ring diacritic [u̥] / [ɯ̥]; the IPA Handbook lists the under-ring as the "voiceless" diacritic for symbols that are normally voiced, and Unicode encodes it as U+0325 (combining ring below).5 Thus [desɯ̥] in a phonetics textbook is the same word as romanised desu. The difference is precision, not phonology.46
Pitch and devoicing interact
A devoiced mora cannot carry an audible pitch peak, because there is no voicing for the tone to ride on. When the accented mora of a word would be devoiced, the accent can shift to an adjacent mora.1617
Hasegawa's ICPhS paper documents the interaction directly: "if the accented mora of a word becomes devoiced, then the accent may shift to the next mora (so as to 'avoid' landing on a voiceless mora)."1617 NHK and Shinmeikai accent dictionaries note this shift in their entries. Learners who use OJAD or a similar pitch-accent tool will see the accent marked on a different mora for words like 聞く ("to listen") depending on whether the speaker devoices the first mora.
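The shift described in the quotation can be sketched as a one-step adjustment. The example uses the common numeric accent notation (0 for unaccented, otherwise the 1-indexed accented mora) and models only the shifted variant; since the source says the accent "may" shift, real dictionary entries can list both forms. The function name and the sample input are hypothetical and are not taken from any dictionary entry.

```python
def surface_accent(accent: int, devoiced: list[bool]) -> int:
    """Return the mora number that carries the accent after any shift.

    `accent` is 0 for an unaccented word, otherwise the 1-indexed
    accented mora. `devoiced[i]` is True when mora i + 1 is devoiced.
    The accent moves one mora to the right only when the accented mora
    is devoiced and a voiced mora follows it.
    """
    if accent == 0:
        return 0
    idx = accent - 1
    has_voiced_next = idx + 1 < len(devoiced) and not devoiced[idx + 1]
    if devoiced[idx] and has_voiced_next:
        return accent + 1
    return accent


# Made-up three-mora word: accent on mora 1, and mora 1 is devoiced.
print(surface_accent(1, [True, False, False]))   # 2
# Same word pronounced with the first mora fully voiced.
print(surface_accent(1, [False, False, False]))  # 1
```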
Mora count is preserved
Even when the vowel is whispered, the mora still occupies one beat for mora-timing and song lyrics. This is why です is two morae, not one, and why ました is three morae despite the typical devoicing on し.67 The rhythm payoff of preserving the mora count is covered in the mora-timing framework, where the difference between mora and syllable does the explanatory work.
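A quick way to see that devoicing never changes the beat count is to count morae directly from the kana spelling. The sketch below assumes the input is pure kana: every kana counts as one mora (including っ, ん, and ー) except the small glide and vowel kana, which merge with the kana before them. The function name `count_morae` is hypothetical.

```python
# Small kana that merge with the preceding kana instead of adding a beat.
SMALL_MERGING = set("ゃゅょぁぃぅぇぉャュョァィゥェォ")


def count_morae(kana: str) -> int:
    """Count morae in a kana string; voicing plays no part in the count."""
    return sum(1 for ch in kana if ch not in SMALL_MERGING)


for word in ["です", "ました", "えんぴつ", "きっぷ"]:
    print(word, count_morae(word))
# です 2
# ました 3
# えんぴつ 4
# きっぷ 3
```

The count depends only on the spelling, which is the point: a whispered す in です still occupies its own beat.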
One word, many transcriptions
English-language materials transcribe the same Tokyo realisation [desɯ̥] in several ways: "desu", "des", "des(u)", or [desɯ̥]. The choice depends on the writer's intended audience, not on the underlying word.46
Romaji teaching aids typically write desu because it is easier to recover the spelling. Phonetic transcriptions write [desɯ̥] because it captures what is actually heard. Learner-friendly prose sometimes writes "des" because it captures the surface impression for an English-speaking reader.469 None of the four is wrong. They encode different levels of detail about the same form.46
See also
- Stress vs. Pitch: Does Japanese Have Stress?
- The Mora-N (ん) and Its Four Allophones
- Rendaku: When K Becomes G in Compound Words
- Regional Pitch Accent in Japanese: Kansai (Keihan), Tohoku, and the Accentless Dialects
- Japanese Pitch-Accent Notation: How to Read 0, 1, 2, 3 and the Overline Diagrams
- Common Romaji Mistakes That Mislead Pronunciation