Japanese Vowel Devoicing: Why です Sounds Like "Des"

Japanese vowel devoicing is a regular pronunciation rule in standard Tokyo speech. The high vowels /i/ and /u/ lose their vocal-fold vibration when they sit between two voiceless consonants, or when they follow one at the end of a phrase. That is why です surfaces as something an English ear hears as "des".¹ The rule matters from your first lesson because it affects です, ます, した, すき, and しつれい. Applying it is the difference between textbook-clean Tokyo speech and shadowing that sounds slightly off.

The vowel is whispered, not deleted

Acoustic studies of Tokyo Japanese show that the tongue and lips still execute the vowel gesture when /u/ is devoiced; only the vocal folds fail to vibrate. The mora still occupies one beat of timing, so です is two morae, never one.²

Overview

What "devoicing" actually means

A devoiced vowel is one where the articulators (tongue, lips, jaw) form the vowel exactly as for the voiced version, but the vocal folds do not vibrate during it.³ The mouth still makes the vowel; the voice simply does not turn on for that mora.

The phonetics literature is precise about this: "sounds that should be voiced change to voiceless," with the oral gesture left intact.³ Listeners still recover the vowel identity because traces of the oral gesture survive in the burst and frication noise of the surrounding consonants.²

The IPA marks a devoiced vowel with the under-ring diacritic U+0325: [i̥] for devoiced /i/, [ɯ̥] (or [u̥]) for devoiced /u/. In standard Tokyo speech, です is therefore transcribed as [desɯ̥], not [des].⁴⁵
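For readers who type these transcriptions, the under-ring is a combining character that follows the vowel symbol it modifies. The snippet below is a minimal sketch of that mechanics; the helper name `mark_devoiced` is made up for illustration and is not part of any library.

```python
import unicodedata

# U+0325 is the combining under-ring; it attaches to the preceding character.
RING_BELOW = "\u0325"

def mark_devoiced(vowel: str) -> str:
    """Return the IPA vowel symbol with the voiceless under-ring attached."""
    return vowel + RING_BELOW

desu = "des" + mark_devoiced("ɯ")
print(desu)                          # displays as desɯ̥ in a font with IPA support
print(len(desu))                     # 5 code points: d, e, s, ɯ, and the ring
print(unicodedata.name(RING_BELOW))  # COMBINING RING BELOW
```

Because the ring is a separate code point, a plain string search for "ɯ" still finds the devoiced vowel, which is worth remembering when searching transcribed text.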

Because the articulation is preserved, devoicing is not deletion. The mora still counts as one beat for mora-timing.⁶⁷

Which vowels devoice (only い and う)

Devoicing in standard Japanese is restricted to the two high (close) vowels /i/ and /u/.⁴⁶ The low and mid vowels /a/, /e/, /o/ do not devoice under the same conditions in standard Tokyo Japanese.⁴³

The phonetic reason is aerodynamic. High vowels are naturally short and made with a narrow mouth opening. In a /CVC/ sequence flanked by two voiceless consonants, the sound is already close to a fully voiceless string, so the vocal-fold vibration that would carry the vowel simply fails to engage.⁶⁷

Where it sits on your JLPT timeline

Devoicing is a day-one phenomenon for any learner using NHK-aligned audio or a standard textbook. The copula です, the polite verb ending ます, and high-frequency words like した (past of する), すき (好き, "like"), and しつれい (失礼, "excuse me") all surface with a devoiced vowel in their canonical Tokyo pronunciation.⁸⁹

The JLPT does not test devoicing as a separate item at any level. It still matters in practice: listening sections at every level use Tokyo-standard speech, where devoicing is the default, and shadowing practice for N5–N3 is modelled on that standard.⁸

The devoicing rule

The rule has two triggering environments, a fixed list of consonants that allow it, and a short practical lookup of morae that covers almost every case a learner meets.

The two triggering environments

In standard Tokyo Japanese, a high vowel /i/ or /u/ devoices in exactly two contexts.⁴⁶

Between two voiceless consonants. /i/ or /u/ is devoiced when preceded by a voiceless consonant and followed by another voiceless consonant within the same word.⁹¹⁰
After a voiceless consonant, before a pause. /i/ or /u/ is also devoiced when it follows a voiceless consonant and stands at the end of an utterance or at a phrase boundary with a pause.⁹¹⁰

Tanner, Sonderegger and Torreira summarise both as one rule: "high vowels /i/ and /u/ are near-obligatorily devoiced between two voiceless consonants or following a voiceless consonant pre-pausally" in Tokyo Japanese.¹
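The two environments can be written as a small check over a list of IPA segments. This is a sketch under simplifying assumptions, not a full phonological model: the function name `devoicing_sites` and the segment-list input are illustrative choices, the end of the list stands in for a pause, word boundaries are ignored, and the consecutive-devoicing constraint discussed later is not applied.

```python
# Voiceless consonants and high vowels as listed in the text (IPA symbols).
VOICELESS = {"p", "t", "k", "s", "ɕ", "ts", "tɕ", "ɸ", "h", "ç"}
HIGH_VOWELS = {"i", "ɯ"}

def devoicing_sites(segments: list[str]) -> list[int]:
    """Return indices of high vowels that meet either triggering environment.

    `segments` is one utterance as a list of IPA segments; the end of the
    list is treated as a pause.
    """
    sites = []
    for i, seg in enumerate(segments):
        if seg not in HIGH_VOWELS:
            continue
        after_voiceless = i > 0 and segments[i - 1] in VOICELESS
        at_pause = i == len(segments) - 1
        before_voiceless = not at_pause and segments[i + 1] in VOICELESS
        # Environment 1: voiceless C on both sides. Environment 2: voiceless C, then pause.
        if after_voiceless and (before_voiceless or at_pause):
            sites.append(i)
    return sites

print(devoicing_sites(["d", "e", "s", "ɯ"]))            # [3]  です
print(devoicing_sites(["a", "ɕ", "i", "t", "a"]))       # [2]  あした
print(devoicing_sites(["k", "ɯ", "s", "ɯ", "ɾ", "i"]))  # [1]  くすり
```

In くすり only the first /ɯ/ qualifies: the second is followed by voiced /ɾ/, and the final /i/ follows a voiced consonant.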

The voiceless consonants that trigger it

The voiceless consonants of Japanese that allow devoicing are /p t k s ɕ ts tɕ ɸ h ç/.⁴⁷ In kana-row terms, these are カ行 (k), サ行 (s, including シ [ɕ]), タ行 (t, including チ [tɕ] and ツ [ts]), ハ行 (h, including ヒ [ç] and フ [ɸ]), and パ行 (p).¹¹¹⁰

IPA	Row	Kana in the row
/p/	パ行	パピプペポ
/t ts tɕ/	タ行	タチツテト
/k/	カ行	カキクケコ
/s ɕ/	サ行	サシスセソ
/h ç ɸ/	ハ行	ハヒフヘホ

Any consonant outside these five rows is voiced (ガ行, ザ行, ダ行, バ行, plus the sonorants of ナ行, マ行, ラ行, ヤ行, ワ行). A high vowel that has one of them as a neighbour fails the rule on that side and keeps its voicing in standard Tokyo speech.⁴⁷

The "devoiceable morae" shortcut

A practical shortcut is to memorise the small set of morae that have the right ingredients: a voiceless consonant plus a high vowel. Those morae are キクシスチツヒフピプ, with the corresponding yōon シュキュチュヒュピュフィ.

Any mora outside this set either has the wrong vowel (not /i/ or /u/) or the wrong consonant (voiced), so devoicing simply does not apply.
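The shortlist works well as a plain set lookup. The sketch below assumes the input is a single mora written in hiragana or katakana; the names `DEVOICEABLE`, `to_katakana`, and `is_devoiceable` are illustrative. It only answers whether a mora has the right ingredients, not whether its surroundings actually trigger devoicing.

```python
# Shortlist from the text: voiceless consonant + high vowel, plus the yōon forms.
DEVOICEABLE = {
    "キ", "ク", "シ", "ス", "チ", "ツ", "ヒ", "フ", "ピ", "プ",
    "シュ", "キュ", "チュ", "ヒュ", "ピュ", "フィ",
}

def to_katakana(text: str) -> str:
    """Shift hiragana (U+3041..U+3096) into the katakana block (+0x60)."""
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )

def is_devoiceable(mora: str) -> bool:
    """True if the mora is on the shortlist, in either kana script."""
    return to_katakana(mora) in DEVOICEABLE

print(is_devoiceable("す"))    # True
print(is_devoiceable("ず"))    # False: voiced consonant
print(is_devoiceable("さ"))    # False: vowel is not /i/ or /u/
print(is_devoiceable("しゅ"))  # True
```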

Memorise the shortlist before the rule

If you commit only one thing to memory, make it the ten-mora list above. The full /CVC/ rule is the explanation; the shortlist is what you reach for when you see a new word and need to decide whether a mora is whispered.

Worked examples from N5 and N4 vocabulary

The examples below all come from core N5/N4 vocabulary and are recorded with the devoiced vowel marked in the NHK accent dictionary.⁸ In each transcription, the devoiced vowel carries the under-ring diacritic.

Word-internal devoicing (between voiceless consonants)

切符 (きっぷ)⁸
"ticket"

The き sits between /k/ and the geminate /p/, so its /i/ devoices: [ki̥ppɯ].

明日 (あした)⁸
"tomorrow"

The し sits between /ɕ/ and /t/, so its /i/ devoices: [aɕi̥ta].

近い (ちかい)⁸
"close; nearby"

The ち sits at the head of the word before /k/, so its /i/ devoices: [tɕi̥kai].

鉛筆 (えんぴつ)⁸
"pencil"

The ぴ sits between /p/ and /ts/, so its /i/ devoices: [empi̥tsɯ].

薬 (くすり)⁸
"medicine"

The く sits between /k/ and /s/, so its /u/ devoices: [kɯ̥sɯɾi].

月 (つき)⁸
"moon; month"

The つ sits at the head of the word before /k/, so its /u/ devoices: [tsɯ̥ki].

普通 (ふつう)⁸
"ordinary; usual"

The ふ sits between /ɸ/ and /ts/, so its /u/ devoices: [ɸɯ̥tsɯː].

失礼 (しつれい)⁹
"excuse me; rude"

The し sits between /ɕ/ and /ts/, so its /i/ devoices: [ɕi̥tsɯɾeː].

All eight items appear with the marked vowel devoiced in NHK's accent dictionary entries.⁸ The devoiced mora is still pronounced as one beat for timing: えんぴつ is heard as four beats, not three.⁶⁷

Word-final devoicing (です, ます, and the polite suffixes)

The most common cases for learners are the copula です, the polite verb ending ます, and the past form ました. In all three, a high vowel sits after a voiceless consonant at the end of the phrase, so it devoices.

元気です。(げんきです。)⁸
"I'm well."

The final す devoices: [geŋki desɯ̥].

学生です。(がくせいです。)⁸
"I'm a student."

Same final す, same devoicing: [gakɯ̥seː desɯ̥]. Note the internal く of 学生 also devoices, between /k/ and /s/.

行きます。(いきます。)⁹
"I will go."

The final す devoices: [ikimasɯ̥]. In the past form 行きました, the between-consonants environment applies instead: the /i/ of し in -ました sits between /ɕ/ and /t/, so it devoices: [ikimaɕi̥ta].

好きです。(すきです。)¹⁰
"I like (it)."

There are two devoiceable positions here: the す of 好き, between /s/ and /k/, and the final す of です. The き between them is followed by voiced /d/ and keeps its vowel, so the two positions are not in adjacent morae and both devoice: [sɯ̥ki desɯ̥]. The consecutive-devoicing constraint, which applies only when devoiceable morae are adjacent, is covered below.

Romaji does not encode devoicing

Hepburn writes desu and masu whether the /u/ is voiced or whispered. The IPA notation [desɯ̥] is the precise rendering; the romanisations "desu", "des", and "des(u)" all describe the same Tokyo sound at different levels of detail.⁴⁶

Mixed cases: devoicing across kanji compounds

Compounds give some of the clearest demonstrations of the rule when they put a devoiceable mora next to a geminate (sokuon っ) or a second voiceless cluster. The geminate counts as voiceless on both sides.⁶⁷

学期 (がっき)⁸
"academic term"

The き sits between the geminate /k/ and the end of the word; its /i/ devoices: [gakki̥].

拍手 (はくしゅ)⁸
"applause"

The く sits between /k/ and /ɕ/; its /u/ devoices: [hakɯ̥ɕɯ].

一致 (いっち)⁸
"agreement; match"

The ち sits between the geminate /t/ and the end of the word; its /i/ devoices: [ittɕi̥].

A geminate (sokuon っ) is itself voiceless: in きっぷ the っ is the unreleased onset of /p/, so the preceding /i/ of き sits between two voiceless consonants and devoices.⁶⁷ In がっき and いっち the geminate sits on the other side: it supplies the voiceless consonant before the final high vowel, which then devoices at the end of the phrase.

When devoicing is mandatory vs. variable

The common internet phrase "the u is silent" turns a graded picture into a binary one. More accurately, devoicing in the canonical environment is near-obligatory in Tokyo, but tempo, emphasis, song, and the consecutive-devoicing constraint all introduce real variation.

Effectively obligatory in standard Tokyo speech

In the canonical environment, devoicing in Tokyo Japanese is the default and unmarked realisation. Tanner, Sonderegger and Torreira state the consensus directly: "high vowels /i/ and /u/ are near-obligatorily devoiced between two voiceless consonants or following a voiceless consonant pre-pausally" in Tokyo Japanese.¹

The Corpus of Spontaneous Japanese (CSJ) shows the same pattern. In the simple canonical environment, devoicing rates in Tokyo speech are very high. Kilbourn-Ceron and Sonderegger treat the rule as the default, with variability concentrated at boundary positions and in consecutive-devoicing environments.¹²¹³

Fully voiced です sounds marked in Tokyo

A fully voiced [desu] in Tokyo speech sounds over-articulated. Announcers, voice actors, and teachers use it when they want to highlight the word, such as when correcting a learner, naming a place on a list, or stressing a word for contrast. It is not the unmarked realisation.⁸⁹

Where speakers vary: tempo, emphasis, and singing

Slower careful speech, contrastive emphasis, and listing intonation routinely restore the vowel. Announcers reading place names slowly, or teachers articulating a syllable for a learner, will often voice the /u/ in です fully.⁸⁹

Singing typically restores the vowel because the melody requires a sustainable pitch on each mora; in song, even canonically devoiced morae are usually sung with full voicing.⁷ Beginner-oriented textbook recordings sometimes restore the vowel for pedagogical clarity; this is a teaching artefact, not the natural Tokyo norm.⁹

Tanner et al.'s durational study adds an important nuance: high-vowel devoicing in Tokyo Japanese is categorical, not gradual reduction. Vowels are either devoiced or fully present, rather than progressively shortened along a continuum.¹

Consecutive devoicing (the every-other-mora pattern)

When two devoiceable morae sit next to each other in a word, Japanese avoids devoicing both in a row; typically only one of the two surfaces as voiceless, and the other retains its voicing.¹⁴¹⁰

The default Tokyo pattern is that the first of the two surfaces as voiceless and the second keeps its voicing. Both orders are attested, however, and the pattern is sensitive to the manner of the surrounding consonants and the height of the vowel in the following mora.¹⁴¹² The CSJ-based study reports an overall rate of consecutive (double) devoicing of about 27%, confirming that devoicing both morae at once is the dispreferred outcome.¹²¹³
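The default stated above (first of two adjacent candidates devoices, second stays voiced) can be sketched as a left-to-right pass over per-mora flags. This is a deliberately crude illustration: the function name `resolve_consecutive` and the boolean input are assumptions, the sketch ignores the consonant-manner and vowel-height effects the literature reports, and runs of more than two candidates are outside what the text specifies, so the alternation it produces there should not be read as a claim about real speech.

```python
def resolve_consecutive(candidates: list[bool]) -> list[bool]:
    """Apply the 'first devoices, next stays voiced' default.

    candidates[i] is True when mora i sits in a devoicing environment.
    A candidate is devoiced unless the mora right before it was devoiced.
    """
    devoiced: list[bool] = []
    for is_candidate in candidates:
        previous_devoiced = bool(devoiced) and devoiced[-1]
        devoiced.append(is_candidate and not previous_devoiced)
    return devoiced

# ふ・く・す・う: ふ and く are adjacent candidates; す and う are not candidates.
print(resolve_consecutive([True, True, False, False]))
# [True, False, False, False]
```

Candidates separated by a voiced mora are unaffected, since the check only looks at the immediately preceding mora.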

Teacher-training materials cite these standard alternations:

きくしつ⁹
"listening room (a teacher-training citation form, not a single-kanji compound)"

The first 〈き〉 is devoiced [ki̥]; the following 〈く〉 (and the し of しつ) keep their voicing rather than chaining the devoicing across adjacent morae.

複数 (ふくすう)¹⁰
"plural"

The first 〈ふ〉 is devoiced [ɸɯ̥]; the following 〈く〉 retains voicing rather than both vowels devoicing in a row.

好きです。(すきです。)¹²
"I like it."

Here the two devoiceable morae are not adjacent: the voiced 〈き〉 separates the 〈す〉 of 好き from the final 〈す〉 of です. The constraint therefore does not apply, and both positions devoice, in contrast with the two items above.

Tsuchida's account treats consecutive devoicing as phonetically driven, not strictly phonological. That is why the surface pattern can shift with tempo and emphasis.¹⁴¹²

Regional differences: Kansai keeps the vowels voiced

Devoicing is a feature of standard Tokyo Japanese, not of Japanese as a whole. The biggest regional exception is Kansai-ben, where the same words surface with the vowel fully voiced.

Kansai (Osaka, Kyoto, Kobe)

Standard Kansai-ben does not devoice high vowels in the same environments as Tokyo Japanese. The Japanese-teacher source states the rule directly: "関西人は母音をしっかり発音するので母音の無声化が生じにくいです" ("Kansai speakers articulate vowels solidly, so vowel devoicing rarely occurs").¹¹

The frequency picture from cross-dialect studies is the same: "the frequency of vowel devoicing occurrence is high in dialects of eastern Japan including standard (Tokyo) Japanese and low in dialects of western Japan including Osaka dialect."⁴¹⁵

For learners, the practical consequence is that です in Kansai is closer to a fully voiced [desu], and 好き is closer to a fully voiced [suki]. A learner who shadows Osaka-based media (for example, ytv or MBS) will pick up a pattern with less devoicing. This is correct for Kansai-ben, but it is not the NHK standard.⁸¹¹

Other regional patterns

Tōhoku dialects also devoice, but the conditioning factors differ from Tokyo. In Tōhoku Japanese, devoicing is sensitive to the height of the vowel in the following syllable and to additional positional factors that do not apply in Tokyo.¹⁵ Northern dialects can also extend devoicing to environments where Tokyo keeps the vowel voiced, with some Tōhoku speakers devoicing /i/ and /u/ even between voiced consonants.¹⁵

The Tokyo-vs-Kansai contrast dominates learner-facing material. The other regional patterns are summarised here for completeness and are out of scope for an N5/N4 article.¹¹¹⁵

Which one should a learner imitate?

For most learners, the default answer is: devoice. The standard JLPT listening materials, the NHK accent dictionary, and the audio for the major N5/N4 textbooks all use Tokyo-standard pronunciation, which devoices in the canonical environment.⁸⁹

If a learner's primary shadowing source is Osaka-based media, the no-devoicing pattern is also internally consistent and locally appropriate. The practical recommendation in the teacher-training literature is simple: do not mix mid-sentence.⁹¹¹

Pick one model, then stay in it

Switching between Tokyo devoicing and Kansai full voicing inside the same sentence sounds neither like standard Japanese nor like Kansai-ben; it sounds like an inconsistent learner. Choose your model based on the audio you actually shadow, and let your pronunciation be internally consistent with that source.⁹¹¹

Good to know

The "u is silent" misconception

Describing です as "des with a silent u" is the most common English-language framing of devoicing. It is misleading in two specific ways. First, the tongue and lips still execute the /u/ gesture; only the vocal folds fail to vibrate. Second, the mora still occupies one beat of timing.⁶²³

The acoustic record bears this out. Whang's study found formant-like structures in the burst and frication noise of the surrounding consonants. These show that the oral vowel gesture is retained even when phonation is lost.² Because the mora is preserved, です remains two morae for rhythm purposes (the same beat count as でて or でわ), not one.⁶⁷

Romaji can't show devoicing

Hepburn writes desu whether the /u/ is voiced or whispered, because the romanisation system simply does not encode devoicing.⁶ The precise notation is IPA with the under-ring diacritic [u̥] / [ɯ̥]; the IPA Handbook assigns U+0325 to the "voiceless" diacritic for symbols normally voiced.⁵ Thus [desɯ̥] in a phonetics textbook is the same word as romanised desu. The difference is precision, not phonology.⁴⁶

Pitch and devoicing interact

A devoiced mora cannot carry an audible pitch peak, because there is no voicing for the tone to ride on. When the accented mora of a word would be devoiced, the accent can shift to an adjacent mora.¹⁶¹⁷

Hasegawa's ICPhS paper documents the interaction directly: "if the accented mora of a word becomes devoiced, then the accent may shift to the next mora (so as to 'avoid' landing on a voiceless mora)."¹⁶¹⁷ NHK and Shinmeikai accent dictionaries note this shift in their entries. Learners who use OJAD or a similar pitch-accent tool will see the accent marked on a different mora for words like 聞く ("to listen") depending on whether the speaker devoices the first mora.

Mora count is preserved

Even when the vowel is whispered, the mora still occupies one beat for mora-timing and song lyrics. This is why です is two morae, not one, and why ました is three morae despite the typical devoicing on し.⁶⁷ The rhythm payoff of preserving the mora count is covered in the mora-timing framework, where the difference between mora and syllable does the explanatory work.
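The beat counts quoted here can be checked with a simple kana counter. The sketch below assumes the input is written purely in kana with no spaces or punctuation; the name `count_morae` is illustrative. Devoicing plays no part in the count, which is the point: a whispered mora is still one beat.

```python
# Small kana that attach to the preceding kana instead of adding a beat.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")

def count_morae(kana: str) -> int:
    """Count morae in a kana string; っ, ん, and ー each count as one beat."""
    return sum(1 for ch in kana if ch not in SMALL_KANA)

print(count_morae("です"))      # 2
print(count_morae("ました"))    # 3
print(count_morae("えんぴつ"))  # 4
print(count_morae("はくしゅ"))  # 3 (しゅ is one mora)
```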

One word, many transcriptions

English-language materials transcribe the same Tokyo realisation [desɯ̥] in several ways: "desu", "des", "des(u)", or [desɯ̥]. The choice depends on the writer's intended audience, not on the underlying word.⁴⁶

Romaji teaching aids typically write desu because it is easier to recover the spelling. Phonetic transcriptions write [desɯ̥] because it captures what is actually heard. Learner-friendly prose sometimes writes "des" because it captures the surface impression for an English-speaking reader.⁴⁶⁹ None of the four is wrong. They encode different levels of detail about the same form.⁴⁶

References

1. Tanner, James, Morgan Sonderegger, and Francisco Torreira. "Durational Evidence That Tokyo Japanese Vowel Devoicing Is Not Gradient Reduction." Frontiers in Psychology 10:821. https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00821/full
2. Whang, James. "Recoverability-driven coarticulation: Acoustic evidence from Japanese high vowel devoicing." Journal of the Acoustical Society of America 143(2): 1159–1172. https://pubs.aip.org/asa/jasa/article/143/2/1159/606406
3. 宇都木昭. 「無声の補助記号、無声鼻音、母音の無声化」. 『音声学入門』. https://utsugi-phonetics.com/phonetics_introduction/articulatory_phonetics/devoicing/
4. "Japanese phonology." Wikipedia. https://en.wikipedia.org/wiki/Japanese_phonology
5. International Phonetic Association. Handbook of the International Phonetic Association. Cambridge University Press, 1999. (Voicelessness diacritic U+0325, the under-ring.)
6. Vance, Timothy J. The Sounds of Japanese. Cambridge University Press, 2008.
7. Kubozono, Haruo (ed.). Handbook of Japanese Phonetics and Phonology. De Gruyter Mouton, 2015. https://www.degruyter.com/document/doi/10.1515/9781614511984/html
8. NHK放送文化研究所編. 『NHK日本語発音アクセント新辞典』. NHK出版. https://www.monokakido.jp/ja/dictionaries/nhkaccent2/index.html
9. 東京外国語大学言語モジュール (TUFS Language Modules). 「日本語｜発音｜実践編 3.3.1 無声化母音の産出」. https://www.coelang.tufs.ac.jp/mt/ja/pmod/practical/03-03-01.php
10. 日本語教師のN1et. 「母音の無声化の条件とは┃仕組みと例」. https://jn1et.com/vowel-devoicing/
11. 日本語教師のはま. 「母音の無声化とは【関西はしにくい】」. https://www.hamasensei.com/museika/
12. Kilbourn-Ceron, Oriana, and Morgan Sonderegger. "Boundary phenomena and variability in Japanese high vowel devoicing." Natural Language & Linguistic Theory, vol. 36, pp. 175–220. https://link.springer.com/article/10.1007/s11049-017-9368-x
13. Maekawa, Kikuo, and Hideaki Kikuchi. "Corpus-based analysis of vowel devoicing in spontaneous Japanese: An interim report." In Voicing in Japanese, Mouton de Gruyter, pp. 205–228.
14. Tsuchida, Ayako. Phonetics and Phonology of Japanese Vowel Devoicing. PhD dissertation, Cornell University. (Also published as: "Japanese Vowel Devoicing: Cases of Consecutive Devoicing Environments." Journal of East Asian Linguistics, vol. 10, pp. 225–245.) https://link.springer.com/article/10.1023/A:1011221225072
15. Hirayama, Manami. "High Vowel Devoicing in Tohoku Japanese is Conditioned by Multiple Phonological Factors." Proceedings of the Annual Meetings on Phonology. https://journals.linguisticsociety.org/proceedings/index.php/amphonology/
16. Hasegawa, Yoko. "Pitch Accent and Vowel Devoicing in Japanese." Proceedings of the 14th International Congress of Phonetic Sciences (ICPhS), San Francisco, pp. 523–526. https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS1999/papers/p14_0523.pdf
17. "Japanese pitch accent." Wikipedia. https://en.wikipedia.org/wiki/Japanese_pitch_accent
