Japanese Focus Prosody: Pitch Widening, Contrastive は, and Information Structure

Japanese focus prosody is the system that marks the emphasized part of a sentence by pitch shape, not by loudness as English often does. It widens the pitch range on the focused element and compresses everything that follows it.¹² If you already control lexical pitch accent and basic sentence intonation, focus prosody is the next layer up. It explains why a Japanese sentence can carry emphasis without sounding louder, and why Japanese speech can seem flat to an ear that is listening for a stressed syllable.

Overview

Focus prosody sits on top of the two prosodic layers covered elsewhere in the pronunciation track: lexical pitch accent on each word, and boundary intonation at the right edge of phrases and utterances. This article treats focus as a phrase-level property. It reshapes the pitch range of the accentual phrase carrying new or contrastive information, without inserting any new tones.³⁴²

What "focus" means in linguistics, in one paragraph

In Japanese linguistics, 焦点 (shōten, "focus") names the part of a clause that conveys new information or contrasts with alternatives.⁴⁵ Information focus is the new fact that answers a wh-question. Contrastive focus picks one element out against alternatives.

Both kinds are realized prosodically in Tokyo Japanese, and contrastive focus is the reliably more prominent type in production studies.¹² A short question-answer pair makes the information-focus case concrete.

何(なに)を買(か)いましたか。¹
"What did you buy?"

パンを買(か)いました。¹
"I bought bread."

The new information in the answer is パン. That is the focus, and the prosody of the answer should put pitch-range expansion on パン and compress the rest of the sentence into a low tail.

Why Japanese seems flat to English ears

English often combines loudness, length, and pitch into a single perceived stress on a focused word. Japanese keeps these dimensions more separate. Its main focus cue is pitch-range geometry, the size of the F0 excursion across a whole phrase, rather than localized loudness.³⁴¹

Tokyo Japanese has no word stress in the English sense. A word carries at most one H-to-L drop, and whether it has one and where it falls are lexically fixed (the canonical 端 / 箸 / 橋 contrast), so emphasis cannot be realized by "stressing" a syllable the way English does.⁶⁷

Even native Japanese listeners identify focus from F0 alone with surprisingly low accuracy: 31% for final focus, up to 57% for neutral or broad focus. Focus prosody in Japanese is real, but figures like these suggest that it operates within a narrower pitch geometry than the English ear is calibrated for.⁸

Why this matters for L2 learners

In a perception study comparing native and non-native listeners, L2 learners of Japanese "did not rely systematically on f0 nor duration cues" when identifying focus. That finding is the experimental counterpart of the "Japanese sounds flat" perception, and it is what the rest of this article is calibrated to fix.⁹

Two prosodic layers, again

Lexical pitch accent is encoded as a per-word H*+L drop. Its location distinguishes lexical items and is fixed in the dictionary.⁶¹⁰ Phrase- and utterance-level intonation is encoded as boundary tones (H%, L%, LH%, HL%) at the right edge of the accentual phrase and the intonational phrase, independent of the lexical drops inside the phrase.³¹¹¹²

Focus prosody rides on both layers. It does not insert new tones; it manipulates the pitch range of the accentual phrase that carries the focused element and compresses the range of all following accentual phrases.³²¹

Two examples make the layering visible. The first carries an accented word whose lexical H-to-L survives any focus pattern; the second carries an unaccented (heiban) word with no internal drop, so focus has to show up in range and post-focal compression.

雨(あめ)が降(ふ)りました。⁶
"It rained."

飴(あめ)が好(す)きです。⁶
"I like candy."

The two-component focus pattern

Focus prosody in Tokyo Japanese has two reliable acoustic correlates: on-focus pitch-range expansion and post-focal compression. The same string can carry either or both, depending on lexical accent type and discourse context. Production studies document each component independently.¹²

Component 1: on-focus pitch-range expansion

The focused word keeps its lexical accent shape, but the H*+L excursion is amplified: the peak is higher and the following low is lower. This expands the vertical pitch range of the accentual phrase that carries the focus.¹² Lee and Xu's production study reports consistent expansion of F0 range on the focused item across speakers and focus positions in accented stimuli.¹

The size of the expansion depends on lexical accent type. Accented words can expand the H–L drop directly. Heiban (unaccented) words have no internal drop to amplify, so the on-focus expansion shows up mostly as a higher overall pitch register on the focused accentual phrase rather than as a sharper drop.¹²⁸
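One common way to put a number on "pitch range" is the interval between an F0 peak and the valley that follows it, expressed in semitones. The sketch below is illustrative only: the function name and the Hz values are assumptions made up for the example, not measurements from the cited studies.

```python
import math


def f0_range_semitones(peak_hz: float, valley_hz: float) -> float:
    """Interval between an F0 peak and a valley, in semitones (12 per octave)."""
    if peak_hz <= 0 or valley_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(peak_hz / valley_hz)


# Hypothetical peak/valley values (Hz) for one accented word in two readings.
broad_range = f0_range_semitones(peak_hz=220.0, valley_hz=180.0)
narrow_range = f0_range_semitones(peak_hz=250.0, valley_hz=165.0)

# Positive when the narrow-focus reading has the wider excursion.
expansion = narrow_range - broad_range
print(f"broad: {broad_range:.2f} st, narrow: {narrow_range:.2f} st, "
      f"expansion: {expansion:.2f} st")
```

Semitones are used here rather than raw Hz so that the same interval counts equally for low-pitched and high-pitched voices.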

Lee, Chiu, and Xu's perception study confirms the production picture from the listener side. Focus identification was more accurate when the focused words were accented than when they were unaccented. All-accented sequences yielded the highest focus-identification accuracy (56%), compared with unaccented-accented-unaccented sequences (32%).⁸

The same sentence spoken under broad focus and then under narrow focus on the subject makes the expansion audible.

田中(たなか)さんがパンを買(か)いました。¹
"Tanaka bought bread." (broad focus, neutral range)

田中(たなか)さんがパンを買(か)いました。¹
"TANAKA bought bread." (range expansion on 田中さん)

Component 2: post-focal compression

Everything after the focused element is compressed into a narrow, low pitch band. Sugahara's dissertation labels this the "post-focus compression" of F0 movement and shows that it is the main cue distinguishing focused from non-focused versions of the same string.²

Maekawa's earlier production data on Tokyo Japanese wh-questions showed the same pattern: even when the focused wh-phrase itself showed no F0 rise, post-focal reduction was consistently observed. This supports treating post-focal compression as the more reliable of the two components.²¹³

Lee and Xu's quantitative analysis finds that post-focus F0-range compression appears only in accented stimuli. In unaccented stimuli, focus is marked by minimum-F0 lowering. Post-focal mean F0 is significantly lower than in the neutral-focus baseline.¹¹⁴
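The two post-focal measures mentioned above, mean F0 and F0 span, can be computed from any list of F0 samples taken over the same words in a neutral and a narrow-focus reading. The following is a minimal sketch under that assumption. The helper name and all sample values are hypothetical, and it does not reproduce the analysis procedure of the cited studies.

```python
from statistics import mean
from typing import Dict, Sequence


def tail_summary(f0_hz: Sequence[float]) -> Dict[str, float]:
    """Mean and span (max minus min) of F0 samples from one stretch of speech."""
    if not f0_hz:
        raise ValueError("need at least one F0 sample")
    return {"mean_hz": mean(f0_hz), "span_hz": max(f0_hz) - min(f0_hz)}


# Hypothetical F0 samples (Hz) over the same words in two readings.
neutral_tail = [190.0, 205.0, 180.0, 170.0, 160.0]
post_focal_tail = [150.0, 155.0, 148.0, 145.0, 142.0]

neutral = tail_summary(neutral_tail)
post_focal = tail_summary(post_focal_tail)

# Lowering: how far the post-focal mean sits below the neutral mean.
mean_lowering_hz = neutral["mean_hz"] - post_focal["mean_hz"]
# Compression: post-focal span as a fraction of the neutral span (< 1 means narrower).
span_ratio = post_focal["span_hz"] / neutral["span_hz"]
print(f"mean lowering: {mean_lowering_hz:.1f} Hz, span ratio: {span_ratio:.2f}")
```

Keeping the two measures separate matters because, as reported above, a tail can be lowered without its span being compressed.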

The cue English-speaking learners miss most

English uses a localized loudness spike on the focused word and lets the post-focus tail return to a neutral declination. Japanese flattens and lowers the post-focus tail. If you ignore the tail and listen only for an emphasized word, you will miss most of the focus information in a Japanese sentence.¹⁹

Shifting the focus along the sentence moves the boundary between expanded and compressed regions.

田中(たなか)さんがパンを買(か)いました。¹
"TANAKA bought bread." (post-focal compression on がパンを買いました)

田中(たなか)さんが学校(がっこう)でパンを買(か)いました。¹
"Tanaka bought bread AT SCHOOL." (post-focal compression on パンを買いました)

田中(たなか)さんが学校(がっこう)でパンを買(か)いました。¹
"Tanaka bought BREAD at school." (post-focal compression on 買いました only)

What about pre-focal material?

Pre-focal accentual phrases are largely unchanged in F0 range. Lee and Xu find no consistent expansion of the pre-focal region across focus positions.¹ Igarashi's chapter notes that pre-focal accentual phrases retain their normal cumulative downstep and lexical-accent shapes. The focus effect is localized to the focused phrase and propagates only rightward as post-focal compression.⁴

Some experimental work reports mild raising of the pre-focal region for certain focus positions, notably penultimate focus, but Lee and Xu treat the effect as position-dependent rather than as a general property of pre-focal material.¹

田中(たなか)さんがパンを買(か)いました。¹
"Tanaka bought BREAD." (pre-focal 田中さんが keeps its neutral shape; only the post-focus tail compresses)

A worked example with three focus placements

The standard worked example in the production literature is a transitive sentence with a clear subject-locative-object-verb order. The same sentence is recorded under three focus conditions, with focus on each major argument in turn.¹² In each case the pitch trace shows the same lexical accent pattern on the focused phrase with vertical range expansion. Everything to the right is compressed into a narrow low band, and everything to the left is unchanged.¹²

田中(たなか)さんが学校(がっこう)でパンを買(か)いました。¹
"TANAKA bought bread at school." (wide range on 田中さん, compressed 学校でパンを買いました)

田中(たなか)さんが学校(がっこう)でパンを買(か)いました。¹
"Tanaka bought bread AT SCHOOL." (wide range on 学校で, compressed パンを買いました)

田中(たなか)さんが学校(がっこう)でパンを買(か)いました。¹
"Tanaka bought BREAD at school." (wide range on パン, compressed 買いました)
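The three placements above follow one rule: phrases left of the focus are unchanged, the focused phrase is expanded, and phrases to its right are compressed. The sketch below states that rule as code. It assumes, as a simplification, that each particle-bearing chunk is its own accentual phrase. The function name and labels are hypothetical.

```python
from typing import List


def label_regions(phrases: List[str], focus_index: int) -> List[str]:
    """Label each accentual phrase relative to the focused one."""
    if not 0 <= focus_index < len(phrases):
        raise IndexError("focus_index is outside the phrase list")
    labels = []
    for i in range(len(phrases)):
        if i < focus_index:
            labels.append("unchanged")   # pre-focal: normal downstep and accent shapes
        elif i == focus_index:
            labels.append("expanded")    # on-focus pitch-range expansion
        else:
            labels.append("compressed")  # post-focal compression
    return labels


# Simplifying assumption: one accentual phrase per chunk.
phrases = ["田中さんが", "学校で", "パンを", "買いました"]

for focus_index in range(3):
    pairs = zip(phrases, label_regions(phrases, focus_index))
    print(" | ".join(f"{phrase}: {label}" for phrase, label in pairs))
```

Moving `focus_index` one step to the right converts exactly one "compressed" label into "expanded" and one "expanded" label into "unchanged", which is the boundary shift described in the text.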

An accented focused word can amplify its existing H–L drop. A heiban focused word has no drop to amplify, so the on-focus cue is reduced. The post-focal compression cue then carries proportionally more of the listener's identification work.¹²⁸

Contrastive は vs. thematic は

The particle は supports two readings. Their semantics have been distinguished since at least Kuno's 1973 treatment, and their prosodic correlates are formulated explicitly in Heycock's Oxford Handbook chapter.¹⁵⁵ Thematic は is the unmarked ground-marker. Contrastive は is melodically prominent and carries the same two-component focus pattern described above.

Thematic は: the unmarked ground-marker

Thematic は picks an entity that is already established in the discourse, generic or anaphoric, and marks it as the ground against which the rest of the sentence (the comment) is asserted.¹⁵⁵¹⁶ Kuno's foundational analysis states that themes marked by は must be either generic or anaphoric. Non-anaphoric, non-generic referents are typically marked by が, not by は.¹⁵

Prosodically, thematic は is unmarked. The は-phrase carries its ordinary accentual-phrase shape. The comment that follows carries the focus prosody appropriate to the discourse, broad focus by default.⁵² Heycock characterizes non-contrastive (thematic) topics as showing neither the on-topic pitch peak nor the radical post-topic lowering that contrastive topics show.⁵

私(わたし)は学生(がくせい)です。⁵
"I'm a student."

田中(たなか)さんは医者(いしゃ)です。⁵
"Tanaka is a doctor."

Contrastive は: pitch boost and post-particle drop

Contrastive は forces a sharp pitch peak on the は-marked element and a deep drop on what follows. Heycock formulates the prosodic signature as "the presence of a prominent high-pitch accent on some part of themselves and a radical lowering of the pitch accent of the phrases following them."⁵

This is structurally the same two-component pattern as focus prosody more generally: on-focus expansion plus post-focal compression, realized on the は-marked constituent.⁵² The distinction has been recognized at least since Kuno's 1973 treatment, which separated contrastive は (alternative-evoking, often translatable with "at least", "though", or implicit "but…") from thematic は.¹⁵⁵

The same written は supports both readings. Prosody and context are the only consistent disambiguators when mapping writing to speech.⁵¹⁶

私(わたし)は学生(がくせい)です。⁵
"As for ME, I'm a student (whatever the others are)."

寿司(すし)は食(た)べません。⁵
"Sushi, I don't eat (other things, maybe)."

田中(たなか)さんは来(き)ます。⁵
"TANAKA is coming (but others, who knows)."

A test you can apply

Heycock's diagnostic is the two-component prosodic test: would a native speaker raise the pitch range on the は-phrase and drop everything that follows it? If yes, the reading is contrastive. If no, the reading is thematic.⁵
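Read as a decision rule, the prosodic test requires both components at once: a raised は-phrase and a lowered continuation. The sketch below encodes that reading. The function name, the semitone inputs, and the 2-semitone threshold are all assumptions chosen for illustration; the source gives no numeric cutoff.

```python
def classify_wa(peak_boost_st: float, tail_drop_st: float,
                threshold_st: float = 2.0) -> str:
    """Classify a は-phrase as 'contrastive' or 'thematic' from two measurements.

    peak_boost_st: how far the は-phrase peak exceeds a neutral rendition (semitones).
    tail_drop_st:  how far the following material is lowered relative to neutral.
    threshold_st:  arbitrary illustrative cutoff, not a value from the literature.
    """
    has_peak = peak_boost_st >= threshold_st
    has_drop = tail_drop_st >= threshold_st
    # Both components must be present for the contrastive reading.
    return "contrastive" if (has_peak and has_drop) else "thematic"


print(classify_wa(peak_boost_st=4.0, tail_drop_st=3.5))  # both components present
print(classify_wa(peak_boost_st=0.5, tail_drop_st=0.3))  # neither component present
```

A case with only one component present falls to "thematic" here because the test is phrased as a conjunction. In real speech such cases are better treated as unclear and resolved from context.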

Oshima's complementary semantic test is the "as for X, at least…" paraphrase. A sentence that admits an "at least" or "as for" hedge is contrastive; a sentence that does not is thematic.¹⁶ Both tests pull on the same underlying contrast: contrastive は evokes alternatives, thematic は does not.¹⁶¹⁵

雨(あめ)は降(ふ)りませんでした。⁵
"It didn't rain (though something else may have happened)."

The next sentence is ambiguous in writing. The thematic reading is "Rain, I hate it," with no alternative evoked. The contrastive reading is "RAIN, I hate (other weather, OK)," with a pitch peak on 雨は and lowering on 嫌いです.

雨(あめ)は嫌(きら)いです。⁵
"I hate the rain." (ambiguous between thematic and contrastive without prosody)

When は contrasts something not in the sentence

The alternative that contrastive は evokes does not need to be stated out loud. The contrast can be implicit, and the discourse fills in the unspoken alternative.¹⁵¹⁶ This is the source of the "as for X, at least…" reading. The speaker marks X as one of several possible alternatives without naming the others, and the hearer infers the contrast set from context.¹⁶

Oshima documents that the implicit-alternative reading is the majority case of は in actual corpus use, supporting his title-claim that は "most often does not mark a topic" in the unmarked thematic sense.¹⁶

ビールは飲(の)みます。¹⁶
"BEER I drink (wine, sake, who knows)."

今日(きょう)は早(はや)く帰(かえ)ります。¹⁶
"TODAY I'm going home early (unlike usual)."

Focus and the sentence-final particles

Focus prosody shapes the body of the utterance. Boundary tones shape the right edge. Because the two cues target different parts of the string, they can layer on the same sentence without interfering with each other.

Why focus and final-particle tunes don't fight

Focus prosody operates on the body of the utterance. It expands the range of the focused accentual phrase and compresses every accentual phrase after it. Boundary tones operate at the right edge, on the last mora or two of the accentual or intonational phrase.¹¹¹²⁴

The two layers are independent in the X-JToBI transcription system. The same string can carry any combination of focus prosody and final boundary tone (L%, LH%, H%, HL%) without their interfering with each other's primary cues.¹¹¹² The TUFS pronunciation module makes the same point for the question rise specifically: the rise is realized on the final mora only, while maintaining the accent patterns of the words in the sentence.¹⁷

田中(たなか)さんは来(き)ましたよ。⁴
"TANAKA came." (focus on 田中さん; falling よ on a compressed tail)

田中(たなか)さんは来(き)ましたか。¹⁸
"Did TANAKA come?" (focus on 田中さん; LH% rises on か itself, post-focal compression on the body)

Focus before ね, よ, よね

When the focus is mid-sentence, post-focal compression carries all the way to the sentence-final particle. The particle's own boundary tune (rise on ね, fall on よ, fused tune on よね) sits on top of a low, flat tail rather than on a neutral declination.⁴¹²

The boundary pitch movement remains identifiable as a tonal event because it targets the final mora only, distinct from the compressed body of the utterance.¹¹¹²¹⁷ The practical consequence is that the particle's tune is unchanged in shape but realized at a lower absolute pitch than in the broad-focus version of the same sentence.⁴
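The "same shape, lower absolute pitch" claim can be made concrete by treating the boundary movement as a fixed offset added to wherever the tail happens to sit. The sketch below does only that. The function name and every number are hypothetical, and real boundary tones are not guaranteed to keep exactly the same size.

```python
from typing import Tuple


def particle_targets(tail_level_st: float,
                     boundary_move_st: float) -> Tuple[float, float]:
    """Start and end pitch of the final mora, in semitones relative to a reference.

    tail_level_st:    pitch level of the material just before the particle.
    boundary_move_st: size of the boundary movement (positive = rise, negative = fall).
    """
    return tail_level_st, tail_level_st + boundary_move_st


RISE_ON_NE_ST = 3.0  # hypothetical size of the rise on ね

# Broad focus: the tail sits on a neutral declination (assumed -2 st).
broad_start, broad_end = particle_targets(-2.0, RISE_ON_NE_ST)
# Narrow focus earlier in the sentence: the tail is compressed and lower (assumed -6 st).
narrow_start, narrow_end = particle_targets(-6.0, RISE_ON_NE_ST)

same_shape = (broad_end - broad_start) == (narrow_end - narrow_start)
lower_overall = narrow_end < broad_end
print(f"same shape: {same_shape}, lower overall: {lower_overall}")
```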

田中(たなか)さんが来(き)ましたね。⁴
"TANAKA came, didn't he?" (small rise on ね sits on a compressed tail)

違(ちが)いますよ。¹⁹
"That's WRONG." (focus on 違います; よ falls on a compressed tail)

Focus on the particle itself

The sentence-final particle can itself be the focused element in narrow contexts, most often when the speaker is overriding a prior claim. The よ particle in 違うよ can take its own prominence, with a pitch boost on the particle morpheme.¹⁹⁴

Hirayama's account of rising declaratives notes that particles like よ vary their tune (yo-falling versus yo-rising) to convey distinct discourse moves on the same morpheme, which presupposes that the particle is a target for prosodic prominence rather than a tonally fixed segment.¹⁹

違(ちが)うよ。¹⁹
"I'm telling you, it's wrong."

行(い)きますよ。¹⁹
"I AM going."

Cross-reference the dedicated intonation pages

The dedicated intonation sibling pages lay out the boundary-tone inventory used throughout this section: the rises and falls on the final mora, the LH% on yes/no questions, and the L% on assertive よ. Two are already published: Japanese Sentence Intonation: Falls, Rises, ね, よ, よね, which covers the boundary-tone inventory and the politeness and discourse functions of the sentence-final particles; and Japanese Questions Without か: Rising Intonation and の, which covers the question rise on declarative-shaped strings. The X-JToBI labels referenced here are consistent with the inventories used there.¹²¹¹

When focus is not pitch-boosted

Pitch boost is the prototypical focus cue, but it has exceptions. Two well-documented situations weaken or remove it: certain question types, and lexical-accent contexts where there is no H–L drop to amplify. In those situations Japanese uses secondary cues, including boundary insertion and segmental lengthening. Post-focal compression remains the more reliable signal.

The why-question case

Tomioka reports a surprising prosodic pattern in Japanese why-questions. The phrase that immediately follows a causal wh-phrase (the focus associate of なぜ or どうして) "can be considered as the focus associate without any focal prominence," contradicting the otherwise general claim that a focused phrase in Japanese receives a pitch boost.²⁰

The earlier Maekawa 1991 production data, cited by Sugahara, reported the same effect for wh-questions more broadly: no significant F0-rise on the wh-phrase, while the post-focal reduction was consistently observed.² Ishihara's ICPhS 2011 paper isolates the case where the wh-phrase is lexically unaccented and shows that on-focus expansion is reduced or absent there too. The post-focal compression cue still operates and remains the more reliable side of the two-component pattern.¹³

The take-away for learners is that pitch boost is the typical cue, but not an exceptionless one. A wh-question without the expected on-wh pitch rise is not a defective question. It is a known sub-pattern of focus prosody.²⁰¹³

なぜ田中(たなか)さんは来(こ)なかったの。²⁰
"Why didn't Tanaka come?" (focus associate is 田中さんは, no pitch boost on it)

誰(だれ)がパンを買(か)いましたか。¹³
"Who bought bread?" (the wh-phrase 誰が may carry little on-focus expansion; robust post-focal compression on パンを買いました)

Boundary insertion as a backup cue

Imai, Lee, and Xu's production study documents an "Edge-Reinforcing Strategy." When pitch-range expansion is unavailable or blocked (already-low pitch register, post-accentual context, dialectal mismatch), Tokyo-Japanese speakers signal focus through edge-reinforcing cues, including silence, segmental lengthening, and jaw-opening at prosodic boundaries.²¹

The study tested nine educated Tokyo-Japanese speakers producing genitive noun phrases under broad versus narrow focus. Acoustic measures included word duration and jaw-opening estimates. The sample is small, and the strategy as described has not yet been replicated outside that dataset, but speakers in the study reliably restructured prosodic domains through these boundary-based cues.²¹

If you listen only for pitch contour, you can miss focus entirely in stretches of speech where the F0-range cue is suppressed. A small prosodic break before or after the focused element is a separate, secondary cue.²¹ Earlier work converges from the F0 side: focus in unaccented stimuli is marked by minimum-F0 lowering rather than by an expanded HL excursion, showing that the system has more than one cue available.¹¹⁴

田中(たなか)さんの本(ほん)です。²¹
"It's Tanaka's book." (broad focus)

田中(たなか)さんの本(ほん)です。²¹
"It's TANAKA'S book." (narrow focus on 田中さん; lengthening on the genitive boundary, jaw-opening cue)

Downstep and accent type can mute the cue

On a heiban (unaccented) phrase there is no internal H–L drop to amplify, so the on-focus expansion cue reduces to a register raise rather than a sharper excursion. The post-focal compression and boundary cues become proportionally more important.¹²⁸

Within a single intonational phrase, cumulative downstep lowers each subsequent accentual phrase relative to the previous one. An accented focus near the right edge of a long intonational phrase has less pitch headroom for expansion than one near the left edge.³¹¹
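A deliberately simple way to picture the shrinking headroom is a geometric model in which each successive peak keeps a fixed fraction of the previous peak's height above a baseline. The sketch below assumes that every phrase is accented (so each one triggers downstep) and uses an invented ratio of 0.7. Neither the function nor the constants come from the cited sources.

```python
from typing import List


def downstepped_peaks(first_peak_hz: float, baseline_hz: float,
                      ratio: float, count: int) -> List[float]:
    """Peak F0 of successive accentual phrases under a toy geometric downstep."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if first_peak_hz <= baseline_hz:
        raise ValueError("first peak must be above the baseline")
    span = first_peak_hz - baseline_hz
    # Each phrase keeps `ratio` of the previous phrase's height above the baseline.
    return [baseline_hz + span * ratio ** k for k in range(count)]


BASELINE_HZ = 120.0  # hypothetical floor of the speaker's range
peaks = downstepped_peaks(first_peak_hz=220.0, baseline_hz=BASELINE_HZ,
                          ratio=0.7, count=4)
spans = [peak - BASELINE_HZ for peak in peaks]

for position, (peak, span) in enumerate(zip(peaks, spans), start=1):
    print(f"phrase {position}: peak {peak:.1f} Hz, span above baseline {span:.1f} Hz")
```

If focus expansion scales the span above the baseline, the same proportional boost yields a smaller absolute change on the fourth phrase than on the first, which is one way to read the left-edge versus right-edge asymmetry.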

Lee, Chiu, and Xu's perception study quantifies the resulting asymmetry. Focus identification accuracy is lowest for final focus (31%) and highest for neutral or broad focus (57%), with accented focus consistently easier to identify than unaccented focus.⁸ The pedagogical consequence is that the same focus instruction, "widen the pitch on X," works well for an accented X near the left edge of a sentence but poorly for a heiban X near the right edge. In the harder case, the learner has to lean on post-focal compression and boundary cues rather than on range expansion.⁸²¹

Focus condition	Lexical accent type	Position	Identification accuracy
Neutral / broad focus	mixed	n/a	57%
Narrow focus, accented sequence	accented	non-final	56%
Narrow focus, unaccented-accented-unaccented sequence	mixed (target accented)	medial	32%
Narrow focus, final	mixed	final	31%

Source: Lee, Chiu, and Xu's perception study.⁸

田中(たなか)さんが魚(さかな)を食(た)べました。⁸
"Tanaka ate fish." (broad focus; 魚 is heiban)

田中(たなか)さんが魚(さかな)を食(た)べました。⁸
"Tanaka ate FISH." (focus on heiban 魚; no internal HL to amplify; register raise plus post-focal compression on 食べました do the work)

Good to know

Loudness instead of pitch geometry

English speakers tend to mark a focused word by raising volume on it, the way they would say "TANAKA bought bread." In Japanese the right move is the opposite of localized loudness: widen the pitch range on the focused accentual phrase and compress the following phrases into a narrow low band, without changing volume.

Japanese focus prosody is realized in F0 range and post-focal compression, not in loudness. Loud emphasis without pitch geometry can be heard as anger rather than as emphasis.¹²⁹ The corrected version of the example sentence keeps the volume even and lets pitch do the work.

田中(たなか)さんがパンを買(か)いました。¹
"TANAKA bought bread." (range expansion on 田中さん, compression on がパンを買いました; volume steady)

Over-applying contrastive は

A common beginner habit is to produce every は with a contrastive pitch peak and post-particle lowering, even when the discourse is unmarked. The default reading of は is thematic and prosodically flat. Contrastive prosody belongs only where alternatives are evoked. Producing every は as contrastive sounds like the speaker is constantly correcting the listener.⁵¹⁶ The thematic version of the canonical example carries no on-particle peak.

私(わたし)は学生(がくせい)です。⁵
"I'm a student." (thematic は, no pitch peak on 私は)

"Boost the bit, squash the rest."

A six-word summary of the two-component pattern: on-focus pitch-range expansion plus post-focal compression. The phrase captures the asymmetry between the focused accentual phrase and everything that follows it. It also works well as a self-coaching cue while practicing.¹²

Post-focal compression weakens in fast casual speech

Post-focal compression is most reliable in careful or news-register speech. In fast colloquial speech, speakers may neutralize the F0 cue and rely more on word order, particle choice, or edge-reinforcing boundary cues.²¹¹ If you train exclusively on news audio, you will hear the cue cleanly. In spontaneous conversation, the same cue may be partially absent, so recognition has to lean on the other components.

焦点 (shōten) "focus"

焦点 in Japanese means "focal point." It was originally a physics term for the focal point of a lens or mirror, and was adopted into Japanese linguistics as the standard translation for what English-language linguistics calls "focus." Both languages converged on the same optical metaphor for "the salient point of an utterance."⁴

Treating OJAD output as a focus-prosody guide

OJAD (Online Japanese Accent Dictionary) is a lexical-pitch-accent tool. Its rendered pitch contour and synthesized audio reflect word-level accent patterns and accentual-phrase boundaries, not sentence-level focus prosody. For ear-training, use connected speech with varying focus placement, not isolated word lists.²²

References

1. Lee, Albert, and Yi Xu. "Revisiting Focus Prosody in Japanese." In Proceedings of Speech Prosody 2012, Shanghai, pp. 274–277. https://doi.org/10.21437/SpeechProsody.2012-70
2. Sugahara, Mariko. "Downtrends and Post-Focus Intonation in Tokyo Japanese." Ph.D. dissertation, University of Massachusetts Amherst, 2003. https://scholarworks.umass.edu/dissertations/AAI3079597/
3. Pierrehumbert, Janet B., and Mary E. Beckman. Japanese Tone Structure (Linguistic Inquiry Monograph 15). MIT Press, 1988. https://mitpress.mit.edu/9780262660631/japanese-tone-structure/
4. Igarashi, Yosuke. "Intonation." In Haruo Kubozono (ed.), Handbook of Japanese Phonetics and Phonology (Handbooks of Japanese Language and Linguistics 2), De Gruyter Mouton, 2015. https://www.degruyter.com/document/doi/10.1515/9781614511984/html
5. Heycock, Caroline. "Japanese -wa, -ga, and Information Structure." In Shigeru Miyagawa and Mamoru Saito (eds.), The Oxford Handbook of Japanese Linguistics, Oxford University Press, 2008. Author pre-print: https://www.lel.ed.ac.uk/~heycock/papers/topic-draft.pdf
6. Kawahara, Shigeto. "The Phonology of Japanese Accent." In Haruo Kubozono (ed.), Handbook of Japanese Phonetics and Phonology, De Gruyter Mouton, 2015, ch. 11. Author pre-print: https://user.keio.ac.jp/~kawahara/pdf/HandbookAccentPublished.pdf
7. Vance, Timothy J. The Sounds of Japanese. Cambridge University Press, 2008. https://www.cambridge.org/9780521617543
8. Lee, Albert, Fei Chiu, and Yi Xu. "Focus Perception in Japanese: Effects of Lexical Accent and Focus Location." PLOS ONE 17(9): e0274176, 2022. https://doi.org/10.1371/journal.pone.0274176
9. Zhang, Yufan, Xi Chen, Si Chen, Yuan Meng, and Albert Lee. "Visual-Auditory Perception of Prosodic Focus in Japanese by Native and Non-Native Speakers." Frontiers in Human Neuroscience 17: 1237395, 2023. https://doi.org/10.3389/fnhum.2023.1237395
10. NHK放送文化研究所 (NHK Broadcasting Culture Research Institute), ed. NHK日本語発音アクセント新辞典. 日本放送出版協会, 2016. https://www.monokakido.jp/ja/dictionaries/nhkaccent2/index.html
11. Venditti, Jennifer J. "The JToBI Model of Japanese Intonation." In Sun-Ah Jun (ed.), Prosodic Typology: The Phonology of Intonation and Phrasing, Oxford University Press, 2005, ch. 7. Pre-print: http://www.cs.columbia.edu/~jjv/pubs/jtobi-webversion.doc
12. Maekawa, Kikuo, Hideaki Kikuchi, Yosuke Igarashi, and Jennifer Venditti. "X-JToBI: An Extended J-ToBI for Spontaneous Speech." In Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP / Interspeech), Denver, 2002. https://www.isca-archive.org/icslp_2002/
13. Ishihara, Shinichiro. "Focus Prosody in Tokyo Japanese Wh-Questions with Lexically Unaccented Wh-Phrases." In Proceedings of the 17th International Congress of Phonetic Sciences (ICPhS XVII), Hong Kong, 17–21 August 2011, pp. 950–953. https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2011/OnlineProceedings/RegularSession/Ishihara,%20Shinichiro/Ishihara,%20Shinichiro.pdf
14. Lee, Albert, and Yi Xu. "Conditional Realisation of Post-Focus Compression in Japanese." In Proceedings of the International Conference on Phonetics and Phonology (ICPP), NINJAL, 2015. http://www.homepages.ucl.ac.uk/~uclyyix/yispapers/Lee_Xu_SP2018.pdf
15. Kuno, Susumu. The Structure of the Japanese Language. MIT Press, 1973. https://mitpress.mit.edu/9780262110495/the-structure-of-the-japanese-language/
16. Oshima, David Y. "The Japanese Particle wa Most Often Does Not Mark a Topic." In Japanese/Korean Linguistics 28, CSLI Publications, Stanford University. https://web.stanford.edu/group/cslipublications/cslipublications/ja-ko-contents/JK28/poster06.pdf
17. Tokyo University of Foreign Studies (東京外国語大学), Language Modules, Japanese Pronunciation, Practical Edition, §1.9.1 イントネーション基礎. https://www.coelang.tufs.ac.jp/mt/ja/pmod/practical/01-09-01.php
18. Ishihara, Shinichiro. "The Intonation of Wh- and Yes/No-Questions in Tokyo Japanese." In Chungmin Lee, Ferenc Kiefer, and Manfred Krifka (eds.), Contrastiveness in Information Structure, Alternatives and Scalar Implicatures, Springer, 2017, pp. 339–415. https://doi.org/10.1007/978-3-319-10106-4_19
19. Hirayama, Hitomi. "Rising Declaratives in Japanese." In Japanese/Korean Linguistics 29, CSLI Publications, Stanford University. https://web.stanford.edu/group/cslipublications/cslipublications/site/JKONLINE/29/CH12.pdf
20. Tomioka, Satoshi. "Focus Without Pitch Boost: Focus Sensitivity in Japanese Why-Questions and Its Theoretical Implications." Journal of East Asian Linguistics 31(1): 73–98, 2022. https://doi.org/10.1007/s10831-022-09235-5
21. Imai, Sayoko, Albert Lee, and Yi Xu. "When Pitch Falls Short: Reinforcing Prosodic Boundaries to Signal Focus in Japanese." Languages 10, no. 9 (2025): article 242. https://www.mdpi.com/2226-471X/10/9/242
22. Suzuki, Yuka, OJAD Development Team. Online Japanese Accent Dictionary (OJAD). The University of Tokyo. https://www.gavo.t.u-tokyo.ac.jp/ojad/eng (limitation: tool documentation, not a primary linguistic source)
