Sentence-Level Prosody Practice in Japanese: Drilling Whole-Sentence Pitch Contours
Sentence-level prosody practice is the production drill that turns clean individual sounds and correct word pitch accent into the pitch contour of a whole utterance. That contour is the rise and fall that carries meaning, much as stress does in English.1 A learner can pronounce every mora correctly and still sound foreign once the words become a sentence. The melody of the sentence is a separate layer from the accent of each word.2
This article gives you a self-runnable routine for drilling whole-sentence contours. It builds on word-level pitch accent and intonation theory rather than re-teaching them. You will produce four contour targets: the declarative fall, the rising question, focus and emphasis widening, and the list or continuation rise.13
Overview: From Word Accent to Sentence Contour
Word-level pitch accent tells you the shape of a single word in isolation. Sentence prosody tells you the shape of the whole utterance those words sit inside.42 The two layers run at the same time, and a natural-sounding sentence needs both.
The four model sentences and their variants below are drill material, deliberately built from N5 vocabulary and grammar so that the only hard part is the prosody, not the words.5
What sentence prosody adds on top of word pitch accent
Japanese pitch is organized at two layers. The lexical layer is word pitch accent: each word carries a specified accent pattern, an accent nucleus (核) or none, realized as a fall from high to low after the accented mora.42
The sentence layer is intonation: phrasing, declination (the gradual lowering of pitch), downstep, focus scaling, and the pitch movements at phrase and utterance edges.1 Word accent supplies the local fall on each word; sentence intonation supplies the overall shape those local falls ride on.21
One interaction matters for production. Downstep is triggered by lexical accent. Each accented word lowers the top of the pitch range for what follows in the same phrase. This lowering is called catathesis, and it means the staircase you hear across a sentence is partly built out of the word-level accents themselves.26
The descending staircase across a statement is therefore not produced by a separate sentence-melody gesture alone. Each step is triggered by the H* of an accented word, and it appears only when the accented word and the material after it sit in the same major phrase.26 When you drill a contour, you are partly drilling the cumulative effect of the words' own accents.
Downstep is distinct from declination, the gradual global lowering of pitch across an utterance that happens regardless of where the accents fall.23 A natural statement does both at once: a slow global drift downward (declination) plus discrete step-downs at each accented word (downstep).
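To make the difference between the two kinds of lowering concrete, here is a minimal toy sketch in Python. It is not a phonetic model: the function name `top_line`, the linear drift, and every number (1 semitone per second of declination, a 2-semitone step per accent) are illustrative assumptions, and it ignores the phrase-boundary reset described above by treating the whole utterance as one major phrase.

```python
def top_line(times, accent_times, start_st=0.0, decl_st_per_s=1.0, step_st=2.0):
    """Toy ceiling of the pitch range, in semitones relative to the start.

    Declination is modeled as a steady linear drift downward.
    Downstep is modeled as a discrete drop once each accent has passed.
    All default values are made-up illustration values, not measurements.
    """
    ceiling = []
    for t in times:
        accents_passed = sum(1 for a in accent_times if a <= t)
        ceiling.append(start_st - decl_st_per_s * t - step_st * accents_passed)
    return ceiling


times = [0.0, 0.5, 1.0, 1.5, 2.0]

# Declination only (no accented words): a smooth slope.
print(top_line(times, accent_times=[]))          # [0.0, -0.5, -1.0, -1.5, -2.0]

# Declination plus downstep at two accents: a slope with two steps in it.
print(top_line(times, accent_times=[0.6, 1.4]))  # [0.0, -0.5, -3.0, -5.5, -6.0]
```

Comparing the two printed lines shows the point of the paragraph above: the slope is there with or without accents, while the steps appear only where an accented word has passed.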
The mechanics of this model belong to the pronunciation-category theory articles. This drill states the two-layer distinction and the one load-bearing interaction, then points onward.
The accent and intonation described here are standard Tokyo Japanese (共通語), the variety codified by the NHK accent dictionary and generated by the OJAD tool introduced below.78 Dialects such as Kansai and Kagoshima have different lexical accent and different focus prosody; this drill targets the standard.3
The four contours you will drill
Each production target maps onto a well-described intonational event.13
- Declarative fall. Declination across the utterance ending in a low, falling final boundary, with no rising movement at the end.1
- Rising question. A rising movement at the utterance edge, used for yes/no questions whether or not か is present.41
- Focus and emphasis widening. Local expansion of the pitch range on the focused phrase, with compression of the range after it.3
- List and continuation rise. A non-final rise that signals "more is coming," distinct from the final fall, at clause-internal boundaries.1
These four become the subsections of the drill section below. Each is a separately labelable event in the standard intonation frameworks, so the four-way split is sound. Each target also has its own phonetic signature to listen for.13
Where this fits in the pronunciation-priority order
Prosody is a sentence-level skill that depends on usable segments and usable mora timing. Secure your individual sounds and your rhythm before drilling whole-sentence melody, because an unstable rhythmic base makes the contour hard to control.4
The full priority ordering belongs to the speaking-category priority article. The line here is simply that sentence melody comes after sounds and rhythm are usable, not before.
How to Drill: The Sentence-Prosody Loop
The drill is a four-step loop: get a model contour, shadow it for melody, record and compare, then isolate whatever broke. The diagram below shows the loop and its repair branch.
Step 1: Get a model contour (native audio or OJAD/Suzuki-kun)
A model contour comes from one of two sources: native model audio, or a tool that draws the pitch line for you.
OJAD's Suzuki-kun, the "Prosodic Reading Tutor of Japanese," takes any Japanese text, converts it to its kana reading, and shows the natural standard Tokyo pitch pattern as a smooth pitch curve. It marks accent-nucleus positions and devoiced vowels, and it can also read the text aloud following that curve.8 Suzuki-kun was added to OJAD in 2014, is hosted by the University of Tokyo, and was built as an educational prosody tool. Reported experiments found the visualized pitch curve helped learners more than model utterances alone.8
The tool draws its curve from the sentence's actual accents. The line you imitate already encodes the word-level accents plus the downstep staircase and declination, combining the lexical and sentence layers into a single contour.82
Native model audio is the other model source. Slowing playback to around 0.75× lets you hear where the line rises and where it falls without the contour collapsing at speed.4 Use the slowed version to map the melody, then bring it back to full speed once you can hear the shape.
Suzuki-kun generates standard Tokyo prosody only. It models the codified variety, not any one speaker's idiolect (personal speech pattern) and not dialect prosody.87
Step 2: Shadow the whole sentence for contour, not words
Apply shadowing at the whole-sentence scale: reproduce the melody and timing of the model first, and treat the accuracy of every individual mora as secondary.1 Sentence meaning and naturalness ride on sentence-level intonation, not on the segments alone, so lock in the melody first.1
The shadowing technique itself belongs to the listening-category shadowing article. This step applies that technique to a full sentence; it does not re-teach it.
Step 3: Record and compare the whole sentence
Record yourself producing the sentence, then compare it with the model. Listen for places where your contour flattened, or where it rose when it should have fallen.
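If you have pitch tracks for both recordings, the comparison can be roughed out in code. The sketch below is an assumption-heavy illustration, not part of the drill itself: it assumes you already have voiced f0 values in Hz from some pitch tracker, it normalizes each speaker to their own median so different voice ranges are comparable, and the names `to_semitones`, `resample`, `compare` and the thresholds are all hypothetical choices.

```python
import math
from statistics import median


def to_semitones(f0_hz):
    """Convert positive f0 values (Hz) to semitones relative to the speaker's median."""
    ref = median(f0_hz)
    return [12 * math.log2(f / ref) for f in f0_hz]


def resample(values, n):
    """Linearly interpolate a contour onto n (>= 2) evenly spaced points."""
    out = []
    for i in range(n):
        pos = i * (len(values) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def compare(model_hz, learner_hz, n=5, min_move_st=1.0):
    """Label each of the n-1 stretches as 'ok', 'flattened', or 'wrong direction'."""
    m = resample(to_semitones(model_hz), n)
    l = resample(to_semitones(learner_hz), n)
    report = []
    for i in range(n - 1):
        dm = m[i + 1] - m[i]  # model movement in this stretch
        dl = l[i + 1] - l[i]  # learner movement in this stretch
        if abs(dm) < min_move_st:
            report.append("ok")               # model barely moves here
        elif dm * dl < 0:
            report.append("wrong direction")  # rose where the model fell, or vice versa
        elif abs(dl) < 0.5 * abs(dm):
            report.append("flattened")        # moved less than half as far as the model
        else:
            report.append("ok")
    return report


model = [200, 180, 160, 140, 120]                  # steadily falling statement
print(compare(model, [150, 150, 150, 150, 150]))   # level learner contour
print(compare(model, [200, 180, 160, 140, 170]))   # learner rises at the end
```

The time normalization here is deliberately crude (evenly spaced points over the whole utterance), so treat the labels as a prompt for where to listen again, not as a verdict.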
This is a procedural step, not a linguistic claim. The record-listen-compare loop belongs to the dedicated record-and-compare drill; here it is applied at sentence scale. Why recording is non-negotiable is covered in the Good to know section below.
Step 4: Isolate the contour that broke
When a whole sentence comes out wrong, do not just repeat it and hope. Identify which of the four contours failed. Re-drill that one target on its own, then return it to the full sentence.
This is a diagnostic move. The four contours it sends you back to are each drilled separately in the next section.
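The diagnose-then-repair move can be written down as a tiny planning function. This is only a sketch of the bookkeeping: the judgments themselves still come from your ear or your recording, and the name `plan_repair` and the wording of the steps are hypothetical.

```python
CONTOURS = (
    "declarative fall",
    "rising question",
    "focus widening",
    "continuation rise",
)


def plan_repair(judgments):
    """Turn per-contour judgments into an ordered list of next drill steps.

    judgments maps a contour name to True (sounded right) or False (broke).
    Contours missing from the mapping are treated as not present in the sentence.
    """
    unknown = set(judgments) - set(CONTOURS)
    if unknown:
        raise ValueError(f"unknown contour(s): {sorted(unknown)}")
    broken = [c for c in CONTOURS if judgments.get(c) is False]
    steps = [f"re-drill in isolation: {c}" for c in broken]
    if broken:
        steps.append("return to the full sentence and re-record")
    return steps


# A two-clause sentence where the final fall was fine but the first clause fell too.
print(plan_repair({"declarative fall": True, "continuation rise": False}))
```

An empty list means nothing broke, so there is nothing to isolate and the sentence can simply be repeated.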
The Four Contours, Drilled
Each contour below has its own sample sentences and its own listening target. The sentences are constructed drill material, built from cited N5 words, not native quotes.5 The four-contour comparison is summarized here, then drilled one target at a time.
| Contour | Trigger | Pitch shape | Sample |
|---|---|---|---|
| Declarative fall | Statement, finished utterance | Declination across the sentence, low fall at the end | 今日は学校に行きます。 |
| Rising question | Yes/no question, with or without か | Rising movement at the utterance edge, scooped on the final vowel | 学校に行きますか。 |
| Focus and emphasis | Contrast or new information on a phrase | Wider, higher peak on the focused word, compressed tail | 私がコーヒーを飲みます。 |
| List and continuation | Clause that is not the end of the utterance | Non-final rise or held high, "more is coming" | 朝ごはんを食べて、学校に行きます。 |
Declarative fall: the default statement contour
A neutral Japanese statement shows declination: a gradual lowering of pitch across the whole utterance. It also ends in a low final boundary with a final fall.1 Within that global slope, each accented word steps the top line down. The statement is a descending staircase, not a flat line that drops only at the very end.26
今日は学校に行きます。5
"I'm going to school today."
The contour target here is declination plus a clear final fall on 〜ます, with no rise at the end.1
田中さんは先生です。5
"Mr. Tanaka is a teacher."
This copular statement ends on です, which lets you hear the final fall cleanly because the tail is the unaccented-then-falling です.1
毎日コーヒーを飲みます。5
"I drink coffee every day."
A longer statement lets you feel the declination develop across more morae before the final fall.12
A learner who keeps pitch flat and level across the sentence, with no declination and no final fall, sounds unnatural because the expected falling envelope is missing. The absent final fall also removes the cue that the utterance is a finished statement.41
All three are polite-form (です/ます) statements, the J-Compass default, built from high-frequency everyday vocabulary.9 Their plain-form variants are drilled under "Polite vs. casual contours" below.
Rising question: 〜か and the bare rising 〜の / no-particle question
A yes/no question is marked by a rising movement at the utterance edge.1 With the question particle か, the rise is carried on or after か. In casual speech a question can be formed by rising intonation alone on a plain-form sentence with no か, and the rising-の question is a further casual option.41
The question rise has a recognizable acoustic signature: it tends to be scooped (concave), reaches a higher peak than other rises such as prominence or continuation rises, and begins well within the final vowel.1 That scooped, high rise is the cue you are trying to produce.
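One rough way to check that signature on a recording is sketched below. It assumes you have a handful of f0 samples (in Hz) taken across the final vowel from some pitch tracker. The function name `terminal_movement`, the 1-semitone threshold, and the definition of "scooped" used here (the midpoint sits below the straight line from the first to the last sample) are all assumptions made for illustration, not a standard measure.

```python
import math


def terminal_movement(final_f0_hz, min_move_st=1.0):
    """Classify the pitch movement over the final vowel.

    Returns (direction, change_in_semitones, scooped).
    Needs at least two positive f0 samples.
    """
    start = final_f0_hz[0]
    st = [12 * math.log2(f / start) for f in final_f0_hz]
    change = st[-1]

    if change >= min_move_st:
        direction = "rise"
    elif change <= -min_move_st:
        direction = "fall"
    else:
        direction = "level"

    # Compare the middle sample with the straight line joining start and end.
    mid_index = len(st) // 2
    chord_at_mid = change * mid_index / (len(st) - 1)
    scooped = direction == "rise" and st[mid_index] < chord_at_mid

    return direction, change, scooped


# Stays low, then climbs late: a rise that this sketch counts as scooped.
print(terminal_movement([150, 150, 155, 175, 200]))

# Climbs early, then levels off: a rise, but not scooped by this definition.
print(terminal_movement([150, 185, 195, 198, 200]))
```

The same function also reports "fall" or "level", so it can double as a quick check that a statement did not pick up an unintended rise.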
学校に行きますか。5
"Are you going to school?"
This is the declarative model plus か. Drill the rising movement on or after か, and keep it modest in this polite register.14
田中さんは先生ですか。5
"Is Mr. Tanaka a teacher?"
Pair this with the declarative version above for back-to-back drilling: the same words, fall versus rise.1
コーヒー、飲む?5
"Coffee? / Want some coffee?"
This casual plain-form question carries no か at all. Questionhood rides on a large, concave terminal rise on 飲む alone.41
In formal and polite speech the か-question rise can be relatively small. A polite 〜ますか needs only a modest terminal rise to cue a question, not the large rise English speakers tend to default to.4 The rise must be present to signal the question, but it does not have to be steep.
The bare rising question with no か is casual; in polite or written contexts か is expected. The の-question carries its own nuance. That detail is covered in the pronunciation-category rising-question article, which is cross-linked rather than expanded here.4
Focus and emphasis: widening the pitch on the focused phrase
Focus, whether contrastive or new-information emphasis, is produced by expanding the pitch range on the focused phrase, chiefly by raising its accent peak. The pitch range of everything after it is compressed.3 The focused element is relatively higher and wider while the post-focal material is lower, so the listener hears one prominent peak rather than an even contour.3
The cue is not loudness. It is a wider, higher pitch movement on the focused phrase followed by a flatter, lower tail.3
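The widen-then-compress pattern can be checked against a neutral reading of the same sentence. The sketch below assumes you have already split each recording into the focused phrase and the tail and have f0 samples in Hz for each part; the segmentation, the function names `range_st` and `focus_check`, and the example numbers are all hypothetical.

```python
import math


def range_st(f0_hz):
    """Pitch range of a stretch of speech in semitones (highest vs. lowest sample)."""
    return 12 * math.log2(max(f0_hz) / min(f0_hz))


def focus_check(focused_hz, tail_hz, neutral_focused_hz, neutral_tail_hz):
    """Compare a focused reading with a neutral reading of the same sentence."""
    return {
        # The focused phrase should span a wider range than in the neutral reading.
        "focus_expanded": range_st(focused_hz) > range_st(neutral_focused_hz),
        # The material after the focus should span a narrower range than before.
        "tail_compressed": range_st(tail_hz) < range_st(neutral_tail_hz),
    }


# Made-up illustration values for コーヒーを飲みます with focus on コーヒー.
neutral_focus, neutral_tail = [180, 220, 190], [185, 200, 150]
focused_focus, focused_tail = [180, 260, 170], [165, 170, 150]

print(focus_check(focused_focus, focused_tail, neutral_focus, neutral_tail))
# {'focus_expanded': True, 'tail_compressed': True}
```

Measuring range in semitones rather than loudness mirrors the point above: the check looks only at how far the pitch moves, not at how loud the phrase is.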
私がコーヒーを飲みます。5
"I'm the one who'll drink the coffee."
Focus falls on the が-marked subject 私. Widen the pitch on 私が and compress the post-focal 〜飲みます.3
コーヒーを飲みます。5
"It's coffee I'll drink (not tea)."
Here the contrast lands on the object コーヒー: a range-expanded peak on コーヒー, then post-focal compression of 〜を飲みます.3
The widen-then-compress cue described here applies most cleanly to accented words such as コーヒー. An unaccented focused word is marked differently, for example by lowering the following minimum, so the effect interacts with the word's own lexical accent.3 Note that 私 (わたし) is itself unaccented in standard Tokyo Japanese, so of the two drill sentences, コーヒー is the cleaner model of the cue. Do not generalize the post-focal-compression cue to unaccented words. That complication is covered in the pronunciation-category focus-prosody article.
The focus contour is independent of politeness and rides on either です/ます or plain form.3
Lists and continuation: the rise-and-hold before て / が / けど
A clause that is not the end of the utterance takes a non-final continuation rise rather than a final fall. This signals that more is coming. The continuation rise sits at a phrase-internal boundary and is distinct in function from the question rise. It commonly appears at list-item boundaries and before continuative connectives such as the て-form, が, and けど.1
朝ごはんを食べて、学校に行きます。5
"I eat breakfast and then go to school."
The first clause does not fall: produce a continuation rise or a held, un-fallen boundary on 〜食べて before the final fall on 〜行きます.1
コーヒーを飲みますが、お茶は飲みません。5
"I drink coffee, but I don't drink tea."
Sustain a non-fall on 〜ますが before the final falling 〜ません. This sentence also sets up a contrastive は, which connects back to the focus contour.13
Producing a final fall where a continuation rise belongs makes a long sentence sound like a string of separate finished statements. The continuation rise is what binds the clauses into one utterance.1
The continuation rise appears in both polite and casual speech and is highly frequent in connected discourse.9 The full inventory of boundary pitch movements is covered in the pronunciation-category sentence-intonation article.1
Nuance and usage contexts
Sentence-final particles change the tail contour
Sentence-final particles carry their own short contours at the very end of the utterance. ね typically takes a rise when it seeks agreement or confirmation, and can take a fall when it expresses shared feeling. よ typically asserts new information and is often produced with a fall or a sharper movement marking emphasis or notification.41
Treat these as typical defaults, not rules. The exact melody varies with the speaker's stance. The mapping between these particles' meanings and their intonation is an active research area, so the drill teaches one typical contour for each.1
今日は寒いですね。5
"It's cold today, isn't it?"
The target is the confirmation-seeking rising ね on the tail.41
この本は面白いですよ。5
"This book is interesting, you know."
The target is the asserting, notifying よ on the tail, typically a fall or an emphatic movement.41
Because the particle sits at the very edge, its contour is the last thing the listener hears and strongly colors the utterance's stance. A falling ね where a rising confirmation-seeking ね is meant changes the pragmatic reading.4 That is why the tail is worth drilling on its own.
ね and よ are extremely high-frequency in conversation and work in both polite and casual registers. The fine-grained particle-contour inventory, including よね, is covered in the pronunciation-category sentence-intonation article.1
Polite vs. casual contours
です/ます statements and plain-form statements carry slightly different default tails and different question strategies. Polite statements end on the falling 〜ます or です, and questions use か with a modest rise.4 Casual plain-form statements end on the bare verb, adjective, or copula. Yes/no questions are very commonly formed by rising intonation alone with no か, or with rising の.41
学校に行きます。/ 学校に行く。5
"I'm going to school." (polite / casual)
Both take a final fall, but the casual plain form ends on the bare verb 行く, which is unaccented in standard Tokyo Japanese, rather than on 〜ます. Pair them to feel the same declarative fall land on different tails.14
行きますか。/ 行く?5
"Are you going?" (polite / casual)
Here the contrast is sharp: a polite modest rise on か versus a large casual bare rise on 行く with no particle.41
The two registers differ prosodically, not just lexically. The same content gets a different terminal melody depending on register, which is why the drill pairs polite and casual versions of the same sentence.4 J-Compass keeps the polite default. The casual forms are drilled as the contrast.
Speed and connected speech blur the textbook contour
At natural speaking rate the idealized contour is compressed. Declination continues across the utterance, so later phrases sit lower. Downstep keeps stepping the top line down at each accent, and reductions and devoicing shorten the material. As a result, the clean textbook rises and falls become smaller and run together.23
The practical consequence is methodological: drill the contour slowly first, where each rise and fall is large and audible. Then re-drill at native tempo so you can produce the compressed-but-still-correct version. The connected-speech detail is covered in the pronunciation-category speech-rate article.
The degree of initial lowering and the size of declination and downstep vary with phrase structure and speech rate, so the same sentence can show a flatter contour in fast speech than in careful speech.102
Good to know
Pitfall: copying the words but flattening the melody
The most common failure is reproducing the segments correctly while losing the contour. A learner with clean sounds but a level, flat sentence still sounds non-native. Sentence meaning and naturalness ride on the intonation (declination, phrasing, the final boundary), not on correct segments alone.14
The fix is to produce the same sentence with declination across it and a clear final fall on the ending.
今日は学校に行きます。5
"I'm going to school today."
Shadow the tune, not just the segments.
Pitfall: turning every sentence into a question
Learners whose first language allows declarative "uptalk", as many varieties of English do, tend to transfer a terminal rise onto Japanese statements. In Japanese, a yes/no question is cued by a rising movement at the edge, and a statement falls. An unintended terminal rise can therefore mismark a statement as a question or as uncertainty.14
The correct version of the statement falls at the end; reserve the rise for when a question is actually intended.
学校に行きます。5
"I'm going to school."
Produce that with the declarative final fall, and add the rising か-question contour only when you mean to ask.
Mnemonic: hum the sentence first
Hum the contour of the sentence with no words before adding the words back. Humming isolates the melody from articulation. That lets you fix the rise-and-fall shape independently of getting every mora right, then re-attach the words.
This leans on the same separation Suzuki-kun makes visible by drawing the pitch line apart from the text.81
Why your own ear lies to you on whole sentences
Self-perception of your own second-language prosody is unreliable. The load of producing a whole sentence makes real-time self-monitoring worse than at single-word length. You can believe your contour fell when it actually stayed flat.
That is why recording is non-negotiable: it externalizes the contour so you judge it as a listener instead of as a producer. Treat this as a qualitative warning, not a measured figure, and route the full loop to the dedicated record-and-compare drill.
See also
- Japanese Sentence Intonation: Falls, Rises, ね, よ, よね
- Japanese Focus Prosody: Pitch Widening, Contrastive は, and Information Structure
- Japanese Questions Without か: The Rising-Intonation Question and the の Alternative
- Record-and-Compare: The Self-Correction Loop for Japanese Pronunciation
- Mora-Timing Drills for Japanese: Beating English Stress-Timing
- Japanese Pitch-Accent Minimal Pairs: The Drill List You Must Hear