What Is Shadowing? The Listening-and-Speaking Technique, Explained

Shadowing means tracking heard speech and reproducing it aloud as exactly as possible in real time. The audio keeps playing, and you do not look at a text.¹² For a Japanese learner, it is one of the few drills that trains listening and pronunciation at the same time rather than separately.

Overview

The defining feature of shadowing is simultaneity. You do not wait for the speaker to finish and then repeat from memory; you speak over the continuing audio, lagging only a beat behind, with no script in front of you.²¹

That single constraint separates it from the more familiar techniques it is often confused with. It also gives shadowing its particular cognitive workout. The sections below define the technique, distinguish it from nearby drills, explain the mechanism scholars propose for why it works, and summarize what the research can and cannot yet claim.

This article is the conceptual home for shadowing as a method. Related articles cover choosing material by level, the text-supported variant called overlapping, and the related dictation drill of transcription.

What Shadowing Is

Shadowing means reproducing speech aloud as it arrives, in real time, with the smallest practical lag and no text in view. The neuroimaging study by Takeuchi and colleagues defines it as a task in which the learner "tracks the heard speech and repeats it as exactly as possible while listening attentively to the incoming contextual information."¹

Hamada frames the same idea from the teaching side: learners repeat heard input with the smallest possible delay rather than waiting for the speaker to stop.² The audio never pauses for you. You stay close enough behind it that listening and speaking happen at once.

The lag is short by design. In the experimental psycholinguistics literature where the technique began, close shadowing occurs within roughly 250 milliseconds of the input. Reported minimum delays are around 150 milliseconds.³

Treat "one second behind" as a rule of thumb, not a constant

Language-learning guidance often says to shadow "about one second behind" the audio. That figure is a looser, learner-facing approximation of the near-simultaneity measured in the lab, not a precise target.³ The key point is that you stay close enough that the audio is still playing, not that the gap is exactly one second.
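
As a rough illustration of the lag figures above, the sketch below averages the delay between a model speaker's word onsets and a learner's onsets of the same words, then checks the one condition that matters: the learner spoke while the clip was still playing. Every name here (mean_lag_ms, spoke_over_audio, the onset lists) is hypothetical, and the timestamps are invented for the example. The 250 ms constant is the lab figure quoted above, not a learner target.

```python
from statistics import mean

# Lab figure quoted in this article for close shadowing; not a learner target.
CLOSE_SHADOWING_MS = 250


def mean_lag_ms(model_onsets_ms, learner_onsets_ms):
    """Average delay between each model word onset and the learner's onset
    of the same word. Both lists are hypothetical hand-marked timestamps."""
    if not model_onsets_ms or len(model_onsets_ms) != len(learner_onsets_ms):
        raise ValueError("need two equal-length, non-empty onset lists")
    return mean(l - m for m, l in zip(model_onsets_ms, learner_onsets_ms))


def spoke_over_audio(learner_onsets_ms, audio_end_ms):
    """True if every learner word started while the clip was still playing."""
    return all(t < audio_end_ms for t in learner_onsets_ms)


model = [0, 400, 900]          # model word onsets in ms (made-up values)
learner = [900, 1300, 1800]    # learner onsets of the same words in ms
lag = mean_lag_ms(model, learner)
print(lag, lag <= CLOSE_SHADOWING_MS, spoke_over_audio(learner, audio_end_ms=2500))
```

A learner in this made-up example lags well outside the lab range but still speaks over the continuing audio, which is the looser sense in which learner-facing guidance uses the term.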

A shadowing target is simply whatever the chosen audio says. There is no special sentence to memorize first. In practice, press play, wait until the speaker is about one beat ahead, then say exactly what they say while the audio keeps playing, staying that one beat behind. Copy their rhythm and intonation without looking at any transcript.

Where the technique came from

Shadowing did not start as a teaching method. It originated in the late 1950s as a psycholinguistics research tool. The Leningrad group of Chistovich and Kozhevnikov developed it to measure the time gap between perceiving and articulating speech. Researchers later used it in attention and word-recognition studies.³

Its adoption as a deliberate second-language teaching technique is more recent. That is the focus of the work by Kadota and Hamada that this article draws on.⁴⁵

How Shadowing Differs from Parroting and Overlapping

Two everyday drills sit right next to shadowing and are often mistaken for it. They differ on two clear axes: when you speak relative to the audio, and whether a text is in front of you.

The first axis is the time lag. Shadowing and "repetition" (listen, pause, then repeat, often called parroting) both reproduce heard input. In shadowing, however, the gap between hearing and speaking is shorter, and the audio does not stop.² In parroting you wait for a pause or for the clip to finish and then speak into silence. In shadowing you speak over the continuing audio, so listening and speaking happen together rather than in turns.²⁵

That gap matters for teaching. Because parroting inserts a pause, it gives you time to mentally rehearse, translate, or reconstruct the utterance. Shadowing's near-simultaneity is meant to outrun that pause and force direct processing from ear to mouth.⁴

The second axis is whether text is present. Shadowing is done from audio alone, with no script. Reading a script aloud in time with the audio is a different task, commonly called overlapping, synchronized reading, or simply reading aloud.¹

Shadowing has no text in front of you

The moment a transcript is on the screen, you are reading, not shadowing. Takeuchi and colleagues separate the two at the level of processing: reading aloud "requires automatic visual phonetic coding and sentence recognition," whereas shadowing "requires automatic recognition of auditory speech."¹ In other words, the presence or absence of the written word changes which cognitive system the task trains.

Their neuroimaging data support this split at both the behavioral and the neural level. Both tasks improved working-memory performance, but shadowing training produced changes centered on the left cerebellum, part of the articulatory-rehearsal system, while reading-aloud training produced changes in right perisylvian regions. Same family, different neural substrate.¹

The table below summarizes the three drills. The full three-way comparison belongs to the related article on overlapping. Here, the point is only the two axes that define shadowing.

Drill	Text in front of you?	Timing relative to audio	What it primarily trains
Shadowing	No	Real time, a beat behind, audio keeps playing	Auditory recognition; perception coupled to production¹
Parroting / repetition	No	Listen, pause, then repeat into silence	Delayed recall and reconstruction²
Overlapping / reading aloud	Yes	Real time, but reading the script	Visual phonetic coding and sentence recognition¹
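
The two axes in the table can be read as a small decision rule. The sketch below is only an illustration of that rule, not a tool from any of the cited studies. The Drill class, its field names, and the classify function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drill:
    text_visible: bool         # is a script in front of the learner?
    audio_keeps_playing: bool  # does the learner speak over continuing audio?


def classify(drill):
    """Map the two axes from the table onto a drill name."""
    if drill.audio_keeps_playing and not drill.text_visible:
        return "shadowing"
    if drill.audio_keeps_playing and drill.text_visible:
        return "overlapping / reading aloud"
    if not drill.audio_keeps_playing and not drill.text_visible:
        return "parroting / repetition"
    # Script visible, speaking after a pause: the table does not cover this case.
    return "not covered by the table"


print(classify(Drill(text_visible=False, audio_keeps_playing=True)))
print(classify(Drill(text_visible=True, audio_keeps_playing=True)))
print(classify(Drill(text_visible=False, audio_keeps_playing=False)))
```

Putting a transcript on screen flips only the first field, which is enough to move a session from shadowing to overlapping.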

Why It Works: The Mechanism

The strongest part of the case for shadowing is not a single number. It is a coherent account of what the brain is doing during a repetition. Kadota, the principal scholarly authority on shadowing as a learning practice, models it as producing four overlapping effects.⁴

The input effect develops listening comprehension by forcing close attention to the incoming speech stream. The practice effect strengthens subvocal rehearsal, or silent inner repetition, in phonological working memory. This supports learning new words and set phrases. The output effect simulates stages of speech production because the learner is actually articulating rather than only listening. The monitoring effect develops metacognitive control through executive working memory, as the learner compares their own output to the model in real time.⁴

Underlying all four is one claim. Because you must reproduce speech faster than you can consciously parse and translate it into your first language, shadowing pushes processing toward the phonological loop. This is the auditory-rehearsal component of working memory. It avoids a slower route that sends everything through your native language.⁴

Listening: Training the Ear

The clearest empirical support is for shadowing improving bottom-up listening: parsing the speech stream by perception, such as recognizing phonemes and segmenting connected speech, rather than using top-down comprehension strategies.²⁶

In Hamada's controlled study of 43 Japanese university learners of English across nine lessons, phoneme perception improved in both proficiency groups.² This is the most replicated result and exactly the low-level perceptual skill the technique is theorized to train. Hamada's book-length treatment is framed explicitly around "developing learners' bottom-up skills." This reinforces that the listening gain shadowing best supports is perceptual processing, not inference or vocabulary knowledge.⁶

Speaking and Pronunciation: Training the Mouth

Because shadowing requires articulation, it also trains production. That is why it sits inside a broader Japanese pronunciation drill protocol. The systematic-review evidence is strongest for suprasegmental features (prosody, rhythm, and fluency) and for global judgments such as comprehensibility, intelligibility, and accentedness, rather than for the accuracy of individual sounds.⁷

The 2025 systematic review of 44 studies concludes that shadowing training "can help improve learners' comprehensibility, intelligibility, and accentedness, as well as certain aspects of suprasegmental pronunciation control, such as fluency and prosody," while findings on segmental pronunciation were "inconclusive."⁷

This points to a plausible fit for Japanese. The learner copies the model's pitch contour and timing as a whole rather than rebuilding it sound by sound. For that reason, shadowing is well suited in principle to mora-timing and pitch-accent, which are properties of the rhythm and melody of an utterance rather than of isolated phonemes. This is a reasoned extension of the suprasegmental evidence, not a result measured for Japanese specifically. No controlled study has yet isolated shadowing's effect on Japanese mora-timing or pitch-accent.⁷

Why Both at Once Beats Either Alone

The simultaneity is the point. In a single repetition, the input effect of close listening, the output effect of articulation, and real-time monitoring all occur together. You are perceiving and producing the same stretch of speech in the same instant.⁴ That couples perception to production instead of training them on separate days.

This is also why one exercise can yield gains across skills. The working-memory study found that shadowing training improved working-memory performance and engaged the articulatory-rehearsal system. That finding is consistent with a single drill loading both the perceptual and the production sides of the phonological loop.¹

What the Research Says

The honest summary is that the mechanism is well argued, while the effect sizes are study-specific. The cognitive account of the phonological loop, subvocal rehearsal, and perception-production coupling is coherent, and both Kadota's framework and the neuroimaging evidence support it.⁴¹ The size of measured learning gains, however, varies by study, proficiency level, outcome measure, and dosage.

No single percentage captures how much shadowing helps

There is no generalizable headline figure for shadowing's benefit, and this article reports none by design. Any quantitative claim you see should be tied to a specific study and limited to that study's learners and measure. It should not be lifted out as a universal number.²⁷

The firmest results cluster in two places: bottom-up listening and perception, and suprasegmental pronunciation. Phoneme perception improved across proficiency levels in Hamada's study. The 2025 review locates the strongest pronunciation gains in prosody, fluency, comprehensibility, and intelligibility.²⁷

The benefit may also depend on proficiency. In Hamada's study, only lower-proficiency learners showed gains on the listening-comprehension test. Intermediate learners did not, even though both groups improved on phoneme perception.² Shadowing's comprehension payoff is not uniform across levels.

One limitation should temper how far any of this is carried over to Japanese. Much of the foundational research tests Japanese learners studying English, not learners of Japanese. Applying these results to Japanese as the target language is therefore a reasonable inference rather than a demonstrated finding. The 2025 review also reports its findings through narrative synthesis rather than as a pooled effect size, and it flags methodological limits including over-reliance on controlled speaking tasks.⁷

Murphey's contribution sits apart from the efficacy studies. He documented "conversational shadowing" (complete, selective, and interactive) as both a naturally occurring interactional phenomenon and a classroom activity. He argues that shadowing sits between deliberate and automatized language use. This is a descriptive, exploratory framing, not a controlled gains trial.⁸

Good to know

Shadowing is a speaking technique too, and this is its home

Shadowing trains the mouth as much as the ear, so it could plausibly live under pronunciation or speaking as easily as under listening. This article is its canonical home. The perception-production coupling it relies on means the listening and speaking gains are not separable into two different drills. Treating it as primarily a listening practice reflects where its strongest evidence sits.²⁷

Shadow audio slightly below your comprehension level

Material should be easy enough that you can actually track it in real time. If the audio is above your level, you fall off the stream and end up producing noise rather than shadowing meaningful speech. This aligns with the finding that shadowing's listening payoff was concentrated in lower-proficiency learners working with level-appropriate material.² The "slightly below" heuristic is a teaching recommendation. Its support is that proficiency dependence, not a study that directly tested difficulty calibration.

Shadowing JLPT audio trains JLPT ears, not native ears

Shadowing trains the ear to whatever input it is fed, improving bottom-up perception of the specific speech it practices.²⁶ What you feed it can be calibrated by JLPT listening level. Practicing only slow, over-articulated test-style audio builds perception of slow, over-articulated speech, not of native-rate connected speech. For that reason, native-rate material belongs in the rotation as well. There is no single study pitting JLPT audio against native audio for shadowing. This is a reasoned implication of the mechanism, not a cited finding.

It feels uncomfortable, and that is the point

The short, near-simultaneous lag is designed so that you cannot stop to translate or reconstruct each word. The discomfort of speaking before you have fully understood is the mechanism. It forces direct processing from ear to mouth rather than the slower route through your first language.⁴² Pausing to understand each word converts shadowing into delayed repetition, the less demanding and less targeted exercise.²

References

1. Takeuchi, Hikaru, et al. "Effects of Training of Shadowing and Reading Aloud of Second Language on Working Memory and Neural Systems." Brain Imaging and Behavior, vol. 15, no. 3, 2021, pp. 1253–1269. https://doi.org/10.1007/s11682-020-00324-4
2. Hamada, Yo. "Shadowing: Who Benefits and How? Uncovering a Booming EFL Teaching Technique for Listening Comprehension." Language Teaching Research, vol. 20, no. 1, 2016, pp. 35–52. https://journals.sagepub.com/doi/10.1177/1362168815597504
3. "Speech Shadowing." Wikipedia. https://en.wikipedia.org/wiki/Speech_shadowing
4. Kadota, Shuhei. Shadowing as a Practice in Second Language Acquisition: Connecting Inputs and Outputs. Routledge (Routledge Research in Language Education), 2019. ISBN 9781138485501. https://www.routledge.com/Shadowing-as-a-Practice-in-Second-Language-Acquisition-Connecting-Inputs/Kadota/p/book/9781032092836
5. Hamada, Yo. "Shadowing: What is It? How to Use It. Where Will It Go?" RELC Journal, vol. 50, no. 3, 2019, pp. 386–393. https://journals.sagepub.com/doi/10.1177/0033688218771380
6. Hamada, Yo. Teaching EFL Learners Shadowing for Listening: Developing Learners' Bottom-up Skills. Routledge (Routledge Research in Language Education), 2016. ISBN 9781138935983. https://www.routledge.com/p/book/9780815360902
7. Whitworth, Benen, et al. "A Systematic Review of Research on the Use of Shadowing for Second Language Pronunciation Teaching." Research in Language Education (Taylor & Francis), 2025. https://www.tandfonline.com/doi/full/10.1080/29984475.2025.2546827 / Oxford ORA copy: https://ora.ox.ac.uk/objects/uuid:3104cae1-3a6b-400a-b173-de44384238d2
8. Murphey, Tim. "Exploring Conversational Shadowing." Language Teaching Research, vol. 5, no. 2, 2001, pp. 128–155. https://journals.sagepub.com/doi/abs/10.1177/136216880100500203
