Spaced Repetition and the Forgetting Curve: Why Reviewing on a Schedule Works
Spaced repetition is a review technique: you study the same item again at increasing time intervals. Each review is timed to land near the moment the item is about to be forgotten but is still recoverable.1 It works because memory decays predictably along the forgetting curve. Reviewing just before you forget resets that decay instead of letting it run to zero.2
This article follows one causal chain: the forgetting curve, the spacing effect, why expanding intervals work, and the Leitner-box-to-algorithm lineage. For a Japanese learner facing thousands of kanji and vocabulary items, that chain is practical, not just academic. Unscheduled review of so many low-frequency items is simply unworkable.
Overview: What Spaced Repetition Is and Where It Came From
Spaced repetition uses the spacing effect: learning episodes spread over time produce better long-term retention than the same episodes massed together.1 The technique is old in concept and continuously refined in implementation.
The lineage runs as one chain. The forgetting curve (Ebbinghaus, 1885) established that memory decays predictably. The spacing effect (Ebbinghaus 1885, later quantified across the literature by Cepeda et al. 2006) established that distributing review beats massing it. Expanding intervals put this into practice, and the Leitner box (1972) plus later algorithmic systems are successive implementations of the same idea.3145
The forgetting curve: Ebbinghaus and how memory decays
Hermann Ebbinghaus first described the forgetting curve in Über das Gedächtnis (On Memory), published in 1885. The work appeared in English in 1913 as Memory: A Contribution to Experimental Psychology, translated by Henry Ruger and Clara Bussenius.32 He is cited as the founder of the experimental study of memory.2
His method was austere. Ebbinghaus served as his own sole subject, learning and relearning lists of nonsense syllables. These were consonant-vowel-consonant strings built to avoid pre-existing associations. He measured retention with the "savings method": how much less effort relearning required after a delay.2
The shape that emerged is the durable result. Retention drops steeply soon after learning, then the rate of loss slows and the curve flattens, so most of the loss happens early. This is the qualitative shape that motivates reviewing soon after first study.32
The steep early drop is the whole reason the first review needs to come quickly. The loss is asymmetric: a delay right after learning costs far more retention than a delay of the same length later on.
The specific percentages often reproduced from Ebbinghaus (for example "X% gone after one day") describe one person learning meaningless syllables under specific conditions. They illustrate the decay shape; they are not a universal constant. Real decay depends on the material's meaningfulness, prior knowledge, interference, and the learner. Treat the curve's shape as durable, but not the exact numbers.2
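The asymmetry between early and late loss can be made concrete with a toy model. The sketch below assumes simple exponential decay, a common simplification and not Ebbinghaus's own fitted formula. The function names and the stability value of 2.0 days are illustrative assumptions, chosen only to show the shape.

```python
import math


def retention(days_elapsed: float, stability_days: float = 2.0) -> float:
    """Toy forgetting curve: fraction retained after `days_elapsed`.

    Assumes exponential decay. `stability_days` is a made-up parameter
    (the time for retention to fall to about 37%), not a measured value.
    """
    return math.exp(-days_elapsed / stability_days)


def loss_between(start_day: float, end_day: float,
                 stability_days: float = 2.0) -> float:
    """Retention lost between two points in time under the toy model."""
    return retention(start_day, stability_days) - retention(end_day, stability_days)


if __name__ == "__main__":
    for day in (0, 1, 2, 4, 8):
        print(f"day {day}: {retention(day):.2f} retained")
    # The same one-day wait costs very different amounts early and late.
    print(f"lost during day 0 to 1: {loss_between(0, 1):.2f}")
    print(f"lost during day 7 to 8: {loss_between(7, 8):.2f}")
```

Under these assumptions the first day accounts for much more loss than the eighth. Changing `stability_days` changes every number but not that ordering, which is the sense in which the shape is durable and the percentages are not.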
The spacing effect: distributed beats massed
The spacing effect is the finding that, for a fixed amount of total study, spreading the study across separated sessions gives better long-term retention than concentrating it in one session. That concentrated approach is cramming, or "massed practice."1
Ebbinghaus demonstrated it himself. For relearning a series of syllables, he reported that "38 repetitions, distributed in a certain way over the three preceding days, had just as favorable an effect as 68 repetitions made on the day just previous."2 He concluded that, "with any considerable number of repetitions a suitable distribution of them over a space of time is decidedly more advantageous than the massing of them at a single time."2
That 1885 result is the founding quantitative finding for spacing. The modern synthesis confirms it at scale.
Cepeda, Pashler, Vul, Wixted, and Rohrer (2006) meta-analyzed the distributed-practice literature, drawing on 839 assessments across 317 experiments in 184 articles.1 They found a robust spacing benefit. They also found that the inter-study interval producing maximal retention increases as the retention interval (the delay until the test) increases.1 In short, the longer you need to remember something, the wider the optimal gaps between reviews.
One distinction holds the rest of the article together. The spacing effect itself is one of the most replicated findings in memory research, backed by well over a century of data from Ebbinghaus onward.21 The precise intervals an app schedules are a different kind of claim: engineering choices built on that finding, not findings in their own right.
Why Expanding Intervals Work
Every spaced-repetition system lengthens the gap after a successful recall and shortens it after a failure. The reason is simple: a memory just refreshed is temporarily highly accessible, so reviewing it again immediately wastes effort.16
Waiting until accessibility has partly decayed makes the next successful retrieval do more work. The schedule therefore widens after each success and resets toward shorter gaps after a lapse.
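The widen-on-success, reset-on-lapse rule can be sketched in a few lines. This is a minimal illustration, not any particular app's scheduler. The doubling factor and the one-day starting gap are assumptions picked for readability, and `next_interval` and `schedule` are hypothetical names.

```python
def next_interval(current_days: float, recalled: bool,
                  growth: float = 2.0, first_days: float = 1.0) -> float:
    """Return the gap before the next review.

    Widen the gap after a successful recall; fall back to the shortest
    gap after a lapse. `growth` and `first_days` are illustrative
    assumptions, not research-backed constants.
    """
    if recalled:
        return current_days * growth
    return first_days


def schedule(outcomes: list[bool], first_days: float = 1.0) -> list[float]:
    """Gaps produced by a sequence of review outcomes, starting at `first_days`."""
    gaps = []
    interval = first_days
    for recalled in outcomes:
        interval = next_interval(interval, recalled, first_days=first_days)
        gaps.append(interval)
    return gaps


if __name__ == "__main__":
    # Three successes, one lapse, then two more successes.
    print(schedule([True, True, True, False, True, True]))
    # prints [2.0, 4.0, 8.0, 1.0, 2.0, 4.0]
```

The gaps grow while recall keeps succeeding and collapse to the starting gap after the lapse, then begin widening again.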
Reviewing at the edge of forgetting
Robert Bjork (1994) coined "desirable difficulties" for conditions that slow acquisition but improve long-term retention and transfer, spacing and retrieval practice among them.7 The difficulty is "desirable" because the extra effort at retrieval is what strengthens the memory.7
A formal model explains why effort helps. Bjork and Bjork's New Theory of Disuse gives each memory item two strengths. Storage strength reflects how deeply the item is learned and effectively only grows. Retrieval strength reflects how accessible the item is right now and fluctuates with recency and cues.6
The theory predicts that the gain in storage strength from a successful retrieval is larger when retrieval strength has dropped. In other words, recall helps more when it is effortful but still succeeds.6 That is the formal reason to review near the edge of forgetting rather than while the item is still easy.
Retrieval-practice and spacing research supports the direction: longer gaps after success, shorter gaps after a lapse. It also supports the key variables: how long you need to retain the item, and how hard recall is at that moment. It does not support any single promised number, so name the variables, not a fixed schedule.16
Active recall, not re-reading
Roediger and Karpicke (2006) showed that taking a test on material, meaning retrieving it from memory, produces better long-term retention than restudying the same material for the same time.8 The act of retrieval, not re-exposure, is what consolidates the memory.8
The effect holds even when restudy feels more productive. Karpicke and Roediger (2008) had students learn foreign-language vocabulary pairs. Once an item was recalled correctly, it was either kept in the study rotation, kept in the test rotation, or dropped.9 Repeated studying after the first success did essentially nothing for delayed recall. Repeated testing, however, produced a large benefit.9
In the same study, students' own predictions of later performance did not correlate with how they actually did. The method that worked felt worse while doing it.9 If you judge a study technique by how fluent it feels in the moment, you will tend to pick the weaker one.
This is why a spaced-repetition card forces you to produce the answer before revealing it. A flashcard you flip and try to recall is a retrieval trial; a list you reread is not.98
From the Leitner Box to the Algorithm
Sebastian Leitner's box system (1972)
Sebastian Leitner described the Lernkartei (learning card file) box system in his 1972 book So lernt man lernen (How to Learn to Learn). He was among the first to advocate systematic flashcard learning built on the spacing effect.4
The mechanism is mechanical and physical. Cards move through a series of boxes, commonly five. A card answered correctly is promoted to the next box. A card answered incorrectly is sent back to the first box.4 Higher-numbered boxes are reviewed less frequently than lower ones, so well-known cards resurface rarely and struggling cards resurface often.4
This is fixed, box-level scheduling: the interval is a property of the box, not of the individual card.4 The Leitner box is the manual, analog ancestor of digital SRS. It already implements promotion on success, demotion on failure, and expanding intervals, all by hand.4
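The box mechanics described above can be sketched in code. This is a minimal model of the promote-and-demote rule, not a reproduction of Leitner's own instructions. The class and function names are hypothetical, and the per-box review frequencies in `BOX_INTERVAL_DAYS` are an assumption, since box systems differ on how often each box is opened.

```python
from collections import deque

NUM_BOXES = 5  # "commonly five" boxes
# Hypothetical review frequencies per box, in days. Real decks vary.
BOX_INTERVAL_DAYS = (1, 2, 4, 8, 16)


class LeitnerDeck:
    """Minimal Leitner system: the interval belongs to the box, not the card."""

    def __init__(self, cards):
        # Boxes are numbered from 1; every new card starts in box 1.
        self.boxes = {n: deque() for n in range(1, NUM_BOXES + 1)}
        self.boxes[1].extend(cards)

    def review(self, box_number: int, recalled: bool) -> int:
        """Review the front card of a box and return the box it moves to."""
        card = self.boxes[box_number].popleft()
        if recalled:
            # Promote one box; cards in the last box stay there.
            target = min(box_number + 1, NUM_BOXES)
        else:
            # Any failure sends the card back to the first box.
            target = 1
        self.boxes[target].append(card)
        return target

    def box_of(self, card) -> int:
        """Return the number of the box currently holding `card`."""
        for number, box in self.boxes.items():
            if card in box:
                return number
        raise KeyError(card)


def boxes_due(day: int) -> list[int]:
    """Boxes to open on a given day number under the assumed frequencies."""
    return [n for n, every in enumerate(BOX_INTERVAL_DAYS, start=1)
            if day % every == 0]


if __name__ == "__main__":
    deck = LeitnerDeck(["水", "火", "木"])
    deck.review(1, recalled=True)   # 水 is promoted to box 2
    deck.review(1, recalled=False)  # 火 goes to the back of box 1
    deck.review(2, recalled=True)   # 水 is promoted to box 3
    print(deck.box_of("水"), deck.box_of("火"), deck.box_of("木"))  # prints 3 1 1
    print(boxes_due(4))  # prints [1, 2, 3]
```

Nothing in the card itself records an interval. A card's next review is determined entirely by which box it sits in, which is the box-level scheduling the text describes.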
The 1972 date refers to the German first edition of Leitner's book. Some English summaries date the "Leitner system" to 1973 or cite later editions, but the underlying method is the same.4
What software added: per-card scheduling
Algorithmic spaced-repetition software replaced Leitner's box-level intervals with per-card intervals computed from each card's own review history. Instead of saying, "all cards in box 4 are due in N days," the system tracks each card individually and predicts when that specific card will approach the edge of forgetting.5
Piotr Woźniak's SuperMemo, first released in the late 1980s, introduced computed per-item intervals. Woźniak and Gorzelańczyk (1994) published the optimization work behind the family of SM (SuperMemo) algorithms. Their work formalized how an item's interval and a difficulty estimate update after each graded recall.5
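To show what "per-card" means in practice, here is a minimal sketch in which each card carries its own interval and its own difficulty estimate. It is an illustrative rule only, not the published SuperMemo algorithm. The 0 to 3 grading scale, the `ease` multiplier, and every constant are assumptions that do not come from the source.

```python
from dataclasses import dataclass

MIN_EASE = 1.3  # assumed floor so intervals still grow after a success


@dataclass(frozen=True)
class CardState:
    """Scheduling state owned by a single card."""
    interval_days: float = 1.0
    ease: float = 2.5  # assumed multiplier standing in for a difficulty estimate


def update(card: CardState, grade: int) -> CardState:
    """Return the card's new state after a recall graded 0 (forgot) to 3 (easy)."""
    if not 0 <= grade <= 3:
        raise ValueError("grade must be between 0 and 3")
    if grade == 0:
        # Lapse: restart at a short interval and treat the card as harder.
        return CardState(interval_days=1.0, ease=max(MIN_EASE, card.ease - 0.2))
    # Success: adjust ease by how the recall went, then stretch the interval.
    ease = max(MIN_EASE, card.ease + (grade - 2) * 0.15)
    return CardState(interval_days=card.interval_days * ease, ease=ease)


if __name__ == "__main__":
    smooth = rough = CardState()
    for grade in (2, 2, 3):
        smooth = update(smooth, grade)
    for grade in (2, 0, 1):
        rough = update(rough, grade)
    # Two cards that started identically now have different schedules.
    print(smooth)
    print(rough)
```

The two cards begin with the same state and end with different intervals and ease values, because each update reads only that card's own history. A box system cannot express this: there, both cards would share whatever interval their box has.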
The progression from fixed boxes to per-card algorithms is a continuous lineage: the same expanding-interval idea given finer resolution. The specific algorithm a given app uses, and how those algorithms differ from one another, is an engineering topic in its own right and is treated as a separate concept rather than expanded here.5
Spaced Repetition for Japanese Specifically
Why the volume makes SRS close to non-optional
The scale of the problem sets Japanese apart. The Japanese government's jōyō kanji list, the standard list of characters for general use, was set by Cabinet notification in 2010. It contains 2,136 characters as the baseline for general literacy, and functional reading vocabulary runs to many thousands of words on top of that.10
That volume interacts badly with the forgetting curve, because each of those thousands of items decays on its own curve.2 For an early learner, most individual kanji and low-frequency words appear rarely in daily input. The natural next exposure often arrives after the item has already decayed past recall.2 Unscheduled, encounter-driven review cannot reliably catch low-frequency items before they are lost.21
Scheduling is the fix. A spaced-repetition schedule guarantees that each item resurfaces near its own edge of forgetting, whether or not it appeared in that day's reading. That is exactly the gap that natural exposure leaves open at the beginner stage.16
Where SRS stops and immersion starts
SRS does one job well: it seeds and maintains recognition of discrete items, such as a kanji reading or a word's meaning, efficiently and durably. The spacing and testing research directly supports this recognition-and-retention role.19
It does not do the rest. Retrieval of isolated items does not by itself build reading fluency, listening comprehension, or production. All of those require processing language in context.9
The honest framing is that SRS handles the memorization bottleneck. That lets immersion and output do the work only they can do. SRS is the complement to comprehensible input and output, not a substitute for them.9
Good to know
The round-number intervals are a heuristic, not a law
The "review at 1 hour / 1 day / 1 week / 1 month" schedule is a teaching illustration of expanding intervals, not a measured constant. The research supports the direction: space reviews, and widen gaps as the required retention horizon grows. It does not support any universal set of numbers. The optimal gap depends on the item, the learner, and how long the material must be retained.1
This is why no fixed table should be presented as scientific fact. Cepeda et al. found that the best inter-study interval shifts with the retention interval. A single fixed table cannot be optimal for all goals at once.1
Spacing is not the same as how many cards you add
Spacing governs when an item returns. It says nothing about how many new items you introduce per day. Adding cards faster does not make spacing any more effective. It raises daily review volume and can outrun the time available, a separate problem from whether the schedule is well spaced.1
The spacing effect is about distribution over time, not throughput. How many new cards to introduce per day is a distinct decision the learner controls. It is the lever behind most review-load trouble, rather than the spacing schedule itself.
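A rough simulation shows how the two levers separate. The sketch below holds the schedule fixed and varies only the number of new cards per day. It assumes every recall succeeds and every gap doubles from a one-day start. Both are simplifications, and the function name is hypothetical.

```python
def daily_review_load(new_per_day: int, days: int, growth: int = 2) -> list[int]:
    """Reviews due on each day when `new_per_day` cards are added daily.

    Assumes every recall succeeds and each gap is `growth` times the last,
    starting at one day. Real loads are higher because lapses add reviews.
    """
    if growth < 1:
        raise ValueError("growth must be at least 1")
    due = [0] * (days + 1)  # index is the day number; index 0 is unused
    for added_on in range(days):
        review_day, gap = added_on + 1, 1
        while review_day <= days:
            due[review_day] += new_per_day
            gap *= growth
            review_day += gap
    return due[1:]


if __name__ == "__main__":
    for rate in (10, 20):
        load = daily_review_load(rate, days=30)
        print(f"{rate} new cards/day -> {load[-1]} reviews due on day 30")
    # prints 40 reviews for 10 new cards/day and 80 for 20 new cards/day
```

Doubling the intake doubles the day-30 workload while the spacing of every individual card stays exactly the same. The schedule did not get worse; the queue got longer.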
"SRS" vs "spaced repetition" vs "spaced practice"
The spacing effect (also "distributed practice" in the experimental literature) is the research phenomenon: spaced study beats massed study for retention.1 Spaced repetition and spaced practice are names for applying that phenomenon deliberately as a study technique.1
SRS (spaced-repetition system, or the software) is the category of tools, from the Leitner box through SuperMemo and its successors, that automate the scheduling.45 The effect is the science. The SRS is the implementation. Keeping the two distinct is what stops "the algorithm" from being mistaken for "the science."145
See also
- How to Learn Kanji: A Strategic Overview of Heisig, WaniKani, and Kanji-in-Context
- WaniKani Explained: How the Radical, Kanji, and Vocabulary SRS System Actually Works
- Hours per Day vs. the Marathon: Pacing Your Japanese Study
- Second-Language Acquisition: A Primer for Japanese Learners
- Swain's Output Hypothesis: Why Producing Japanese (Not Just Absorbing It) Builds the Language