How to Read Long Kanji Strings: Chunking Three, Four, Five, and Six-Kanji Compounds
To read long kanji compounds, treat each unbroken string of three, four, five, or six characters as a stack of smaller compounds. Chunk the string before you try to pronounce it. Government, academic, and corporate Japanese leans on this stacking habit, and chunking first turns a wall of kanji into familiar pieces.123
Overview
Why long kanji strings need a procedure, not memorization
The jōyō list is a finite 2,136 characters, but the compound space they generate is open-ended. Government, legal, academic, and corporate prose stack 漢語 (kango, Sino-Japanese vocabulary) into strings of three, four, five, six, and more kanji that no learner can pre-memorize wholesale.123
Japanese middle-school 国語 (Japanese language arts) teaches this material as a closed set of construction patterns because the compounds change but the patterns do not.123 The same three patterns that organise three-kanji compounds reappear in longer strings.
The compounds this article covers
In scope: three-kanji compounds (三字熟語), non-idiomatic four-kanji compounds, five- and six-kanji government and academic strings, and the longer "compound-of-compounds" pattern that produces titles of seven kanji and up.123
Out of scope: idiomatic four-character yojijukugo whose meanings are non-compositional (一期一会, 弱肉強食), and proper-name strings whose readings are 名乗り readings. Those are handled by their own articles in the cluster.
The two questions every long string forces
Every long string forces two questions in sequence: (1) where does the string chunk, and (2) how does each chunk read? Japanese middle-school texts present these as separate stages: 構成 (structure) before 読み方 (reading). The same order works for a non-native reader.123
Three-kanji compounds (三字熟語): the three construction patterns
Three-kanji compounds organise into three main patterns: prefix + two-kanji compound, two-kanji compound + suffix, and three independent coordinated kanji.14 The diagnostic at the end of this section helps you identify the pattern for a given string.
Pattern A: prefix + two-kanji compound (一字+二字)
The first kanji is either a negation prefix or a modifier prefix. The remaining two characters are an independent jukugo.1 The reading default is straightforward: the prefix carries its on'yomi, and the two-kanji base reads as a normal on+on jukugo.14
Negation prefixes form a closed shortlist: 不, 無, 非, 未, 反. The National Institute for Japanese Language and Linguistics study by Nomura shows these prefixes attach overwhelmingly to two-character kango bases, with 非 the only one that regularly attaches to longer (three-character-plus) bases.5
Modifier prefixes form an open but still finite shortlist. The most learner-relevant items are 新, 旧, 高, 低, 大, 小, 全, 各, 再, 諸, 超, 総, 多, 微, 猛.4
不安定です。1
"It is unstable."
新記録を更新した。1
"A new record was set."
不景気が続いている。1
"The recession is ongoing."
高性能のカメラを買った。1
"I bought a high-performance camera."
Negation-prefix three-kanji compounds (不安定, 不景気, 無責任, 未解決, 非常識) work across registers and appear in news, academic prose, and conversation alike.5 Modifier-prefix three-kanji compounds (新記録, 高性能, 大事件) skew toward news and journalism.4
Pattern B: two-kanji compound + suffix (二字+一字)
The two-kanji base modifies a final character that is either a content-bearing noun (室, 国, 場, 館, 学, 業, 表, 員, 家, 力, 度, 省, 庁) or a bound productive suffix (化, 性, 的, 然, 派, 味, 感, 格).14 The Handbook of Japanese Lexicon and Word Formation documents 性, 化, and 的 as the three most productive Sino-Japanese suffixes in the modern lexicon.6
The reading default is simple: the right-edge character is on'yomi because it is a bound kango morpheme, and the base is on+on.1
音楽室で練習する。1
"I practice in the music room."
図書館で勉強した。1
"I studied at the library."
近代化が進む。1
"Modernization advances."
具体的に説明します。1
"I will explain concretely."
This pattern also includes 大学生 (daigaku-sei) and 商業化 (shōgyō-ka): 大学+生 and 商業+化 are both a two-kanji base plus a productive suffix, so they fit cleanly inside Pattern B.76 A common parsing mistake is to read 大学生 as 大+学生 (dai-gakusei). The pronunciation comes out the same by accident, but the structure is 大学+生. The same error on 不可解 (不 + 可解, a three-kanji 1+2 compound that the four-kanji section revisits) produces a non-word.
大学生になりました。7
"I have become a university student."
The 化 / 性 / 的 family is formal across contexts: it dominates academic, government, and news prose, and is productive enough that journalists coin one-off three-kanji compounds with it.6 The 室 / 館 / 場 / 国 / 学 / 業 / 員 / 家 family is the older Sino-Japanese institutional layer. It is also broadly formal, but less productive in new coinages.14
Pattern C: three independent kanji (一字+一字+一字)
Here, three semantically coordinated heads sit side by side. There is no nesting and no prefix or suffix structure.1 The canonical examples form a near-closed set: 衣食住, 松竹梅, 真善美, 雪月花, 心技体, 天地人.1 All of them inherit a Chinese classical tradition of triadic enumeration: lists of three cosmic, ethical, or aesthetic categories.
All-on-reading is the default, and the register is set-phrase or aphoristic.14
衣食住の心配がない。1
"We have no worries about food, clothing, or shelter."
心技体のバランスが大切だ。1
"Balance of mind, technique, and body is essential."
雪月花を愛でる。1
"To appreciate snow, moon, and blossoms (the classical triad of natural beauty)."
Diagnostic: which of the three is this?
A three-step test sorts most three-kanji strings at first glance.
| Step | Test | Outcome |
|---|---|---|
| 1 | First character is in {不, 無, 非, 未, 反, 新, 旧, 高, 低, 大, 小, 全, 各, 再, 諸, 超, 総, 多, 微, 猛} | Pattern A (1+2) |
| 2 | Last character is in {化, 性, 的, 然, 派, 味, 感, 格, 室, 国, 場, 館, 学, 者, 業, 員, 表, 力, 度, 省, 庁, 家} | Pattern B (2+1) |
| 3 | Neither edge is a recognised affix | Pattern C (1+1+1) coordinated triad, or look up |
If a string passes step 3 but the three kanji do not form a recognised set phrase, the compound may be a frozen jukujikun with a non-compositional reading. In that case, use the look-up path.145
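The three-step test can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: the function name `classify_three_kanji` and the returned labels are hypothetical, and the affix sets are copied from the table above.

```python
# Affix shortlists copied from the diagnostic table (steps 1 and 2).
PREFIXES = set("不無非未反新旧高低大小全各再諸超総多微猛")
SUFFIXES = set("化性的然派味感格室国場館学者業員表力度省庁家")


def classify_three_kanji(word: str) -> str:
    """Apply the three steps in order and return a pattern label."""
    if len(word) != 3:
        raise ValueError("expected exactly three kanji")
    if word[0] in PREFIXES:       # step 1: left-edge affix
        return "A (1+2)"
    if word[-1] in SUFFIXES:      # step 2: right-edge affix
        return "B (2+1)"
    return "C (1+1+1) or look up"  # step 3: no recognised affix


for w in ["不安定", "音楽室", "衣食住", "大学生"]:
    print(w, classify_three_kanji(w))
```

Note that 大学生 comes back as Pattern A, because 大 is on the step 1 shortlist and 生 is not on the step 2 shortlist. That is the 大+学生 mis-parse described under Pattern B, so an edge test of this kind still needs a dictionary check on the remaining two-kanji base.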
Four-kanji compounds that are not yojijukugo: the construction taxonomy
Four-kanji strings split mainly 2+2, with smaller 1+3, 3+1, and 1+1+1+1 sub-classes. The middle-school 国語 taxonomy lists exactly these subtypes and notes that 一字+三字 and 三字+一字 also occur.23
The default split: 2+2 (二字+二字)
This is the dominant pattern for non-idiomatic four-kanji strings.23 Inside the 2+2 frame, the two halves stand in one of four jukugo construction relations: synonym pairing, antonym pairing, modifier-head, or subject-predicate. The middle-school list gives the first three explicitly; the fourth carries over from the two-kanji jukugo level.23
国際関係を学ぶ。8
"I study international relations."
経済成長が続く。2
"Economic growth continues."
Less common: 1+3 and 3+1
The middle-school text explicitly notes that 一字+三字 and 三字+一字 four-kanji compounds also exist.2 The same prefix and suffix shortlists from the three-kanji section apply, one chunk longer.
The 1+3 form is a negation or modifier prefix attached to a three-kanji base. 不可抗力 (fu-kakōryoku, 不 + 可抗力) is an instance of this shape. The three-kanji 不可解 (fu-kakai, 不 + 可解) shows the same negation prefix one kanji shorter, as 1+2: 可解 is the base, and 不 is the negation prefix attached to it.5
The 3+1 form is a three-kanji base plus a productive suffix, as in 不動産業 (fudōsan-gyō, 不動産 + 業). 都道府県別 (to-dō-fu-ken-betsu, 都道府県 + 別) shows the same suffixing one kanji longer, as 4+1: 別 is the productive sorting suffix attached to the four-prefecture-types set phrase.3
不可解な事件だ。5
"It is an incomprehensible case."
All-equal: 1+1+1+1
Here, four semantically coordinated heads sit side by side, with no nesting and no affix. Examples: 都道府県 (to-dō-fu-ken), 春夏秋冬 (shun-ka-shū-tō), 東西南北 (tō-zai-nan-boku), 喜怒哀楽 (ki-do-ai-raku), 花鳥風月 (ka-chō-fū-getsu).2 The reading is all on'yomi, with classical-Chinese parallelism. 起承転結 (ki-shō-ten-ketsu) is structurally four equal heads but is conventionally listed as a yojijukugo by frequency.2
春夏秋冬の景色が美しい。2
"The scenery of all four seasons is beautiful."
The yojijukugo boundary
Structurally, a four-kanji string belongs to one of the patterns above. It becomes a yojijukugo when its meaning is idiomatic and lexicalised (一期一会, 弱肉強食 in the antonym-pair list).2 Transparent compounds such as 株式会社 (2+2)9 and 国際関係 (2+2)8 are not yojijukugo even though they are four kanji.
Five and six-kanji strings: stacked compounds and their chunking
The compound-of-compounds principle
Past four kanji, almost every kango string is itself a compound of smaller compounds.3 The middle-school text notes that compounds of four kanji and up reuse the same combination patterns as two- and three-kanji compounds, plus 二字熟語+二字熟語 and 二字熟語+三字熟語.3
| String length | Default chunk patterns |
|---|---|
| Five kanji | 2+3, 3+2, 1+4, 4+1 |
| Six kanji | 2+2+2, 3+3, 2+4, 4+2 |
| Seven and up | recursive stacking of the above |
The chunk borders almost always sit on the boundary between an established sub-compound and a productive prefix or suffix.123
Worked example: 国際関係論 (kokusai-kankei-ron)
Chunking is 2+2+1: 国際 (international) + 関係 (relation) + 論 (theory, an academic-discipline suffix). All three units read on'yomi end to end, and no rendaku fires. The Britannica International Encyclopedia entry defines 国際関係論 as the academic field studying relations between states and between states and international organizations.8
Here is the wrong-chunk test: 国 + 際関係論 fails because 際関係論 is not a word. 国際関 + 係論 fails because neither half is a word. Only 2+2+1 gives known sub-compounds at every cut.
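The wrong-chunk test can be sketched as an exhaustive search: list every way of cutting the string, then keep only the cuts where every piece is a known unit. The `LEXICON` below is a toy stand-in for a real dictionary, and the function names are hypothetical.

```python
from itertools import combinations

# Toy lexicon (an assumption for illustration, not a real dictionary).
LEXICON = {"国際", "関係", "論", "内閣", "総理", "大臣"}


def all_splits(s: str):
    """Yield every way of cutting s into contiguous, non-empty chunks."""
    n = len(s)
    for k in range(n):  # k = number of cut points
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [s[a:b] for a, b in zip(bounds, bounds[1:])]


def valid_splits(s: str, lexicon=LEXICON):
    """Keep only the splits in which every chunk is a known unit."""
    return [parts for parts in all_splits(s)
            if all(chunk in lexicon for chunk in parts)]


print(valid_splits("国際関係論"))
```

With this lexicon the only surviving split is 国際 + 関係 + 論. A five-kanji string has sixteen possible splits, so brute force is cheap at the lengths discussed here.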
大学で国際関係論を専攻している。8
"I am majoring in international relations at university."
Worked example: 一般教養 (ippan-kyōyō)
Chunking is 2+2 (modifier-head): 一般 (general) + 教養 (cultivation, liberal-arts knowledge).10 The sound change is internal to the first half: 一般 surfaces as ippan, not ichi-han. In on+on kango, the etymological -tsu / -chi coda of the first kanji often becomes a small-tsu that produces gemination with the following consonant. The change is internal to 一般 and unaffected by the larger 2+2 seam.
一般教養の授業を受ける。10
"I take general-education classes."
Worked example: 株式会社 (kabushiki-gaisha)
Chunking is 2+2 (modifier-head): 株式 (stock, shares) + 会社 (company).9 The seam is the standard example of rendaku inside a longer kango stack: 会社 kaisha voices to -gaisha after the compound boundary. Vance puts the rendaku rate across on+on Sino-Japanese binoms at roughly ten percent, with 株式会社 as one established case.1112 Lyman's Law does not block here because 会社 contains no voiced obstruent.1112
Both かぶしきがいしゃ and かぶしきかいしゃ are recorded as readings of 株式会社 in major Japanese dictionaries.913 The rendaku form (-gaisha) is by far the more frequent in everyday speech and broadcasting. The unrendaku form (-kaisha) survives in legal, registration, and explicit-reading contexts. The rendaku-dominant claim rests on the dictionary lemma order and common usage in 9 and 13, not directly on a 放送用語委員会 document; no primary-source statement of NHK's broadcasting-style decision is cited here.
The 1873 第一国立銀行 is recognized as the prototype Japanese 株式会社, and the 1899 商法 codified the corporate form.13 In English-language filings, the form is rendered K.K.
株式会社を設立する。9
"They establish a joint-stock company."
The 株 / 式 seam also illustrates the wago-inside-kango exception discussed in the on'yomi-continuation section below: 株 reads its kun'yomi kabu (not the on'yomi shu), so 株式 itself is yutō-flavoured before it enters the larger 2+2 structure.913
Worked example: 内閣総理大臣 (naikaku-sōri-daijin)
Chunking is 2+2+2: 内閣 (cabinet) + 総理 (general management, "premier") + 大臣 (minister).1415 All six kanji read on'yomi end to end, and no rendaku fires at any seam. This is the canonical case of maximum on'yomi continuation in a six-kanji government title.
The larger structure is [内閣 の 総理大臣], the cabinet's chief minister. Inside the second half, 総理大臣 itself parses 2+2 (総理 modifier + 大臣 head). The Prime Minister's Office gives the reading directly with full furigana 「内閣総理大臣 ないかくそうりだいじん」 in its official explainer material.14
内閣総理大臣が記者会見を開いた。14
"The Prime Minister held a press conference."
Worked example: 文部科学省 (Monbu-kagaku-shō)
Chunking is 2+2+1: 文部 (education-and-letters, the historical short form of 文部省) + 科学 (science) + 省 (ministry suffix).1617 The 2001 中央省庁再編 (central government ministry reorganization) merged the former 文部省 (Ministry of Education) with the 科学技術庁 (Science and Technology Agency). The new name encodes that merger: 文部 from the older ministry and 科学 from the absorbed agency.1617
Other 2+2+1 ministry names produced by the same restructuring include 厚生労働省 (Kōsei-rōdō-shō: 厚生 + 労働 + 省) and 国土交通省 (Kokudo-kōtsū-shō: 国土 + 交通 + 省).16
文部科学省が新しい指針を発表した。16
"The Ministry of Education, Culture, Sports, Science and Technology released a new guideline."
Worked example: 日本国憲法 (Nihonkoku-kenpō)
Chunking is 3+2, with internal 2+1+2 sub-chunking: 日本 (Japan) + 国 (country, state) + 憲法 (constitution).18 国 attaches to 日本 to form 日本国 (the State of Japan), which then takes 憲法 as the head. The colloquial form is just 憲法. The formal proper-name form is 日本国憲法.
The 憲法 sub-compound itself surfaces as kenpō because the on'yomi of 法 (hō) assimilates to -pō after the -n coda of 憲. This handakuon (半濁音, p-sound) assimilation is internal to the 憲法 sub-compound and does not involve the larger seam.11 The constitution was promulgated on 3 November 1946 and came into force on 3 May 1947.1819
日本国憲法は1946年に公布された。18
"The Constitution of Japan was promulgated in 1946."
Seven and beyond
The full Japanese name of UNESCO, 国際連合教育科学文化機関 (kokusai-rengō-kyōiku-kagaku-bunka-kikan), is twelve kanji.2021 It chunks as 国際連合 + 教育 + 科学 + 文化 + 機関, or 4 + 2 + 2 + 2 + 2. If you sub-split 国際連合 itself, the pattern is 2+2 / 2 / 2 / 2 / 2. The larger structure is [国際連合 [教育・科学・文化] 機関], the UN [education-science-culture] agency. All twelve kanji read on'yomi.2021
国際連合教育科学文化機関の活動を支援する。20
"We support the activities of UNESCO."
The same chunk-of-chunks logic scales without change. Ministry, treaty, and UN-agency names commonly reach eight to twelve kanji, and the reading procedure is identical to the six-kanji case.
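One way to picture the chunk-of-chunks structure is as a nested list, where each level of nesting is one level of compounding. The sketch below encodes the bracketing given above for the UNESCO name; the variable and function names are hypothetical.

```python
# Nested structure: [国際連合 [教育・科学・文化] 機関],
# with 国際連合 itself sub-split as 国際 + 連合.
UNESCO = [["国際", "連合"], ["教育", "科学", "文化"], "機関"]


def leaves(tree):
    """Flatten a nested chunk tree into its smallest sub-compounds."""
    if isinstance(tree, str):
        return [tree]
    flat = []
    for branch in tree:
        flat.extend(leaves(branch))
    return flat


def surface(tree) -> str:
    """Rebuild the unbroken kanji string from the tree."""
    return "".join(leaves(tree))


print(surface(UNESCO))
print("-".join(leaves(UNESCO)))             # hyphen at every seam
print([len(surface(b)) for b in UNESCO])    # top-level chunk sizes
```

Flattening the tree gives the reading order, while the nesting records which sub-compounds group together first. The same representation works for shorter titles such as 内閣総理大臣.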
The on'yomi-continuation rule and the kango register
Why long strings read on'yomi end to end
Long kanji compounds are almost all kango, and kango compounds default to on'yomi at every constituent kanji.13 Once you identify a compound as kango, read every kanji inside it with on'yomi unless a specific sub-compound is wago.
The two-kanji-jukugo literature already states this default: kango combinations are basically on+on (or kun+kun for wago). Jūbako (on+kun) and yutō (kun+on) are the marked minority.3 In long strings, the on+on+on+... pattern dominates by an even wider margin because productive prefixes, suffixes, and ministry, academic, and legal vocabulary are all kango.6
The narrow exceptions: where kun'yomi sneaks back in
Sub-compounds that are themselves wago keep their kun'yomi inside a longer kango wrapper. 株式 kabushiki is the standard example: 株 reads its kun'yomi kabu (not the on'yomi shu), and 式 reads its on'yomi shiki. That makes 株式 itself yutō-ish, and the unit then enters the larger 2+2 structure of 株式会社.913
Place-name elements similarly mix readings. 大阪市 (Ōsaka-shi) is a kun+kun place name plus the on'yomi suffix 市. Inside a longer government string such as 大阪市役所 (Ōsaka-shi-yakusho), the wago place name keeps its kun'yomi while the wrapping suffixes stay on'yomi.
Jūbako and yutō sub-compounds inside longer strings are tagged for the reader by the dictionary entry. You cannot predict them from the chunk shape alone, so the look-up branch handles them.
Rendaku at the seam
Where rendaku occurs in a long string, it fires at the boundary between sub-compounds (株式 + 会社 -> kabushiki-gaisha),911 and it is usually absent from a pure on+on kango stack (内閣 + 総理 stays naikaku-sōri, no voicing).1411 Vance's figure is that roughly ten percent of Sino-Japanese binoms undergo rendaku; the rate is much higher for wago.1112 Lyman's Law (no rendaku when the second element already contains a voiced obstruent) applies throughout.1112
The practical implication for long strings is simple: assume no rendaku at each seam by default, and treat 株式会社-style voiced seams as memorised exceptions tagged in the dictionary.
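The Lyman's Law check is mechanical enough to sketch on hiragana readings. The code below only tests whether voicing is permitted and shows what the voiced form would be; it does not predict whether rendaku actually occurs, which, as noted above, has to be looked up. The function names are hypothetical, and かぜ is an outside example added to show a blocked case.

```python
# Kana that begin with a voiced obstruent (g, z, d, b rows).
VOICED = set("がぎぐげござじずぜぞだぢづでどばびぶべぼ")

# Voiceless kana mapped to their voiced counterparts (k>g, s>z, t>d, h>b).
VOICING = dict(zip("かきくけこさしすせそたちつてとはひふへほ",
                   "がぎぐげござじずぜぞだぢづでどばびぶべぼ"))


def lyman_blocks(second_reading: str) -> bool:
    """True if the second element already contains a voiced obstruent."""
    return any(kana in VOICED for kana in second_reading)


def voice_initial(second_reading: str) -> str:
    """Return the reading with its first kana voiced, if it can be."""
    head = second_reading[0]
    return VOICING.get(head, head) + second_reading[1:]


print(lyman_blocks("かいしゃ"), voice_initial("かいしゃ"))
print(lyman_blocks("かぜ"))
```

For 会社 (かいしゃ) the check finds no voiced obstruent, so the -gaisha form is permitted. For かぜ the ぜ already counts as a voiced obstruent, so voicing the initial would be blocked.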
Register cost: stacking kango is formal, sometimes pompous
A five- or six-kanji government name or title (文部科学省, 内閣総理大臣) is appropriate in legal, administrative, and news prose, but tonally wrong in casual conversation.1416 The trade-off is the same wago / kango register split documented in the vocabulary-strata literature, applied at the compound-length scale: each additional kanji unit in the stack makes the register a little more formal.226
The deliberately stacked Meiji 和製漢語 (Japan-made Sino-Japanese vocabulary) lexicon (社会, 自由, 経済, 哲学, 民主主義, 共産主義) was designed for academic and legal precision. Its register inheritance shows in the long ministry titles that reuse those morphemes.2322
A reading procedure for a long string you have never seen
The five-step procedure below turns the rules from the preceding sections into a workflow you can run on a fresh string.
Step 1: count kanji and scan for an affix on either edge
Negation prefixes: 不, 無, 非, 未, 反.5 Modifier prefixes: 新, 旧, 高, 低, 大, 小, 全, 各, 再, 諸, 超, 総, 多, 微, 猛.4 Productive suffixes: 化, 性, 的, 然, 派, 味, 感, 格, 室, 国, 場, 館, 学, 者, 業, 員, 表, 力, 度, 省, 庁, 家.146 Strip one affix from the edge and re-evaluate what remains as a shorter string.
Step 2: split the middle on the strongest sub-compound boundary
For four or more kanji, look for a known two-kanji compound inside the string. Once you spot a unit such as 関係, 経済, 国際, 内閣, 教育, 文化, 機関, 委員, 大臣, 株式, 会社, or 憲法, the chunk borders are forced.123
Step 3: assume on'yomi at every kanji unless a wago sub-compound is hiding
The on'yomi-continuation default fails perhaps five to ten percent of the time at this length. It is almost always because a wago sub-compound is embedded (株式, place names, jūbako or yutō residues).39
Step 4: apply rendaku at the seams, not inside the sub-compounds
Only the boundary between two sub-compounds is a rendaku site. Inside a sub-compound, rendaku does not fire across the kanji-internal boundary. Vance's ten-percent figure for on+on binoms means the default expectation is "no rendaku," with a small set of memorised exceptions.1112
Step 5: if the result is not a word, abandon and look it up
If the best-effort parse does not match any attested word, the string is one of these: a jukujikun (frozen non-compositional reading), a proper name (place name, person name, organisation name with 名乗り readings), a 和製漢語 specialist term whose component readings are predictable but whose meaning is not,2322 or a typo. Hand the string to the look-up toolkit (radical search, OCR, Yomitan).
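The chunking half of the procedure (steps 1, 2, and 5) can be sketched as follows. Readings and rendaku (steps 3 and 4) are not modelled. `KNOWN` is a toy lexicon assumed for illustration, the affix sets come from step 1, and the function names are hypothetical.

```python
PREFIXES = set("不無非未反新旧高低大小全各再諸超総多微猛")
SUFFIXES = set("化性的然派味感格室国場館学者業員表力度省庁家")
# Toy lexicon of established sub-compounds (assumption).
KNOWN = {"国際", "関係", "経済", "内閣", "総理", "大臣", "教育",
         "文化", "機関", "株式", "会社", "憲法", "文部", "科学", "安定"}


def split_known(s, known=KNOWN):
    """Step 2: cut s entirely into known compounds, longest first."""
    if not s:
        return []
    for size in range(len(s), 1, -1):
        if s[:size] in known:
            rest = split_known(s[size:], known)
            if rest is not None:
                return [s[:size]] + rest
    return None


def chunk(s, known=KNOWN):
    """Steps 1, 2, 5: strip one edge affix, split the rest, else give up."""
    if not s:
        return None
    if s[0] in PREFIXES:                      # step 1, left edge
        rest = split_known(s[1:], known)
        if rest:
            return [s[0]] + rest
    if s[-1] in SUFFIXES:                     # step 1, right edge
        rest = split_known(s[:-1], known)
        if rest:
            return rest + [s[-1]]
    return split_known(s, known)              # None means: look it up


for word in ["文部科学省", "不安定", "内閣総理大臣", "薔薇"]:
    print(word, chunk(word))
```

Every candidate parse is checked against the lexicon before it is accepted, so a stripped affix that leaves an unknown remainder falls through to the next option. A `None` result corresponds to step 5: stop guessing and hand the string to a dictionary.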
Good to know
Pitfall: forcing a 2+2 split on a 1+3 four-kanji compound
不可抗力 reads fu-kakōryoku and is structured as 不 + 可抗力 (1+3, a negation prefix on a three-kanji base). Forcing a 2+2 split on it puts the seam in the wrong place. The three-kanji 不可解 (fu-kakai, 不 + 可解) has the same prefix-first shape at 1+2.5 The same edge-affix-first instinct that pulls 大学生 apart as 大学 + 生 (the right answer at three kanji) pulls 不可抗力 apart as 不 + 可抗力 (the right answer at four). Strip productive affixes from the edges before you try the middle split.
不可解な事件だ。5
"It is an incomprehensible case."
Pitfall: treating ministry and corporate strings as one giant word
内閣総理大臣 has three semantic units (内閣 + 総理 + 大臣).1415 Pronouncing each unit on its own breath beat, naikaku - sōri - daijin, makes the structure audible and lets the formality land without effort. Reading it as one eleven-mora blob makes the same string sound rushed and indistinct.
Pitfall: reading a sub-compound's kun'yomi where its kango wrapper demands on'yomi (or vice versa)
大人気 versus 大人気ない is the textbook minimal pair: same three kanji, different chunking. As 大 + 人気 (1+2), the reading is dai-ninki and the meaning is "great popularity." As 大人 + 気 (2+1), the reading is otonage, and the word appears only in the negated form 大人気ない or 大人気無い. The 精選版 日本国語大辞典 entry tags the おとなげ reading as 「下に打消の表現を伴って用いる」, that is, used only with a following negative. Without 無い or 無かった, the same three kanji read だいにんき.24
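The minimal pair can be turned into a small context check: look at what follows 大人気 and pick the reading accordingly. This is a rough heuristic sketch, assuming that a directly following ない / 無い / なかった / 無かった is the only cue; the function name is hypothetical.

```python
import re

# 大人気 directly followed by a negative ending (lookahead only).
NEGATED = re.compile(r"大人気(?=ない|無い|なかっ|無かっ)")


def read_otona_ge(text: str):
    """Return the likely reading of 大人気 in text, or None if absent."""
    if "大人気" not in text:
        return None
    return "おとなげ" if NEGATED.search(text) else "だいにんき"


print(read_otona_ge("大人気のアニメだ。"))
print(read_otona_ge("大人気ない態度だ。"))
```

The check mirrors the dictionary note quoted above: the おとなげ reading is selected only when a negative follows, and everything else defaults to だいにんき.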
大人気のアニメだ。24
"It is a hugely popular anime."
Register: when a long kanji string is the right register and when it is not
Legal, administrative, academic, and corporate prose default to kango stacks (内閣総理大臣, 文部科学省, 国際連合教育科学文化機関, 日本国憲法).14161820 Conversation, fiction dialogue, and lifestyle writing usually strip them back to two-kanji jukugo plus wago. Mismatching the register reads as pompous in one direction and underdressed in the other.22
Mnemonic: read each sub-compound as if it were a separate word
Mentally insert a hyphen at every seam. 内閣-総理-大臣 reads more naturally than 内閣総理大臣 read as a single blob.
Japanese typesetting does not draw the hyphens. The reader does. The intuition is the same one readers of German apply when they parse Donaudampfschifffahrtsgesellschaftskapitän as Donau-Dampf-Schifffahrts-Gesellschafts-Kapitän.
Etymology: the 和製漢語 wave is why government and academic strings stack so deep
Meiji-era translators, most notably 西周 (1829–1897), 福澤諭吉 (1835–1901), and the 明六社 circle, coined thousands of kango by stacking kanji on classical-Chinese templates. 西周 alone is credited with 哲学, 真理, 芸術, 理性, 科学, 知識, 定義, 概念, 命題, 心理, 物理. 福澤諭吉 is credited with 自由, 経済, 演説, 討論, 競争, 共和, 抑圧, 健康, 鉄道.
The longer the modern string, the more likely its component morphemes are 19th-century or early-20th-century coinages.2322
Many of these morphemes (科学, 文部, 国際, 経済, 教育, 文化, 機関) are the building blocks of the long ministry and UN-agency names parsed earlier in this article. The vocabulary layer and the parsing problem share an origin.
The classical-Chinese tradition behind 1+1+1+1 lists
衣食住, 心技体, 都道府県, 春夏秋冬, 東西南北, 喜怒哀楽, 花鳥風月 all inherit the Chinese classical tradition of triple or quadruple coordinated heads.12 Recognising the list shape is itself a chunking signal: three or four equal kanji, no affixes, often classical aesthetic or ethical categories. If the shape is right, the all-on-reading parallel-coordination default applies.
A note on the 株式会社 reading
The worked example above flags that both かぶしきがいしゃ and かぶしきかいしゃ are dictionary-attested.913 A reader who needs the broadcasting-style ruling itself, rather than dictionary lemma order, should consult the NHK 放送用語集 in print; that primary source is not cited in this article.
See also
- Go-on, Kan-on, Tō-on: The Historical Layers Behind a Kanji's Multiple On'yomi
- Rendaku: When K Becomes G in Compound Words
- Japanese Compound-Word Pitch Accent: How Two Words Combine into One Accent Pattern
- The History of Kanji: From Oracle Bones to the Jōyō List
- Ateji (当て字): Kanji Chosen for Sound, Not Meaning
- Formal Written Japanese (である調): The Register