Jukugo (熟語): How Kanji Combine to Form Japanese Words

What are jukugo? In Japanese, a jukugo (熟語) is a kanji compound word built from two or more kanji. Together, those kanji behave as a single dictionary entry, and they carry the bulk of written Japanese vocabulary.12 One rule covers most jukugo from day one: when two kanji sit next to each other with no okurigana between them, both characters usually take their on'yomi reading.3

Overview

Jukugo are the fused, all-kanji words that fill every newspaper headline, train sign, and textbook page. The label 熟 (juku) carries the sense "ripe, matured," so the term itself frames the category: a word that has "set" as a unit, not a free combination of two single-kanji entries.1

If you can name three things about a new jukugo, its reading pattern, construction pattern, and register, you can read and use it with more confidence even on first encounter. This article defines the term, anchors the "most of written Japanese" claim in corpus figures, lays out the on'yomi-default rule, and walks through the four construction patterns taught in Japanese junior-high 国語 classes.

What is a jukugo?

The strict definition: two or more kanji, one lexical word

In Japanese 国語 grammar, 熟語 means a word written entirely in kanji and made of two or more kanji. Together, those kanji function as one lexical unit, listed as one headword in standard dictionaries.1 Two-kanji compounds (二字熟語) are by far the most common type. In the 5th edition of 広辞苑 (Iwanami's Kōjien, 1998), one database extraction identifies 78,426 two-kanji compound headwords whose characters both fall inside the JIS Level 1 kanji set. After adjusting for repetitions, that becomes 68,992 distinct written forms.4

The definition draws two boundaries that learners often blur. A 単漢字 (single-kanji word) like 山, 川, or 人 is not a jukugo because it is one character, not a compound. A 漢字仮名交じり word, a mixed kanji-kana word like 食べる or 大きい, is not a jukugo either. The kana attached to the kanji push it out of the all-kanji category.51
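The two boundaries above can be sketched as a shape check. This is only a rough illustration: the function name `is_jukugo_shaped` and the character range are assumptions made for this example, and the check tests written form only. It cannot tell whether a string is actually listed as a dictionary headword.

```python
import re

# Assumption: the basic CJK Unified Ideographs block plus the iteration
# mark 々 is enough for everyday kanji. Rarer extension blocks are ignored.
KANJI_RUN = re.compile(r"[\u4e00-\u9fff\u3005]+")


def is_jukugo_shaped(word: str) -> bool:
    """Return True if the word is written with two or more kanji and nothing else."""
    return len(word) >= 2 and KANJI_RUN.fullmatch(word) is not None


for candidate in ["学校", "大学生", "山", "食べる", "大きい"]:
    print(candidate, is_jukugo_shaped(candidate))
```

A single kanji such as 山 fails the length test, and 食べる fails because of its kana; both match the exclusions described above.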

学校(がっこう)は九時(くじ)に始(はじ)まります。6
"School starts at nine."

日本(にほん)は島国(しまぐに)です。6
"Japan is an island country."

私(わたし)は大学生(だいがくせい)です。6
"I am a university student."

Jukugo are dictionary headwords, not phrases

A jukugo gets one dictionary entry, with one canonical reading, one part of speech, and one definition. That is why you can know both kanji and still fail to guess the word: the meaning has fused beyond the sum of the parts, just as the etymology of 熟 predicts.1

Why "熟" (ripe / well-blended): the etymology

The kanji 熟 (juku) carries the sense "ripen, mature, become fully cooked." It combines with 語 (go, "word") to yield the lexical sense "a word that has matured, that is, set into a fixed unit."1 Dictionaries treat each jukugo as a single sealed entry because that "setting" has happened: the compound's meaning is no longer the simple sum of the two characters.

A canonical illustration is 矛盾 (mujun, "contradiction"), composed of 矛 "spear" and 盾 "shield." The literal reading "spear and shield" does not give you "contradiction." The meaning comes from a Classical Chinese fable about a merchant who claimed his spear could pierce any shield and his shield could stop any spear. The compound has matured into a single lexical item whose meaning a dictionary records once and for all.1

Jukugo vs. compound noun vs. yojijukugo: the boundaries

熟語 is the broad category. 四字熟語 (yojijukugo, "four-character compound") is the four-character subset. Japanese reference works treat it as its own category because most yojijukugo carry an attached idiomatic or proverbial meaning, often borrowed from Classical Chinese.1 A four-character idiom like 一石二鳥 (one stone, two birds) is still a jukugo; it just belongs to a sub-category with extra cultural baggage.

合成語 (gōseigo, "compound word") is a broader morphological term. It covers compounds in any script, including kanji-kana compounds such as 食べ放題 as well as all-kanji compounds such as 花火大会; 熟語 narrows that scope to the all-kanji subset.71 The taxonomy nests cleanly: every jukugo is a 合成語, but not every 合成語 is a jukugo.

Why jukugo matter: their share of real Japanese

The corpus numbers: kango as roughly 50% to over 70% of written word tokens

The 国立国語研究所 (National Institute for Japanese Language and Linguistics, NINJAL) ran a vocabulary survey of 70 magazines published in 1994. 漢語 (kango, Sino-Japanese vocabulary, almost entirely jukugo) accounted for 49.9% of word tokens (延べ語数, total word occurrences) and 34.2% of word types (異なり語数, distinct words).28 The earlier NINJAL survey of 90 magazines published in 1956 put kango at 41.3% of tokens, with wago at 53.9%; by 1994 the ranking had inverted, with wago down to 35.7%.98

Newspaper text skews further. In NINJAL newspaper figures, kango exceeds 70% of word tokens. Conversely, in spoken language, wago exceeds 70% of word tokens.8 The headline "roughly half of written Japanese, and over 70% in newspapers" rests on these two endpoints.

Why the year matters on these numbers

Vocabulary share shifts decade by decade. The 41.3% (1956) versus 49.9% (1994) contrast between the two NINJAL magazine surveys is itself the proof.92 A single-figure claim about kango share that names neither a corpus nor a year cannot be checked or compared, so learners should treat it with caution.
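The size of the shift is easier to see as percentage points. The sketch below only redoes the arithmetic on the token shares quoted in this section; the dictionary layout and the function name `shift` are illustrative choices, not part of any NINJAL dataset.

```python
# Token shares (percent of word tokens) as quoted in the text above.
TOKEN_SHARE = {
    1956: {"kango": 41.3, "wago": 53.9},
    1994: {"kango": 49.9, "wago": 35.7},
}


def shift(stratum: str, start: int = 1956, end: int = 1994) -> float:
    """Percentage-point change in token share between the two surveys."""
    return round(TOKEN_SHARE[end][stratum] - TOKEN_SHARE[start][stratum], 1)


print(shift("kango"))  # 8.6
print(shift("wago"))   # -18.2
```

Kango gained 8.6 points while wago lost 18.2, so the two changes do not mirror each other; the other strata (gairaigo and konshugo) account for the rest of the movement.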

What this means for a beginner: every page is mostly jukugo

Widely used JLPT N5 study lists contain many kango jukugo: 学校, 大学, 学生, 先生, 日本, 時間, 電車, 図書館, 自転車, 来週. (The JLPT itself has not published an official vocabulary list since its 2010 revision.) These lists also include jukujikun entries like 今日 and 明日 when those are counted in.6 A learner meets jukugo on the first page of the first chapter and on roughly every page after that.

この電車(でんしゃ)は東京駅(とうきょうえき)に行(い)きます。6
"This train goes to Tokyo Station."

図書館(としょかん)で本(ほん)を読(よ)みます。6
"I read books at the library."

This is the structural reason the on'yomi-default rule is a day-one rule rather than an advanced one. Knowing how to read jukugo, in other words, is roughly half of knowing how to read.3

Jukugo and the kango / wago / gairaigo split

The modern Japanese lexicon, or vocabulary, is conventionally analyzed in four strata: 和語 (wago, native Japanese), 漢語 (kango, Sino-Japanese), 外来語 (gairaigo, post-16th-century European loanwords), and 混種語 (konshugo, mixed-stratum compounds).10 Most jukugo are kango, but a meaningful minority of all-kanji compounds are wago. Words like 山道 (yama-michi, "mountain path"), 花見 (hana-mi, "flower viewing"), and 物語 (mono-gatari, "tale") use kun'yomi for both characters. They are native words that happen to be written in kanji.107

混種語 (konshugo, mixed-stratum compounds, including 重箱読み and 湯桶読み hybrid-reading items) accounted for 2.1% of word tokens and 6.4% of word types in the 1994 NINJAL magazine survey.28 A planned sibling article unpacks the strata in full; for the purpose of this orientation, the takeaway is that "jukugo" is mostly but not entirely synonymous with "kango."

How jukugo are read: the on'yomi-default rule

The default: on-on (音音)

When two or more kanji sit next to each other with no okurigana between them, the dictionary-statistical default is on'yomi for every character (the on-on pattern). This default holds for the bulk of two-kanji compound headwords in standard reference dictionaries.7113 Mainstream kanji-learner references state this rule directly: the Kodansha Kanji Learner's Dictionary frames on'yomi as the reading used in nearly all kanji compounds.3

The rule is statistical, not absolute. Jukujikun, ateji, jūbako, yutō, and pure kun-kun compounds make up the remaining cases. Each gets its own section below.1112

学校(がっこう)で勉強(べんきょう)します。6
"I study at school."

図書館(としょかん)は静(しず)かです。6
"The library is quiet."

電車(でんしゃ)で会社(かいしゃ)に行(い)きます。6
"I go to the company by train."

In each compound above, every kanji takes its on'yomi: 学 gaku + 校 kō (contracted to gakkō), 勉 ben + 強 kyō, 図 to + 書 sho + 館 kan, 電 den + 車 sha, 会 kai + 社 sha. The pattern is so consistent that you can use it as a reading-prediction tool even before learning any individual character's full reading list.

The minority: kun-kun (訓訓)

A meaningful minority of all-kanji compounds use kun'yomi for both characters. These are typically native Japanese words (wago). Their kanji are spelling choices for an inherited spoken compound, not Sino-Japanese coinages.710 Canonical examples include 山道 (yama-michi, "mountain path"), 花見 (hana-mi, "flower viewing"), 物語 (mono-gatari, "tale"), 朝日 (asa-hi, "morning sun"), and 手紙 (te-gami, "letter").711

The word 手紙 (tegami) illustrates both kun-kun reading and rendaku, or sequential voicing (kami → -gami), in a single compound. It is canonical wago even though it is written in two kanji.13 Recognizing kun-kun as a category prevents a learner from over-applying the on'yomi-default rule to words that simply do not follow it.

山道(やまみち)は危(あぶ)ないです。6
"Mountain paths are dangerous."

春(はる)に花見(はなみ)をします。6
"We go cherry-blossom viewing in spring."

The exceptions worth naming early: jūbako and yutō

Two further patterns mix on'yomi and kun'yomi inside one compound. 重箱読み (jūbako-yomi) names the on-kun pattern: the first kanji takes its on'yomi, the second its kun'yomi. The pattern is named after 重箱 (jū-bako, "tiered box"), which itself follows the pattern.1411

湯桶読み (yutō-yomi) names the kun-on pattern: the first kanji takes its kun'yomi, the second its on'yomi, named after 湯桶 (yu-tō, "hot-water pail").1411

Canonical jūbako examples include 台所 (dai-dokoro, "kitchen"), 本屋 (hon-ya, "bookstore"), 残高 (zan-daka, "balance"), and 団子 (dan-go, "dumpling").14 Canonical yutō examples include 場所 (ba-sho, "place"), 見本 (mi-hon, "sample"), 雨具 (ama-gu, "rain gear"), and 手本 (te-hon, "model").14 These are minority categories, but they are common enough that you should know the labels before meeting your first irregular reading.

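| Pattern | First kanji | Second kanji | Canonical example | Reading |
| --- | --- | --- | --- | --- |
| 音音 (on-on) | on'yomi | on'yomi | 学校 | gakkō |
| 訓訓 (kun-kun) | kun'yomi | kun'yomi | 山道 | yamamichi |
| 重箱 (jūbako) | on'yomi | kun'yomi | 台所 | daidokoro |
| 湯桶 (yutō) | kun'yomi | on'yomi | 場所 | basho |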
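The four-way classification above amounts to a lookup on the reading type of each kanji. The following sketch shows that lookup under one big assumption: the on/kun label for each character is supplied by hand. Deciding whether a given reading is on'yomi or kun'yomi needs a kanji dictionary and is not attempted here. All names (`Reading`, `PATTERN_NAMES`, `reading_pattern`) are made up for this example.

```python
from enum import Enum


class Reading(Enum):
    ON = "on"
    KUN = "kun"


# (first kanji, second kanji) -> pattern label used in this article
PATTERN_NAMES = {
    (Reading.ON, Reading.ON): "on-on (音音)",
    (Reading.KUN, Reading.KUN): "kun-kun (訓訓)",
    (Reading.ON, Reading.KUN): "jūbako (重箱読み)",
    (Reading.KUN, Reading.ON): "yutō (湯桶読み)",
}


def reading_pattern(first: Reading, second: Reading) -> str:
    """Name the reading pattern of a two-kanji compound."""
    return PATTERN_NAMES[(first, second)]


# Hand-annotated examples taken from the text above.
EXAMPLES = {
    "学校": (Reading.ON, Reading.ON),
    "山道": (Reading.KUN, Reading.KUN),
    "台所": (Reading.ON, Reading.KUN),
    "場所": (Reading.KUN, Reading.ON),
}

for word, (first, second) in EXAMPLES.items():
    print(word, reading_pattern(first, second))
```

Jukujikun and ateji do not fit this lookup at all, because their readings are not assigned kanji by kanji.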

When the rule breaks: jukujikun and ateji

Two further categories are not really exceptions to the on'yomi-default rule. They are separate phenomena that need their own names. The companion article Jukujikun covers the case where a single reading is assigned to the whole compound rather than built from individual kanji readings. The canonical example is 今日 (kyō, "today"), in which neither kyō nor any sub-syllable corresponds to either kanji's standard on'yomi or kun'yomi.12 Other examples include 明日 (ashita), 大人 (otona), 今朝 (kesa), and 七夕 (tanabata).12

The companion article Ateji covers the inverse case: kanji selected for their sound rather than their meaning. The canonical example 寿司 (sushi) borrows 寿 (ju, su / kotobuki, "long life") and 司 (shi / tsukasa, "administer") purely for the syllabic match.15 The on'yomi-default rule predicts the reading of a typical Sino-Japanese kanji-only compound. It makes no prediction at all about jukujikun or ateji. That is why you should treat those two categories as separate phenomena, not as exceptions to be patched.1215

今日(きょう)は寒(さむ)いです。6
"It is cold today."

寿司(すし)を食(た)べました。6
"I ate sushi."

Do not apply the on'yomi-default rule to 今日 or 寿司

Both compounds look like ordinary on-on jukugo on the page, but they are not. 今日 is jukujikun (whole-compound reading), and 寿司 is ateji (sound-only kanji choice). The reading must be memorized, not derived.1215

How jukugo are built: a short tour of the construction patterns

The classical four patterns, named

Japanese 中学校国語 (junior high 国語) curricula teach a fixed classification of 二字熟語 construction patterns. The four core patterns used for an entry-level orientation are 修飾, 並列, 主述, and 述目.57

修飾 (shūshoku), modifier + head. The upper character modifies the lower. Canonical examples: 青空 (aozora, "blue sky," blue modifies sky), 強風 (kyōfū, "strong wind"), 海水 (kaisui, "seawater"), 国道 (kokudō, "national road").5

並列 (heiretsu), coordination. Two characters with similar or opposite meanings sit side by side as equals. Synonym coordination: 道路 (dōro, "road"; 道 way + 路 path), 永久 (eikyū, "eternal"; 永 long + 久 lasting), 絵画 (kaiga, "painting"; 絵 picture + 画 picture). Antonym coordination: 男女 (danjo, "men and women"), 開閉 (kaihei, "open and close"), 強弱 (kyōjaku, "strong and weak").5

主述 (shujutsu), subject + predicate. The upper character is the subject and the lower is the predicate. Canonical examples: 地震 (jishin, "earthquake"; 地 earth + 震 shakes), 骨折 (kossetsu, "bone fracture"; 骨 bone + 折 breaks), 円高 (endaka, "strong yen"; 円 yen + 高 high).5

述目 (jutsumoku), verb + object. The compound reads in reverse semantic order, so the lower kanji is the grammatical object of the upper verb. This pattern mirrors Classical Chinese V-O (verb-object) syntax, making it the construction pattern most visibly inherited from Chinese.7 Canonical examples: 読書 (dokusho, "reading books"; read + book), 握手 (akushu, "handshake"; grasp + hand), 開門 (kaimon, "opening the gate").5

The school-grammar inventory extends beyond these four. Additional categories include 打消し (uchikeshi, negation prefix: 不安, 無休, 非常, 未来), 接頭 / 接尾 (settō / setsubi, affixation: 貴社, 個性, 液化), and 同字反復 (dōji-hanpuku, reduplication: 人々, 山々, 時々).5 This orientation article names only the core four; a planned sibling article unpacks the full inventory.

青空(あおぞら)がきれいです。6
"The blue sky is beautiful."

地震(じしん)がありました。6
"There was an earthquake."

毎日(まいにち)読書(どくしょ)をします。6
"I read books every day."

男女(だんじょ)平等(びょうどう)は大切(たいせつ)です。6
"Gender equality is important."

Why naming the pattern helps reading and meaning

Identifying the construction pattern tells you which kanji is the head and therefore what the word fundamentally means. In 青空 the head is 空 (sky) and the modifier is 青 (blue), so the compound is "a kind of sky."7 In a verb-object 述目 compound, the object follows the verb, the reverse of ordinary Japanese object-verb order. That reversal is the diagnostic that lets a learner predict that 読書 means "reading [books]," not "books [for] reading."

The payoff is structural literacy. Two facts about a new compound, its reading pattern and its construction pattern, let you produce a confident first-pass reading and a confident first-pass meaning even before any dictionary lookup. A planned sibling article walks the four patterns in depth.

Longer compounds: three, four, and beyond

Three- and four-kanji compounds are typically parsed by chunking them into 2+1, 1+2, or 2+2 parts, and the same chunking extends to longer compounds. Those parts are usually themselves jukugo or affixes. For example, the five-kanji 国際関係論 (kokusai-kankei-ron, "international relations theory") parses as 2+2+1: 国際 (international) + 関係 (relations) + 論 (theory). 株式会社 (kabushiki-gaisha, "joint-stock company") parses as 株式 (stock) + 会社 (company).37

Three-kanji compounds often contain an affix as the first or last character (a 接頭辞 prefix or 接尾辞 suffix). Four-kanji compounds tend to be either a union of two two-kanji compounds (国際関係, 経済成長) or a four-character idiom with a discrete classical source.37 The four-character idiomatic subset gets a dedicated planned sibling article.
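The chunking described above can be imitated with a greedy longest-match pass over a word list. This is a toy sketch: the lexicon below is a hand-picked set that contains only the words in this section, and the function name `chunk` is invented for the example. A greedy left-to-right pass can mis-split 1+2 compounds, so real segmenters rely on large dictionaries and statistical scoring.

```python
from typing import List, Set

# Toy lexicon: an assumption for illustration, not a real dictionary.
LEXICON: Set[str] = {"国際", "関係", "株式", "会社", "経済", "成長"}


def chunk(compound: str, lexicon: Set[str], max_len: int = 2) -> List[str]:
    """Split a kanji string into known chunks, preferring longer matches.

    A character that starts no known chunk is emitted on its own, which is
    how a trailing affix such as 論 ends up as a 1-kanji part.
    """
    parts: List[str] = []
    i = 0
    while i < len(compound):
        for size in range(min(max_len, len(compound) - i), 0, -1):
            piece = compound[i:i + size]
            if size == 1 or piece in lexicon:
                parts.append(piece)
                i += size
                break
    return parts


print(chunk("国際関係論", LEXICON))  # ['国際', '関係', '論']
print(chunk("株式会社", LEXICON))    # ['株式', '会社']
```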

株式会社(かぶしきがいしゃ)に勤(つと)めています。6
"I work at a joint-stock company."

国際(こくさい)関係(かんけい)論(ろん)を勉強(べんきょう)しています。6
"I am studying international relations theory."

The word 株式会社 also illustrates rendaku across the 2+2 boundary: 会社 (kaisha) voices to -gaisha after 株式.1316 The voicing lands on the first sound of the second two-kanji unit, exactly at the morphological seam. This suggests that 2+2 chunking reflects real morphology and is not just an artifact of how the word is written.

Good to know

Jukugo skew formal: register implications for speech

Sino-Japanese kango jukugo tend to carry a more formal, more written-leaning register than their wago paraphrases. This asymmetry is so consistent that standard Japanese linguistics references treat it as a definitional feature of kango.1017 The same basic meaning can often be expressed in two ways in Japanese: once with a kango jukugo and once with a wago verb-plus-noun phrase. The kango version sounds bookish, bureaucratic, or businesslike.

Canonical contrast pairs (kango formal vs wago neutral): 帰宅する (kitaku suru, "return home") vs 家に帰る (ie ni kaeru); 食事をする (shokuji o suru, "have a meal") vs ご飯を食べる (gohan o taberu); 購入する (kōnyū suru, "purchase") vs 買う (kau, "buy"); 開始する (kaishi suru, "commence") vs 始める (hajimeru, "start").10 In casual conversation, "今日は帰宅します" lands as oddly formal, while "今日は家に帰ります" sounds neutral. The trade-off runs in both directions: kango-heavy speech sounds bookish in casual settings and appropriately polished in business settings.

Same kanji, different role: the standalone vs. compound switch

The on'yomi-default rule has one everyday consequence that catches almost every learner: a single kanji read in isolation typically uses its kun'yomi, while the same kanji inside a jukugo typically uses its on'yomi. The kanji 人 is the textbook illustration. Alone, it is hito (kun'yomi). Inside 日本人 (nihon-jin), 三人 (san-nin), or 人間 (ningen), it appears as jin or nin (on'yomi).311

私(わたし)は日本人(にほんじん)です。6
"I am Japanese."

あそこに人(ひと)がいます。6
"There is a person over there."

Naming the switch explicitly defuses the "wait, I thought that kanji was hito?" confusion that dominates the N5 and N4 stages.3 If you have internalized that 人 = hito standalone and 人 = jin/nin inside a compound, you have already absorbed roughly half of what the on'yomi-default rule does in practice.

Why your dictionary search feels different for jukugo

A jukugo lookup returns one headword with one canonical reading, one part of speech (often a する-verb noun), and one fused definition. The entry does not break down into the readings or meanings of the individual kanji.118 This is structural, not editorial. Dictionaries treat 熟語 as fixed lexical items because the meanings have set ("熟" = matured) beyond the sum of the parts. That is exactly the etymology the article opens with.1

This explains the "I know both kanji but cannot guess the word" failure mode at the beginner level. The on'yomi-default rule helps with reading prediction. The four construction patterns push meaning prediction most of the way to a confident guess. What they cannot do is bypass the lexical fact that some compounds carry meanings the individual kanji alone never imply.73

A note on rendaku inside jukugo

Rendaku (連濁, sequential voicing) frequently affects native-Japanese wago compounds. It affects Sino-Japanese kango compounds less often, and gairaigo compounds very rarely.1316 The canonical example is wago: 折り紙 (ori-gami, "origami"), where 紙 (kami) voices to -gami in the compound. The unvoiced form would not be a Japanese word.13

The most common everyday cases of voicing inside a compound (折り紙, 手紙, 青空 aozora, 物語 monogatari) all sit inside wago, not kango. The dedicated article on rendaku in kanji compounds gives the full rule, including Lyman's Law, which blocks voicing when the second element already contains a voiced obstruent.13
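The blocking condition in Lyman's Law is mechanical enough to sketch in a few lines. The example below works on hiragana readings and models only two things: the voicing of the initial kana and the block when a voiced obstruent is already present. It does not predict whether rendaku actually occurs in a given word, which is lexically irregular. The names `lyman_blocks` and `voice_initial` are invented for this illustration.

```python
UNVOICED = "かきくけこさしすせそたちつてとはひふへほ"
VOICED = "がぎぐげござじずぜぞだぢづでどばびぶべぼ"

# Map each unvoiced kana to its voiced counterpart (か -> が, は -> ば, ...).
VOICING = dict(zip(UNVOICED, VOICED))
VOICED_SET = set(VOICED)


def lyman_blocks(second: str) -> bool:
    """True if the second element already contains a voiced obstruent."""
    return any(ch in VOICED_SET for ch in second)


def voice_initial(second: str) -> str:
    """Voice the first kana of the second element unless Lyman's Law blocks it."""
    if not second or lyman_blocks(second) or second[0] not in VOICING:
        return second
    return VOICING[second[0]] + second[1:]


print(voice_initial("かみ"))    # がみ, as in 手紙 (te + kami)
print(voice_initial("そら"))    # ぞら, as in 青空 (ao + sora)
print(voice_initial("かぜ"))    # かぜ, blocked because ぜ is already voiced
```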

References

Footnotes

  1. 三省堂編修所 (ed.). 『大辞林』第四版. 三省堂, 2019. Entry: 熟語.

  2. 国立国語研究所. 『雑誌九十種の用語用字』(1994 magazine vocabulary survey). Comparative figures cited in 山崎誠・小沼悦 (see note 8; Annual Meeting of the Association for Natural Language Processing, 2004, P6-3).

  3. Halpern, Jack. The Kodansha Kanji Learner's Dictionary. Kodansha, 1999 (rev. 2013). Introduction on kanji compounds and the on-on reading default.

  4. Joyce, Terry; Hodošček, Bor; Nishina, Kikuko. "Orthographic representation and variation within the Japanese writing system: Some corpus-based observations." Written Language and Literacy, vol. 15(2), 2012, John Benjamins. Headword counts from Iwanami『広辞苑』第五版 (Kōjien, 5th ed., 1998) reported within.

  5. 二字熟語の構成(組み立て)12種類と一覧表. 二字熟語の百科事典. https://proverb-encyclopedia.com/two/kumiawase/ (school-grammar reference reflecting Japanese 中学校国語 curriculum classifications). (limitation): pedagogical aggregator; used only for the canonical Japanese school-grammar category names, all of which match the standard Japanese junior-high 国語 syllabus.

  6. 日本語能力試験公式ウェブサイト (JLPT official site). "N5 認定の目安" / Can-do statements. https://www.jlpt.jp/about/levelsummary.html

  7. Kageyama, Taro. "Word Formation." In Shibatani, Miyagawa & Noda (eds.), The Handbook of Japanese Lexicon and Word Formation. De Gruyter Mouton, 2016 (originally Blackwell, 2009 edition referenced as chapter 10).

  8. 山崎誠・小沼悦. 「現代雑誌における語種構成」. 言語処理学会第10回年次大会発表論文集, 2004. https://www.anlp.jp/proceedings/annual_meeting/2004/pdf_dir/P6-3.pdf

  9. 国立国語研究所 (National Institute for Japanese Language and Linguistics). 『現代雑誌九十種の用語用字』(Vocabulary and Characters in Ninety Modern Magazines), 1964. Survey of magazine corpus from 1956. Summary figures reproduced in 山崎誠・小沼悦, 「現代雑誌における語種構成」, 言語処理学会第10回年次大会発表論文集, 2004, P6-3. https://www.anlp.jp/proceedings/annual_meeting/2004/pdf_dir/P6-3.pdf

  10. Shibatani, Masayoshi. The Languages of Japan. Cambridge University Press, 1990, pp. 142–149 (vocabulary strata: wago, kango, gairaigo, konshugo).

  11. Wikipedia. "Kanji." Section on readings and mixed-reading patterns. https://en.wikipedia.org/wiki/Kanji (limitation): used only as a secondary index to the Japanese-school-grammar categories it inherits from 国語 textbooks; primary claims paired with 5, 14, or 7.

  12. Wikipedia. "Jukujikun" / "Kanji § Special readings." https://en.wikipedia.org/wiki/Kanji#Special_readings

  13. Vance, Timothy J. The Sounds of Japanese. Cambridge University Press, 2008, ch. 6 (rendaku).

  14. ベネッセ教育情報, 「重箱読み・湯桶読みとは?代表例と一緒に解説」, 中学国語 定期テスト対策. https://benesse.jp/kyouiku/teikitest/chu/japanese/japanese/c00534.html

  15. Wikipedia. "Ateji." https://en.wikipedia.org/wiki/Ateji

  16. Irwin, Mark. "Rendaku-Based Lexical Hierarchies in Japanese: The Behaviour of Sino-Japanese Mononoms in Hybrid Noun Compounds." Journal of East Asian Linguistics, 14(2), 2005, pp. 121–153. http://www-h.yamagata-u.ac.jp/~irwin/site/Home_files/Irwin,%20Rendaku-Based%20Lexical%20Hierarchies%20in%20Japanese%20The%20Behaviour%20of%20Sino-Japanese%20Mononoms%20in%20Hybrid%20Noun%20compounds,%20Journal%20of%20East%20Asian%20Linguistics,%202005.pdf

  17. Frellesvig, Bjarke. A History of the Japanese Language. Cambridge University Press, 2010, ch. 11 (Sino-Japanese vocabulary stratum and on'yomi layers).

  18. 新村出 (ed.). 『広辞苑』第七版. 岩波書店 (Iwanami Shoten), 2018.