Lookalike Katakana: How to Tell the Most-Confused Kana Apart

Lookalike katakana cluster around a few near-identical shapes. The differences often come down to the direction of one stroke or the count of one bar. For the two headline pairs (シ vs ツ and ソ vs ン), the master test is stroke direction. The shape family around ク, ワ, ウ, and フ resolves by stroke count and the presence of a top bar.12

Overview

Katakana confusion is different from hiragana confusion. For the worst-offending pairs, it has one underlying mechanism.1 The pairs that beginners stall on are not random visual coincidences. They follow from how katakana were derived from kanji in the 9th century, and they all yield to either a stroke-direction test or a stroke-count test.12

Why katakana confusion is its own problem

Katakana was developed in the 9th century by Buddhist monks in Nara to transliterate texts from India. They did this by taking parts of man'yōgana characters, early kanji used to write Japanese sounds, as a form of shorthand.1 The script's name reflects this origin: kata (片) means "partial" or "fragmented." Each sign is one component of a kanji rather than a whole kanji simplified down.1 The Wikipedia entry gives カ (ka) as the worked example: it "comes from the left side of ka (加; lit. 'increase')."1

Hiragana developed by a different route. Each hiragana is "a simplified cursive rendering of a whole kanji," derived through the sōsho cursive style, with あ (a) coming from 安 (an).3

This component-extraction process gives katakana its short, angular, near-isolated strokes. Several characters end up with two or three strokes inside near-identical outer shapes. That is why the same handful of pairs reappear on every learner's confusion list.1

The canonical lookalike cluster

Wikipedia names the four-kana cluster explicitly: "Characters shi シ, tsu ツ, so ソ, and n ン look very similar in print except for the slant and stroke shape. These differences in slant and shape are more prominent when written with an ink brush."1

The source kanji for the ten kana covered in this article are listed below. The table is useful when a mnemonic later in the article refers to the source shape.

| Kana | Source kanji | Notes |
| --- | --- | --- |
| シ | 之 | Simplified in the Heian period from man'yōgana 之.4 |
| ツ | 川 / 州 / 津 / 闘 | Disputed; the Nihon Kokugo Daijiten lists these four candidates.5 |
| ソ | 曽 | Simplified in the Heian period from man'yōgana 曽.6 |
| ン | 尓 (disputed) | Possibly from the first two strokes of man'yōgana 尓, or from a symbol indicating the nasal sound (撥音, hatsuon).7 |
| ク | 久 | Simplified in the Heian period from man'yōgana 久.8 |
| ワ | 和 | Simplified in the Heian period from man'yōgana 和.9 |
| ウ | 宇 (top part) | Taken from the top part of the character.10 |
| フ | 不 | Simplified in the Heian period from man'yōgana 不.11 |
| ヲ | 乎 | Simplified in the Heian period from man'yōgana 乎.12 |
| ノ | 乃 | Simplified in the Heian period from man'yōgana 乃.13 |

The master rule: stroke direction

The two headline confusions (シ vs ツ and ソ vs ン) share one underlying mechanism. シ and ン finish with an upward sweep. ツ and ソ finish with a downward sweep.214 Every other surface cue, such as slant, mark angle, and stroke-end thickness in brush fonts, follows from that direction choice.115

The sci.lang.japan FAQ pairs the four on the same horizontal-vs-vertical axis. That print-shape pattern follows from the underlying stroke direction.2 LearnTheKana groups the same kana the same way: シ and ン are "more HORIZONTAL in every aspect (both the bottom line and the dots)," while ツ, ソ, and ノ are "more VERTICAL in every aspect (both the bottom line and the dots)."14

The diagram above gives the full diagnostic for the four worst-offending kana. Stroke count then separates the 3-stroke pair (シ, ツ) from the 2-stroke pair (ソ, ン).16
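The two tests combine into a four-way lookup. The following is a minimal Python sketch, assuming the reader has already observed the stroke count and the direction of the long stroke. The name `identify_kana` and the `"up"`/`"down"` labels are illustrative and do not come from any cited source.

```python
# Lookup keyed on (stroke count, direction of the final long stroke).
# The keys encode the article's two tests; all names here are illustrative.
DIAGNOSTIC = {
    (3, "up"): "シ",    # shi: three strokes, long stroke rises
    (3, "down"): "ツ",  # tsu: three strokes, long stroke falls
    (2, "up"): "ン",    # n: two strokes, long stroke rises
    (2, "down"): "ソ",  # so: two strokes, long stroke falls
}


def identify_kana(stroke_count: int, final_direction: str) -> str:
    """Return the kana matching the two observations, or raise ValueError."""
    try:
        return DIAGNOSTIC[(stroke_count, final_direction)]
    except KeyError:
        raise ValueError(
            f"no kana in this cluster has {stroke_count} strokes "
            f"ending {final_direction!r}"
        ) from None


print(identify_kana(3, "up"))    # シ
print(identify_kana(2, "down"))  # ソ
```

Each of the four kana occupies exactly one cell of the table, which is why two observations are enough to separate them.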

How this article is organized

Each of the four core confusion groups follows the same pattern: what the kana share, the structural diagnostic, the stroke-order anchor, and one durable distinguisher. A short cross-script note at the end points to hiragana-katakana confusions, which are out of scope here, and shows where they are treated separately.

What this article does not cover

Hiragana-katakana cross-script lookalikes (り vs リ, へ vs ヘ, か vs カ, や vs ヤ) appear in the Wiktionary "Easily confused Japanese kana" appendix as a separate inventory.17 The diagnostic question there is "which script is this?" It uses a different framework: word context, surrounding script, and font weight. The within-katakana rules in this article do not transfer to it.17

Per-kana stroke instruction belongs in the dedicated katakana stroke-order article on this site. Full-chart reading drills belong in the katakana chart article. Both are referenced where relevant rather than rewritten here.

The four core within-katakana lookalike groups

The within-katakana problem reduces to four groups. The first two yield to stroke direction. The last two yield to stroke count and the presence or absence of a top bar.

The four groups at a glance

| # | Group | One-line structural diagnostic |
| --- | --- | --- |
| 1 | シ vs ツ | Two short marks stacked vertically on the left in シ; side by side across the top in ツ.214 |
| 2 | ソ vs ン | Long stroke runs top-down in ソ; bottom-up in ン.1815 |
| 3 | ク vs ワ (with ウ, フ) | Stroke counts: フ = 1, ク = 2, ワ = 2, ウ = 3; ワ adds a top horizontal bar, ウ adds a top crown.1916 |
| 4 | ヲ vs フ | Top-bar count: ヲ has 2 horizontal bars on top (3 strokes); フ has none (1 stroke).1612 |

シ vs ツ: the stroke-direction headline pair

シ (shi) and ツ (tsu) are the single most-cited katakana confusion in beginner teaching. Both are 3-stroke kana with two short marks and one long sweeping stroke. At small print sizes, they share an almost identical outer shape.114

What they share

Both kana have stroke count 3.16 Both consist of two short marks plus one long sweeping stroke. In print, their outer shapes are nearly identical at small sizes.114

The structural difference: stroke direction

The third stroke of シ runs bottom-up, from south to north, and ends with an upward flick. The third stroke of ツ runs top-down, from north to south, and ends with a downward sweep. The sci.lang.japan FAQ states the visible consequence directly: "The lines in シ (shi) are more horizontal than vertical, whereas ツ (tsu) is more vertical than horizontal."2

LearnTheKana groups the broader cluster on the same axis: シ is in the "more HORIZONTAL" subgroup; ツ is in the "more VERTICAL" subgroup.14 Wikipedia adds the print-vs-handwriting consequence: the differences in slant and shape "are more prominent when written with an ink brush."1

The short-mark orientation as a print-font cue

In static text, readers cannot watch the stroke direction. The two short marks are the visible clue. In シ, the marks stack vertically on the left and lean horizontally, aligning to the left vertical edge of the outer shape. In ツ, the marks sit across the top edge and lean vertically.14

The "S for Side, T for Top" cue is the print-font shorthand widely cited in beginner pedagogy: the short marks of Shi sit on the side; the short marks of Tsu sit on the top.1420

Stroke-order anchor

Both kana are 3 strokes.16 The first two strokes are the short marks. The third stroke is the long sweeping stroke whose direction is the diagnostic.16

A reader who can write both correctly already knows the difference, because the writing motion encodes the direction. The disambiguation mainly fails for readers who learned to recognize the kana from print without ever drawing them.

One durable distinguisher

Tofugu's image is the most-cited mnemonic for the pair: シ "looks like a smiley face, but something is wrong with it. Both eyes are sideways and stacked on top of each other like some deep sea fish," whereas ツ has the same two marks rotated so they sit "across the top," yielding "two needles and thread" rather than a face.20

The mnemonic-free test is shorter. Look at the two short marks. Stacked vertically on the left, leaning horizontal: シ. Stacked horizontally across the top, leaning vertical: ツ.214

The pair shows up together inside the everyday loanword シャツ ("shirt"), separated only by a small ャ, so the diagnostic gets exercised inside one common word.

シャツの色は何色?21
"What color is your shirt?"

シャツを脱いだ。21
"I took off my shirt."

Practitioner shorthand: sushi vs sutsu

Beginner learning communities summarize the シ/ツ slip as the "sushi vs sutsu" reading error: スシ ("sushi") misread as スツ ("sutsu"). The mistake happens because the two kana differ on the page only in what the third-stroke direction leaves behind: the slant and the placement of the short marks. The framing is widespread in forum posts and YouTube tutorials. Treat it as practitioner consensus rather than an academic claim.
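For drill material, the same slip can be generated mechanically. The sketch below is a hypothetical helper, not something taken from the cited sources: it swaps the kana of the two stroke-direction pairs and lists every misreading a given katakana word could produce.

```python
from itertools import product

# Within-katakana pairs that differ only by stroke direction.
SWAPS = {"シ": "ツ", "ツ": "シ", "ソ": "ン", "ン": "ソ"}


def misreadings(word: str) -> list[str]:
    """List every variant of `word` with one or more lookalike kana swapped."""
    # Each position offers the original kana and, if it has one, its lookalike.
    options = [(ch, SWAPS[ch]) if ch in SWAPS else (ch,) for ch in word]
    variants = ("".join(combo) for combo in product(*options))
    return [v for v in variants if v != word]


print(misreadings("スシ"))      # ['スツ']
print(misreadings("パソコン"))  # ['パソコソ', 'パンコン', 'パンコソ']
```

A word with several lookalike kana produces several candidates, which makes such words useful targets for reading practice.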

ソ vs ン: the same rule applied to two strokes

ソ (so) and ン (n) repeat the シ/ツ diagnostic in a smaller form. Each kana has only two strokes, so the long second stroke carries the direction signal clearly.

What they share

Both kana have stroke count 2.16 Both consist of one short mark plus one longer sweeping stroke. At small sizes, their shapes can look interchangeable.118

The structural difference: stroke direction (again)

The Japanese Page states the rule cleanly: "The small dash for 'so' points South (down). … The small dash for 'n' almost points North (up)."18 The stroke-order consequence is just as clean: "SO ソ starts at the top" and "N ン starts at the bottom."18

SoraNews adds the brush-font cue on the longer stroke: "For 'so,' that first stroke curves slightly downwards, while for 'n' it curves up. The more significant difference, though, is in the direction you write the longer stroke. For 'so,' it's a downward stroke, and for 'n' it's an upwards one. That makes 'so's' longer stroke thick at the top, and 'n's' thicker at the bottom."15

The sci.lang.japan FAQ summarizes the print-shape axis: "The lines in ン (n) are more horizontal than vertical, whereas ソ (so) is more vertical than horizontal."2

The short-mark angle as a print-font cue

The Japanese Page's alignment rule gives two clues at once. "The dash in 'so' is lower and lines up at top (almost). Also, the dash in the 'n' is higher and lines up to the left."18 In other words, ソ's short mark lines up with the top of the long stroke and points south. ン's short mark lines up with the left end of the long stroke and points north.

The "ン looks like a lowercase n" image is implicit in the bottom-up long stroke: the rightward climb of the long stroke ends with the upward flick that resembles the right diagonal of the Latin lowercase n.18

Stroke-order anchor

Both kana are 2 strokes.16 The first stroke is the short mark. The second stroke is the long stroke, and its direction is the diagnostic.18

ン cannot start a word in Japanese.18 Position therefore helps in real text: when a word-initial kana could be either ソ or ン, it has to be ソ. No such shortcut exists at word-final position, because both kana can end a katakana word, though ン is far more common there.
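The positional rule is simple enough to state as a one-line check. The sketch below is illustrative only: `fix_initial` is a hypothetical name, and the function treats the word-initial rule as absolute, which is an assumption made for the example.

```python
def fix_initial(reading: str) -> str:
    """Apply the word-initial rule: a leading ン is re-read as ソ."""
    # Assumption for this sketch: no word in the input legitimately starts with ン.
    if reading.startswith("ン"):
        return "ソ" + reading[1:]
    return reading


print(fix_initial("ンファー"))  # ソファー
print(fix_initial("パン"))      # パン (a final ン is left alone)
```

The check only touches the first kana, matching the point that position offers no help at the end of a word.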

One durable distinguisher

The mnemonic-free test is binary: trace the long stroke's final direction. Ends going up = ン. Ends going down = ソ.1815

The word パソコン ("PC") puts ソ and ン inside the same four-mora loanword, both in non-initial positions. The up/down test fires twice in one word.

ソファーにすわった。21
"I sat on the sofa."

パソコンは使える?21
"Do you know how to use a computer?"

パンを買った。21
"I bought some bread."

Where the error costs the most

The ソ vs ン error matters most when readers sound out unfamiliar foreign names written in katakana, such as ジョンソン or ハンソン. In those cases, the kana carries sound information that cannot be recovered from context.15

ク vs ワ (and the ウ/フ neighborhood): the silhouette family

The second cluster does not yield to stroke direction. ク, ワ, ウ, and フ share an outer shape: a top-right corner curving down-left into a sweeping stroke. They differ in what sits inside or on top of that shape.219 The diagnostic is shape and stroke count.

What they share

All four kana share an outer shape: a top-right corner curving down-left into a sweeping stroke.219 Wiktionary's "Easily confused Japanese kana" inventory groups five kana on this shape axis (ウ, ワ, フ, ラ, ヲ). The in-scope four for this section are ウ, ワ, フ, plus ク from the adjacent タ/ク/ヌ cluster.17

The four diagnostics

Stroke counts are the cleanest single diagnostic in this group. フ = 1 stroke, ク = 2 strokes, ワ = 2 strokes, ウ = 3 strokes.16

Each kana's structural features follow from those counts:

  • フ is the envelope alone, drawn as a single continuous stroke. No top crown, no internal stroke.16
  • ク is the envelope plus a short diagonal stroke inside the top of the envelope (2 strokes). No top horizontal bar, no top crown.16
  • ワ is the envelope plus a horizontal bar across the top of the envelope (2 strokes). No top crown.19
  • ウ is the envelope plus the top horizontal bar plus a small crown stroke on top of the bar (3 strokes). The crown is the diagnostic for ウ specifically.1916

LearnTheKana frames the ウ/ワ split directly: ウ and ワ share "a slight vertical dip coming down from the horizontal line towards the left side." ウ has "a small vertical stick coming up from the middle" while ワ "lacks this element."19 The sci.lang.japan FAQ phrases the same split more tightly: "ウ (u) has a small line on the top but ワ (wa) has none."2

The diagram captures the roof-and-hat hierarchy. フ has neither. ク has neither but adds an internal diagonal. ワ has a roof. ウ has a roof with a hat on top.

Stroke-order anchor

Stroke counts differ usefully across the four: フ = 1, ク = 2, ワ = 2, ウ = 3.16 Counting strokes alone disambiguates フ and ウ. The ク/ワ pair is the only stroke-count tie in the group. The presence or absence of the top horizontal bar is the structural diagnostic between them.
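The count-then-bar logic can be written out as a small lookup. The sketch below encodes the features exactly as this section describes them. `Shape`, `top_bar`, and `identify_silhouette` are illustrative names, and the feature model is this article's simplification rather than a formal description of the glyphs.

```python
from typing import NamedTuple


class Shape(NamedTuple):
    strokes: int
    top_bar: bool  # the section's "roof"; an illustrative feature name


# Feature values follow this section's description of each kana.
FAMILY = {
    Shape(1, False): "フ",
    Shape(2, False): "ク",
    Shape(2, True): "ワ",
    Shape(3, True): "ウ",
}


def identify_silhouette(strokes: int, top_bar: bool) -> str:
    """Map the observed features to a kana; raise ValueError if none match."""
    kana = FAMILY.get(Shape(strokes, top_bar))
    if kana is None:
        raise ValueError(f"no match for strokes={strokes}, top_bar={top_bar}")
    return kana


# Group by stroke count alone to expose the single tie in the family.
by_count: dict[int, list[str]] = {}
for shape, kana in FAMILY.items():
    by_count.setdefault(shape.strokes, []).append(kana)

print(by_count)                      # {1: ['フ'], 2: ['ク', 'ワ'], 3: ['ウ']}
print(identify_silhouette(2, True))  # ワ
```

Grouping by stroke count shows that only the 2-stroke bucket holds more than one kana, so the second feature is needed for that pair alone.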

One durable distinguisher

The roof-and-hat hierarchy is the entire diagnostic, and the image is size-independent: it works at handwriting size and at print size.19

フ has no roof. ク has no roof and adds an internal short diagonal. ワ has a roof (the top horizontal bar). ウ has a hat on top of the roof.1916

The image lines up with the etymology for ウ. The source kanji 宇 contributed its "top part" to the modern shape, so the crown-on-bar shape preserves the upper structure of the source.10

ワインです。21
"It's wine."

どこのクラスなの?21
"Which class are you in?"

The word クラス places ク next to ラ, so the eye must immediately distinguish the no-roof envelope (ク) from an adjacent similar shape. ワイン puts ワ in initial position, where the roof bar is visually most prominent.

ヲ vs フ: the rare-kana mirror

ヲ (wo) and フ (fu) share the same outer shape as the previous group. ヲ stacks two horizontal bars on top of it. The diagnostic is just the count of those top bars, and the confusion is small in practice because ヲ is rare in modern text.

What they share

Both end with a sweeping curve to the lower-left.16 ヲ adds two horizontal bars across the top of the same outer shape that ク, ワ, and ウ also share.1612

The structural difference

ヲ has 3 strokes: two horizontal bars on top plus the sweeping bottom stroke. フ has 1 stroke: the bare outer shape only.16 The horizontal-bar count is the entire diagnostic.

Where you will actually see ヲ

ヲ derives from the man'yōgana kanji 乎, simplified in the Heian period.12 In modern Japanese, ヲ is "seldom used": the hiragana を is used almost exclusively as the direct-object particle, and particles are usually written in hiragana. The "wo" sound in foreign words is rendered with ウォ rather than ヲ.12

All-katakana text, which would force every particle including を into katakana, is rare in modern usage.22 ヲ survives in retro all-katakana video games, such as Downtown Nekketsu Monogatari / River City Ransom, the original Metal Gear, and Moero!! Junior Basket. It also appears in stylized contexts like ヲタク, a katakana respelling of otaku for ironic or subcultural effect.22

For a beginner reader, the practical result is uneven exposure: a learner reading mostly contemporary material will encounter フ many times for each ヲ.1222

Hiragana to katakana cross-script lookalikes

Cross-script confusions (り vs リ, へ vs ヘ, か vs カ, や vs ヤ) are listed separately in the Wiktionary "Easily confused Japanese kana" appendix.17 The diagnostic question is different: "which script is this?" rather than "which kana within katakana?"

It depends on word context, surrounding script, and font weight rather than on a single structural feature within one script.17 The within-katakana rules in this article (stroke direction, stroke count, top-bar count) do not transfer to cross-script confusion.
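In digital text, the "which script is this?" question has a mechanical answer, because Unicode stores hiragana and katakana at different code points. The sketch below is a minimal illustration with hypothetical function names. It uses the hiragana letter range U+3041 to U+3096 and the katakana letter range U+30A1 to U+30FA, which sit 0x60 apart.

```python
def script_of(ch: str) -> str:
    """Classify one character by code point: hiragana, katakana, or other."""
    cp = ord(ch)
    if 0x3041 <= cp <= 0x3096:
        return "hiragana"
    if 0x30A1 <= cp <= 0x30FA:
        return "katakana"
    return "other"


def to_katakana(ch: str) -> str:
    """Shift a hiragana letter to its katakana counterpart (offset 0x60)."""
    return chr(ord(ch) + 0x60) if script_of(ch) == "hiragana" else ch


# The four cross-script lookalikes named above.
for h in "りへかや":
    k = to_katakana(h)
    print(h, script_of(h), "->", k, script_of(k))
```

This only helps when the text can be copied or inspected. On a printed page or in an image, the reader still depends on word context and the surrounding script.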

The hiragana side of the broader script-confusion problem (within-hiragana lookalikes) is covered in this pillar's lookalike-hiragana article. The diagnostic framework there uses loop count, crossbar count, and stroke-ending features, not stroke direction.

Good to know

Stroke direction is the diagnostic, not just a writing tip

The stroke-direction rule that distinguishes ソ/ン and シ/ツ is the same rule writers follow to draw the kana correctly. Print fonts preserve the directional cue as slant, mark angle, and stroke-end thickness.15 A reader who learns to write the four kana correctly automatically learns the diagnostic for reading them.15

The stroke-end thickness cue is most visible in brush fonts. SoraNews notes: "that makes 'so's' longer stroke thick at the top, and 'n's' thicker at the bottom."15 Modern Gothic and Mincho fonts compress that thickness signal, but the slant and the dot orientation survive.15

Reading シ as ツ on a low-resolution screen

At small sizes on low-resolution screens, the slant differences between シ/ツ and ソ/ン can compress until the kana look nearly identical. A reader who scans シャツ and momentarily sees ツャツ is being defeated by the typeface, not by their eyes.

The fix is to zoom in or switch to a Japanese-targeted font, such as Hiragino Kaku Gothic, Yu Gothic, or Noto Sans JP. SoraNews specifically warns that "those stroke order/thickness clues can often disappear with modern, blockier fonts," and beginner forums report the same problem inside small-font menus and vending-machine displays.15

The "sushi vs sutsu" beginner error

Reading the シ in スシ ("sushi") as ツ yields the nonsense word スツ ("sutsu"), the framing widely cited in learner communities. The fix is the stroke-direction rule: the third stroke of シ runs bottom-up, and the third stroke of ツ runs top-down.215

The シ/ツ pair is the single most-cited katakana confusion in beginner teaching, and the "sushi vs sutsu" framing is the standard example in learner communities.115 It is practitioner consensus rather than an academic finding.

Mnemonics work, but they are scaffolding

Tofugu's smiling-face image for シ/ツ, the lowercase-n image for ン, and the roof-and-hat hierarchy for ク/ワ/ウ are all useful at the absolute-beginner stage.21920 They are teaching aids, not part of the language. Discard them once recognition is automatic.

LearnTheKana's mnemonic set covers the full シ/ン/ツ/ソ/ノ cluster as a family. シ is a "female face" with "a mouth and two eyes tilted on the side." ツ is the "tsunami" image with "the strongest wave of all." ソ is the "soft" wave with one dot. ン is "the same face except one of the eyes are closed." ノ has "no waves" at all.14

Handwriting exaggerates the diagnostic; print suppresses it

The slant and stroke-end-direction cues are most prominent in brush calligraphy and in hand-printed teaching fonts like Kyokasho-tai.1 Wikipedia makes the point directly: "these differences in slant and shape are more prominent when written with an ink brush."1

Mincho and Gothic fonts on screen normalize the shapes. That is why a kana reader who has no trouble on a textbook page sometimes stalls on a vending-machine display. SoraNews's observation that stroke-end thickness "can often disappear with modern, blockier fonts" describes the same compression effect.15

Font matters: pick a teaching font for early reading

Japanese-targeted fonts, such as Hiragino Kaku Gothic, Yu Gothic, Noto Sans JP, and Kyokasho-tai for textbook style, render the diagnostic features more crisply than Latin-first fallback fonts. If a study text shows two katakana as visually identical, suspect the typeface first. The directional cues that survive in native Japanese fonts are exactly the ones that compress in fallback fonts that fake Japanese glyphs.15
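When the text is digital, one way to rule out the typeface is to ask which characters are actually stored, independent of how they are drawn. The sketch below uses Python's standard `unicodedata` module; `kana_names` is a hypothetical helper name. Note that Unicode character names use the spellings SI and TU rather than "shi" and "tsu".

```python
import unicodedata


def kana_names(text: str) -> list[tuple[str, str, str]]:
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


for ch, code_point, name in kana_names("シツソン"):
    print(ch, code_point, name)
# シ U+30B7 KATAKANA LETTER SI
# ツ U+30C4 KATAKANA LETTER TU
# ソ U+30BD KATAKANA LETTER SO
# ン U+30F3 KATAKANA LETTER N
```

If two glyphs look identical on screen but report different names here, the font is the problem, not the text.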

The confusion fades with reading volume, not flashcard volume

After the first month, lookalike-katakana errors usually come from reading speed inside connected text, not from per-kana recognition. Once the binary tests above feel automatic in isolation, replace isolated-kana flashcard time with reading volume: loanword-heavy menus, product packaging, and brand names.

Hiragana lookalikes are a different problem

Within-hiragana confusions (ぬ vs め, わ vs れ vs ね, and so on) are disambiguated by loop count, crossbar count, and stroke-ending features, not by stroke direction.17 The diagnostic framework in this article does not transfer to hiragana, and the hiragana framework does not transfer back to katakana.17

References

Footnotes

  1. Wikipedia. "Katakana." Section "History" and the kanji-derivation chart. https://en.wikipedia.org/wiki/Katakana

  2. sci.lang.japan FAQ. "How can I distinguish similar kana?" https://www.sljfaq.org/afaq/similar-kana.html

  3. Wikipedia. "Hiragana." Section "History" and the man'yōgana derivation paragraph. https://en.wikipedia.org/wiki/Hiragana

  4. Wiktionary. "シ." Etymology section. https://en.wiktionary.org/wiki/%E3%82%B7

  5. Wiktionary. "ツ." Etymology section (disputed; 川 / 州 / 津 / 闘 candidates listed in Nihon Kokugo Daijiten). https://en.wiktionary.org/wiki/%E3%83%84

  6. Wiktionary. "ソ." Etymology section. https://en.wiktionary.org/wiki/%E3%82%BD

  7. Wiktionary. "ン." Etymology section (disputed; possibly from 尓 or from a 撥音 nasal marker). https://en.wiktionary.org/wiki/%E3%83%B3

  8. Wiktionary. "ク." Etymology section. https://en.wiktionary.org/wiki/%E3%82%AF

  9. Wiktionary. "ワ." Etymology section. https://en.wiktionary.org/wiki/%E3%83%AF

  10. Wiktionary. "ウ." Etymology section. https://en.wiktionary.org/wiki/%E3%82%A6

  11. Wiktionary. "フ." Etymology section. https://en.wiktionary.org/wiki/%E3%83%95

  12. Wiktionary. "ヲ." Etymology section and modern-usage notes. https://en.wiktionary.org/wiki/%E3%83%B2

  13. Wiktionary. "ノ." Etymology section. https://en.wiktionary.org/wiki/%E3%83%8E

  14. LearnTheKana. "Mnemonics for the シ (shi), ン (n), ツ (tsu), ソ (so), ノ (no) Characters." https://learnthekana.com/shi-n-tsu-so-no/

  15. SoraNews24. "How to tell Japanese's two most confusing, nearly identical characters apart from each other." https://soranews24.com/2019/08/30/how-to-tell-japaneses-two-most-confusing-nearly-identical-characters-apart-from-each-other/

  16. Katakana Stroke Order App. Per-character stroke-count pages (one per kana). https://hiragana.strokeorder.app/katakana

  17. Wiktionary. "Appendix: Easily confused Japanese kana." https://en.wiktionary.org/wiki/Appendix:Easily_confused_Japanese_kana

  18. The Japanese Page. "Confusing Katakana: How to Tell ン (n) and ソ (so) Apart." https://www.thejapanesepage.com/confusing-katakana-how-to-tell-%E3%83%B3-n-and-%E3%82%BD-so-apart/

  19. LearnTheKana. "Mnemonics for the ウ (u), ワ (wa), フ (fu), ラ (ra), ヲ (wo/o) Characters." https://learnthekana.com/u-wa-fu-ra-wo/

  20. Tofugu. "Learn Katakana: Tofugu's Ultimate Guide." https://www.tofugu.com/japanese/learn-katakana/

  21. Tatoeba Project. Japanese-English parallel sentence corpus (per-sentence verification via the public API; per-sentence IDs cited inline). https://tatoeba.org/

  22. Wiktionary. "Wo (kana)" entry, and Wikipedia. "Wo (kana)." Notes on modern usage of ヲ (rare; all-katakana text, retro video-game menus, stylized usage). https://en.wikipedia.org/wiki/Wo_(kana)