greymoth · proof, not pitch · Writing · Tokyo

field notes · canonical home

Writing

I find the places where software quietly mishandles real human text: Unicode normalization, IME composition, East Asian width, surrogate and grapheme slicing, bidi. These are the boundary conditions that pass every ASCII test and break the moment someone types a real name. Each note here starts from one failure I found in production open source, fixed with a diff a maintainer can accept or reject, and pinned with a test that keeps it fixed. No think-pieces, just field notes with a diff attached.

CJK text, and Japanese in particular, is where these failures bite hardest and where my evidence runs deepest: a corpus of 97 real failures across 91 libraries. That's the proof, not the identity: the same boundary bugs hit anyone who owns text at scale, including platform, quality, compiler, browser and foundation-model teams. Shorter cuts of each note run on dev.to, Zenn and Hacker News, each with its canonical link pointing back here, so this is the version they came from, cross-linked into the corpus, and the one search should credit.

Field notes

02
2026-07-04· IME composition· JavaScript · React · Vue

The keypress that submits your form mid-word.

An input-boundary failure: the Enter that commits a text composition is the same Enter your handler reads as submit, so the form fires while the user is still choosing a word. This note covers the one-line isComposing fix, why it is four different lines across React, Vue, Safari and native, the spec-level root cause, and the same keypress traced through misskey, llm-x, Safari mentions and rsuite. The largest family in the corpus: 38 of 97.
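As a rough illustration of the guard involved, here is a minimal TypeScript sketch. It is not taken from the field note or from any of the libraries named above. `KeyLike` and `isSubmitEnter` are hypothetical names, and the `keyCode === 229` fallback is an assumption about engines that report `isComposing` as false on the keydown that commits a composition.

```typescript
// Hypothetical minimal shape of the keyboard event fields the check reads.
// A real DOM KeyboardEvent satisfies this interface structurally.
interface KeyLike {
  key: string;
  isComposing?: boolean;
  keyCode?: number;
}

// Returns true only for an Enter that should submit, not one that
// is committing an IME composition.
function isSubmitEnter(e: KeyLike): boolean {
  if (e.key !== "Enter") return false;
  // Standard signal: the event fired inside an active composition.
  if (e.isComposing) return false;
  // Assumed fallback: keyCode 229 marks a keydown processed by the IME,
  // for engines where isComposing is already false at this point.
  if (e.keyCode === 229) return false;
  return true;
}

// A plain Enter submits; an Enter that commits a composition does not.
const plainEnter = isSubmitEnter({ key: "Enter", isComposing: false, keyCode: 13 });
const composingEnter = isSubmitEnter({ key: "Enter", isComposing: true, keyCode: 229 });
```

In a handler this would gate the submit call, for example `if (isSubmitEnter(event)) submit()`. Frameworks that wrap the native event may expose `isComposing` somewhere other than the top-level event object, so the exact line is likely to differ by framework.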

read the field note →

01
2026-07-03· surrogate & grapheme· JavaScript

A width check said the string was safe to cut. It split a kanji in half.

A text-boundary failure: measure by one unit, cut by another. A one-line fast path in cli-table3 sliced 𠮷 (U+20BB7) mid-surrogate. The full diagnosis, the fix, the failing fixture, and the same shape traced through four more libraries (one of the cases is Hindi, not CJK): opentype.js, slate, clerk, markdown-it.
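To make the failure shape concrete, here is a minimal TypeScript sketch of a cut that respects surrogate pairs. It is not cli-table3's code or its actual fix. `safeSliceEnd` is a hypothetical helper, and it only protects code point boundaries, so multi-code-point grapheme clusters would need a segmenter on top.

```typescript
// Hypothetical helper: cut `s` to at most `maxUnits` UTF-16 code units
// without leaving a lone high surrogate at the end of the result.
function safeSliceEnd(s: string, maxUnits: number): string {
  if (maxUnits <= 0) return "";
  if (maxUnits >= s.length) return s;
  let end = maxUnits;
  const last = s.charCodeAt(end - 1); // last unit that would be kept
  const next = s.charCodeAt(end);     // first unit that would be dropped
  const lastIsHigh = last >= 0xd800 && last <= 0xdbff;
  const nextIsLow = next >= 0xdc00 && next <= 0xdfff;
  // The cut would land inside a pair: back off by one unit.
  if (lastIsHigh && nextIsLow) end -= 1;
  return s.slice(0, end);
}

// U+20BB7 written as its surrogate pair, followed by two BMP kanji.
const sample = "\uD842\uDFB7\u91CE\u5BB6";
const naive = sample.slice(0, 1);     // lone high surrogate "\uD842"
const safe = safeSliceEnd(sample, 1); // "" (the pair does not fit in one unit)
const whole = safeSliceEnd(sample, 2); // "\uD842\uDFB7", the full character
```

The design choice here is to return less than the budget rather than emit half a character. A caller that measures in display columns would still need to convert that measurement into a code unit index before cutting, which is where the two units come apart.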

read the field note →

Two notes so far. This page grows as the corpus does — every entry that's worth more than a corpus card gets the long form here first, then a shorter cut goes out to the mirrors below.

Where the shorter cuts run

dev.to / greymothjp · canonical → here
x.com / greymoth__ · threads
the failure corpus · 97 entries

The corpus is the raw catalog — one card per bug. The writing is where a few of them get the full treatment: why the bug survived, why the fix is the shape it is, and where the next one hides.

← greymoth — the record browse the corpus