CJK / Unicode Failure Corpus

IME composition

Japanese, Chinese, and Korean users press Enter to confirm an IME conversion (for example, to pick a kanji candidate). That same Enter often reaches the component's own keydown, submit, or select handler, so the form is sent or an item is selected with half-finished text. The guard is one line: skip the handler while a composition is active (event.isComposing, or keyCode 229 as a legacy fallback; in React, read event.nativeEvent.isComposing).
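
A minimal sketch of that guard for a plain DOM keydown listener. The names isImeKeydown, guardEnter, and onEnter are illustrative and not taken from any project listed here; the handler only reads the isComposing, keyCode, and key fields of the event.

```javascript
// Returns true when a keydown belongs to an active IME composition.
// `event` is a native KeyboardEvent (or any object with the same fields).
function isImeKeydown(event) {
  return event.isComposing === true || event.keyCode === 229;
}

// Wraps an Enter handler so it ignores the Enter that confirms a conversion.
// `onEnter` is a hypothetical callback supplied by the component.
function guardEnter(onEnter) {
  return function handleKeydown(event) {
    if (isImeKeydown(event)) return; // let the IME consume this key
    if (event.key === 'Enter') onEnter(event);
  };
}
```

It would typically be attached as input.addEventListener('keydown', guardEnter(submit)), where submit is the component's existing action.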

IME composition Vue merged

naive-ui: Enter during IME composition adds a tag (Vue dynamic tags)

naive-ui · tusen-ai/naive-ui

Symptom

Pressing Enter to confirm a kana-to-kanji conversion in an n-dynamic-tags input creates a tag from the in-progress text instead of just finishing the conversion.

Minimal repro

1. Render <n-dynamic-tags> and focus its input.
2. Switch to a Japanese IME, type "とうきょう", press Space to convert to 東京.
3. Press Enter to pick the candidate.
4. A tag is added from the unconfirmed text before the conversion commits.

Fix

Skip tag creation while e.isComposing is true; only act on the Enter that fires after compositionend.

Merged PR → #naive-ui-dynamic-tags-ime

IME composition React open

React chat input sends the message on the Enter that confirms an IME conversion

llm-x · mrdjohnson/llm-x

Symptom

Confirming a Japanese IME candidate with Enter in the chat box sends the message instead of committing the conversion.

Minimal repro

1. Open the chat input.
2. With a CJK IME, type a phrase and press Space to get conversion candidates.
3. Press Enter to choose a candidate.
4. The message sends with the unconfirmed text.

Fix

Return early when e.nativeEvent.isComposing is true. React's synthetic KeyboardEvent has no isComposing field, so read it off nativeEvent (the DOM event), not e.isComposing.
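
A hedged sketch of that early return for a React-style handler. isComposingEvent and makeChatKeyDown are hypothetical names, and send stands in for whatever submits the message; the Shift+Enter exception is an assumption about typical chat inputs, not something taken from llm-x.

```javascript
// Accepts either a React synthetic event (flag lives on nativeEvent)
// or a native KeyboardEvent (flag lives on the event itself).
function isComposingEvent(e) {
  const native = e.nativeEvent ?? e;
  return native.isComposing === true;
}

// Hypothetical chat-input handler: `send` is whatever submits the message.
function makeChatKeyDown(send) {
  return function onKeyDown(e) {
    if (isComposingEvent(e)) return; // Enter is confirming an IME candidate
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault?.();
      send();
    }
  };
}
```

Falling back to the event itself keeps the same helper usable from non-React listeners.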

Fix PR → #llm-x-chat-enter-ime

IME composition Svelte open

Svelte command palette runs a command on the IME confirm Enter

surf · deta/surf

Symptom

Typing a CJK query in the command palette and pressing Enter to confirm the IME runs the highlighted command instead of finishing the conversion.

Minimal repro

1. Open the command palette.
2. Compose a Japanese search term with the IME.
3. Press Enter to commit the conversion.
4. The first/highlighted command fires prematurely.

Fix

Add `if (e.isComposing) return` at the top of the palette input keydown handler. Svelte exposes the native KeyboardEvent, so e.isComposing works directly.

Fix PR → #surf-command-palette-ime

IME composition React Safari open

Safari: Enter confirming a CJK @-mention picks the wrong option (rc-mentions)

rc-mentions · react-component/mentions

Symptom

On Safari/WebKit, composing an @-mention name in CJK and pressing Enter to confirm the IME replaces the text with whatever mention is highlighted.

Minimal repro

1. In Safari, type @ then compose a Japanese name with the IME.
2. Press Enter to confirm the kanji conversion.
3. The composing text is replaced by the highlighted mention option.

Fix

In the ENTER branch, return early on event.nativeEvent.isComposing before preventDefault. WebKit reports the commit keydown as which===Enter + isComposing:true; Chromium reports keyCode 229 and never enters this branch.

Fix PR → #rc-mentions-safari-ime-select

IME composition React open

rc-select: the Enter that confirms IME composition also selects an option

rc-select · react-component/select

Symptom

The Enter used to commit an IME composition also selects the currently highlighted option in a searchable Select.

Minimal repro

1. Open a searchable <Select>.
2. Type a CJK query with the IME and press Enter to confirm the conversion.
3. The active option is selected with the unfinished query.

Fix

Track composition state (compositionstart/compositionend) and skip option selection while composing.
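
One way to track that state, sketched with standard DOM events only. trackComposition and makeSelectOnEnter are illustrative names, and selectActiveOption is a hypothetical stand-in for rc-select's own selection logic.

```javascript
// Tracks whether an element is mid-composition by listening to the
// composition events rather than relying on a flag on the key event.
function trackComposition(target) {
  const state = { composing: false };
  target.addEventListener('compositionstart', () => { state.composing = true; });
  target.addEventListener('compositionend', () => { state.composing = false; });
  return state;
}

// Hypothetical option-selection guard for a searchable select.
function makeSelectOnEnter(state, selectActiveOption) {
  return function onKeyDown(event) {
    if (state.composing) return; // the Enter belongs to the IME
    if (event.key === 'Enter') selectActiveOption();
  };
}
```

The state object is shared by reference, so the keydown handler always sees the latest composition status.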

Fix PR → #rc-select-enter-ime-option

IME composition Vue open

Element Plus time-picker reacts to keystrokes during IME composition

element-plus · element-plus/element-plus

Symptom

The time-picker's key handler fires while an IME composition is active, so keystrokes meant for the IME mutate the time value.

Minimal repro

1. Focus the time-picker input.
2. Begin an IME composition.
3. Composition keystrokes are intercepted by the picker's key handler instead of the IME.

Fix

Skip the picker's keydown handling while the composition is active (isComposing guard).

Fix PR → #element-plus-time-picker-ime

IME composition Vue open

Vuetify VAutocomplete keydown fires during IME composition

vuetify · vuetifyjs/vuetify

Symptom

VAutocomplete's keydown listener triggers during IME composition, so confirming a CJK query navigates or selects options.

Minimal repro

1. Open a VAutocomplete.
2. Compose a Japanese query, then press Enter or an arrow key during the composition.
3. The keydown listener acts on the composing keys.

Fix

Bail out of the keydown listener when e.isComposing is true.

Fix PR → #vuetify-autocomplete-ime

IME composition Vue Nuxt open

Nuxt UI defineShortcuts fire while typing with a Japanese IME

nuxt/ui · nuxt/ui

Symptom

Single-key shortcuts registered with defineShortcuts fire while composing text, so romaji keystrokes trigger app shortcuts mid-composition.

Minimal repro

1. Register a single-key shortcut (e.g. 'g').
2. In an input, compose Japanese text whose romaji includes that key.
3. The shortcut fires during composition.

Fix

Ignore key events when e.isComposing (or keyCode 229) before dispatching shortcuts.

Fix PR → #nuxt-ui-shortcuts-ime

IME composition headless merged

Zag color-picker input handles Enter during IME composition

zag · chakra-ui/zag

Symptom

A color-picker channel input runs its Enter handler while an IME composition is active, committing before the composition ends.

Minimal repro

1. Focus a color channel input.
2. Trigger an IME composition and press Enter.
3. The channel commit fires before compositionend.

Fix

Guard the channel input's Enter handler with isComposing.

Merged PR → #zag-color-picker-channel-ime

IME composition React Electron open

Inline file rename submits during IME composition (Cherry Studio, Electron)

cherry-studio · CherryHQ/cherry-studio

Symptom

Renaming a file inline and pressing Enter to confirm a CJK name submits the rename mid-composition.

Minimal repro

1. Start an inline file rename.
2. Compose a Japanese filename with the IME.
3. Press Enter to confirm the conversion; the rename submits with the unfinished name.

Fix

Skip the rename submit while the IME is composing (isComposing guard).

Fix PR → #cherry-studio-rename-ime

IME composition Angular open

CopilotKit Angular chat submits on the Enter that confirms an IME candidate

CopilotKit · CopilotKit/CopilotKit

Symptom

The Angular chat input submits the message on the Enter that confirms an IME composition.

Minimal repro

1. Type a CJK message in the Angular chat input.
2. Press Enter to confirm the IME conversion; the message sends early.

Fix

Check event.isComposing (Angular passes the native KeyboardEvent) before submitting.

Fix PR → #copilotkit-angular-chat-ime

IME composition React open

ChatUI composer sends on the Enter that confirms IME composition

ChatUI · alibaba/ChatUI

Symptom

The Composer sends the message on the Enter used to confirm an IME composition.

Minimal repro

1. Compose CJK text in the Composer.
2. Press Enter to pick a candidate; the message sends instead of committing.

Fix

Skip send when e.nativeEvent.isComposing is true.

Fix PR → #chatui-composer-ime

IME composition open

SiYuan session rename input submits during IME composition (Enter)

siyuan · siyuan-note/siyuan

Symptom

Pressing Enter to confirm CJK composition in the agent session rename input also triggers submit/rename, discarding the composed text or submitting prematurely.

Minimal repro

Use a Japanese, Chinese, or Korean IME in SiYuan's session rename field and press Enter to confirm the composition; the rename fires before the composition is complete.

Fix

Check event.isComposing (and event.keyCode===229 for legacy browsers) before acting on keydown/keyup Enter. Ignore Enter events during composition.

Fix PR → #siyuan-session-rename-ime

IME composition React open

llmchat: ignore the Enter that confirms IME composition in the chat input

llmchat · trendy-design/llmchat

Symptom

Pressing Enter to accept an IME candidate also submits the chat message, so CJK users send an empty or partial message.

Minimal repro

Type Japanese in the llmchat chat input using an IME and press Enter to select a candidate; the message sends immediately.

Fix

Guard Enter keydown handler with if (event.isComposing || event.nativeEvent?.isComposing) return.

Fix PR → #llmchat-chat-enter-ime

IME composition Vue open

AIaW (Vue) chat submits on the Enter that confirms IME composition

AIaW · NitroRCr/AIaW

Symptom

Confirming a CJK IME candidate with Enter in the AIaW chat input also submits the message.

Minimal repro

Japanese IME → type → Enter to confirm → message sent before input complete.

Fix

Ignore Enter keydown when isComposing is true.

Fix PR → #aiaw-chat-enter-ime

IME composition React open

Obsidian Smart Composer chat submits on the IME Enter (macOS/Windows)

obsidian-smart-composer · glowingjade/obsidian-smart-composer

Symptom

On macOS and Windows, the Obsidian Smart Composer chat submits the message on the Enter that confirms a CJK IME composition.

Minimal repro

Use Japanese IME in Obsidian Smart Composer chat; confirm composition with Enter; chat sends.

Fix

Check isComposing flag before processing Enter keydown.

Fix PR → #obsidian-smart-composer-chat-ime

IME composition React merged

Onyx ListFieldInput accepts an entry on the IME confirmation Enter

onyx · onyx-dot-app/onyx

Symptom

ListFieldInput in Onyx web accepts a list entry on the IME confirmation Enter, adding an empty or partial item.

Minimal repro

ListFieldInput with CJK IME → Enter to confirm → list item added prematurely.

Fix

Skip Enter action when event.nativeEvent.isComposing is true.

Merged PR → #onyx-listfield-ime

IME composition React open

RSuite InputPicker selects an item on the IME confirmation Enter

rsuite · rsuite/rsuite

Symptom

RSuite InputPicker selects the focused item on the Enter that confirms a CJK IME composition, instead of just committing the composed text.

Minimal repro

RSuite InputPicker with CJK IME → search → Enter to confirm composition → list item selected prematurely.

Fix

Guard item selection Enter handler with isComposing check.

Fix PR → #rsuite-inputpicker-ime

IME composition React open

Dify tag input adds a tag on the IME confirmation Enter

dify · langgenius/dify

Symptom

Dify tag input adds a tag on IME confirmation Enter, creating malformed tags for CJK users.

Minimal repro

Dify web tag input → Japanese IME → Enter to confirm → tag added prematurely.

Fix

Check isComposing before handling Enter in tag input.

Fix PR → #dify-tag-input-ime

IME composition React open

Flowise text inputs fire on the Enter that confirms IME composition

Flowise · FlowiseAI/Flowise

Symptom

Multiple Flowise text inputs fire on IME Enter, interrupting CJK text composition.

Minimal repro

Flowise text inputs with CJK IME → Enter to confirm → premature action fired.

Fix

Add isComposing guard to all remaining Enter keydown handlers.

Fix PR → #flowise-text-inputs-ime

IME composition React open

Langflow inline rename commits on the IME confirmation Enter

langflow · langflow-ai/langflow

Symptom

Langflow inline rename inputs commit the rename on IME Enter, saving partial CJK names.

Minimal repro

Langflow node rename → CJK IME → Enter to confirm composition → rename saved with partial text.

Fix

Guard Enter handler in inline rename with event.nativeEvent?.isComposing check.

Fix PR → #langflow-inline-rename-ime

IME composition React merged

Twenty CRM chat send and rename fire on the IME confirmation Enter

twenty · twentyhq/twenty

Symptom

Twenty CRM chat-thread send and attachment rename both fire on IME Enter, breaking CJK workflows.

Minimal repro

Twenty CRM chat or attachment rename → CJK IME → Enter to confirm → send/rename triggered.

Fix

Check event.nativeEvent?.isComposing before handling Enter.

Merged PR → #twenty-chat-rename-ime

IME composition React open

Excalidraw search navigation fires arrow/Enter keys during IME composition

excalidraw · excalidraw/excalidraw

Symptom

Excalidraw search navigation (arrow keys, Enter) fires during CJK IME composition, moving selection mid-composition.

Minimal repro

Excalidraw search → CJK IME → type → Arrow/Enter during composition → result navigation fires.

Fix

Guard navigation key handlers with event.isComposing check.

Fix PR → #excalidraw-search-nav-ime

IME composition React merged

Payload CMS SearchInput submits on the IME confirmation Enter

payload · payloadcms/payload

Symptom

Payload CMS SearchInput submits on the IME confirmation Enter, so the search runs on a partial query.

Minimal repro

Payload admin search → CJK IME → Enter to confirm → search fires with partial query.

Fix

Ignore Enter in SearchInput when event.nativeEvent?.isComposing is true.

Merged PR → #payload-searchinput-ime

IME composition merged

Trilium kanban card/column title editing commits on the IME Enter

Trilium · TriliumNext/Trilium

Symptom

Trilium kanban card and column title editing commits on IME Enter, saving partial CJK text.

Minimal repro

Trilium board → edit card title → CJK IME → Enter to confirm → title saved with partial text.

Fix

Check isComposing before committing title edits on Enter.

Merged PR → #trilium-board-title-ime

IME composition React merged

big-AGI custom instruction field submits on the IME confirmation Enter

big-AGI · enricoros/big-AGI

Symptom

big-AGI custom instruction field fires on IME Enter, submitting partial CJK instructions.

Minimal repro

big-AGI custom instruction → CJK IME → Enter to confirm → instruction submitted prematurely.

Fix

Skip Enter action when event.nativeEvent?.isComposing is true.

Merged PR → #big-agi-custom-instruction-ime

IME composition React open

LibreChat prompt name, label, and tag inputs fire on the IME Enter

LibreChat · danny-avila/LibreChat

Symptom

LibreChat prompt name, label, and tag inputs all trigger on IME Enter, causing broken CJK input across multiple components.

Minimal repro

LibreChat prompt editor → tag or name field → CJK IME → Enter to confirm → submit fires.

Fix

Add an isComposing guard to the prompt name, label, and dynamic-tag Enter handlers.

Fix PR → #librechat-prompt-tags-ime

IME composition React open

Jan rename and project dialogs accept a partial name on the IME Enter

jan · janhq/jan

Symptom

Jan AI assistant rename and project dialogs accept partial CJK names when Enter confirms IME composition.

Minimal repro

Jan rename dialog → CJK IME → Enter to confirm → dialog committed with partial name.

Fix

Guard Enter submit with event.nativeEvent?.isComposing check.

Fix PR → #jan-rename-dialog-ime

IME composition React cited closed

React onChange fires during IME composition in controlled inputs

React · facebook/react

Symptom

In a controlled <input>/<textarea>, onChange fires for the intermediate keystrokes of an IME composition, so any onChange-driven search or filter runs on half-finished CJK text.

Minimal repro

Type Chinese/Japanese through an IME into a controlled input that filters on onChange; the handler runs on every uncommitted composition keystroke.

Fix

Track compositionstart/compositionend and suppress onChange handling while a composition is active.

Upstream issue → #react-controlled-ime-onchange

IME composition MAUI Windows cited open

.NET MAUI Entry.Completed fires on the Enter that confirms an IME candidate (Windows)

dotnet/maui · dotnet/maui

Symptom

On Windows, Entry.Completed is raised when the user presses Enter inside the IME conversion candidate window, so completion fires before the CJK conversion is committed.

Minimal repro

Focus a MAUI Entry on Windows, use a Chinese/Japanese IME, press Enter to pick a candidate; Completed fires mid-conversion.

Fix

Do not raise Completed while the IME conversion window is open; fire only once the composition is fully committed.

Upstream issue → #maui-entry-completed-ime

IME composition spec cited open

compositionend vs input event order differs across browsers (isComposing guard)

w3c/uievents · w3c/uievents

Symptom

The spec orders input before compositionend (Chrome and Safari follow it), but Firefox and Edge fire input after compositionend, so isComposing-based guards behave differently per browser.

Minimal repro

Log the input and compositionend events while committing a CJK composition in Chrome versus Firefox; their relative order differs.

Fix

Define a canonical order in the spec; frameworks should not assume input fires before compositionend.

Upstream issue → #uievents-compositionend-input-order

IME composition spec cited open

insertCompositionText / isComposing can't single out the IME commit input event

w3c/input-events · w3c/input-events

Symptom

Every input event during a composition, including the one that commits it on Enter, carries isComposing=true and inputType insertCompositionText, so code cannot detect the commit without also listening for compositionend.

Minimal repro

Type s, i, Space to convert to a kanji, change the candidate, then press Enter to commit; the input events for each step are indistinguishable.

Fix

Assign a distinct inputType (or property) to the input event that commits the composition.

Upstream issue → #input-events-insertcompositiontext

IME composition Zed cited open

Zed Vim jk escape depends on the CJK IME input mode

zed · zed-industries/zed

Symptom

With Vim mode and a CJK IME, the jk insert-mode escape only fires while the IME is composing in Chinese mode; in direct English input it inserts a literal j and k instead of escaping.

Minimal repro

Enable Vim mode, map jk to escape, use a CJK IME such as macOS Pinyin; jk escapes while composing in Chinese mode but not in English mode.

Fix

Detect the jk key sequence consistently regardless of the IME composition state.

Upstream issue → #zed-vim-jk-escape-ime

IME composition Zed Windows cited open

Zed: text shifts vertically while composing with a Chinese IME on Windows

zed · zed-industries/zed

Symptom

While composing Chinese text with an IME on Windows, the line of text jumps vertically as the composition updates, which disrupts reading and editing.

Minimal repro

Open Zed (Vim mode) on Windows 11 and type Chinese characters through an IME; the text shifts vertically during composition.

Fix

Keep the line baseline stable while updating the IME marked-text region during composition.

Upstream issue → #zed-chinese-ime-text-shift

IME composition Warp macOS cited open

Warp does not render IME preedit (marked) text — blind composition

Warp · warpdotdev/warp

Symptom

Warp does not render IME marked (preedit) text, so dead keys and CJK composition show nothing until committed and the user composes blind.

Minimal repro

In Warp on macOS, press Option+E (or start a CJK composition); no marked text is shown to indicate the in-progress input.

Fix

Render marked text inline via the platform IME API (NSTextInputClient on macOS) so the preedit is visible at the cursor.

Upstream issue → #warp-marked-text-ime

Kana / romaji

8 entries

Transliteration tables that drop or reverse kana. Round-trip is the oracle: kana to romaji and back should be stable, and a sibling (hiragana vs katakana) usually already does it right.

Kana / romaji JS open

Katakana ン loses the syllabic-n apostrophe in Hepburn romanization

hepburn · lovell/hepburn

Symptom

Katakana ン before a vowel or Y is romanized without the apostrophe, unlike hiragana ん. シンヨウ becomes SHINYOU (should be SHIN'YOU), so it collides with シニョウ.

Minimal repro

const { fromKana } = require('hepburn')
fromKana('しんよう') // SHIN'YOU
fromKana('シンヨウ') // SHINYOU  <- apostrophe dropped

Fix

Map katakana ン to N' (matching hiragana) and add the ンー long-vowel digraph as N', so N is never a katakana map key and toKatakana still round-trips (PAN to パン).
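
The collision described in the symptom can be reproduced with a toy romanizer. This is only a sketch: TABLE covers just the kana used here and is not hepburn's actual table, and romanize and findCollisions are hypothetical helpers.

```javascript
// Toy table covering only the kana used below; a real table is far larger.
const TABLE = { 'シ': 'SHI', 'ニ': 'NI', 'ニョ': 'NYO', 'ヨ': 'YO', 'ウ': 'U', 'ン': 'N' };

// Greedy longest-match romanizer. With `apostrophe` on, syllabic ン gets a
// trailing apostrophe when the next romaji chunk starts with a vowel or Y.
function romanize(kana, { apostrophe }) {
  const parts = [];
  for (let i = 0; i < kana.length; ) {
    const two = kana.slice(i, i + 2);
    const key = TABLE[two] ? two : kana[i];
    if (!TABLE[key]) throw new Error(`unmapped kana: ${key}`);
    parts.push(TABLE[key]);
    i += key.length;
  }
  return parts
    .map((p, i) =>
      apostrophe && p === 'N' && /^[AIUEOY]/.test(parts[i + 1] ?? '') ? "N'" : p)
    .join('');
}

// Groups inputs by romanized output and keeps outputs shared by several inputs.
function findCollisions(words, options) {
  const byRomaji = new Map();
  for (const w of words) {
    const r = romanize(w, options);
    byRomaji.set(r, [...(byRomaji.get(r) ?? []), w]);
  }
  return [...byRomaji].filter(([, ws]) => ws.length > 1);
}
```

Running findCollisions(['シンヨウ', 'シニョウ'], { apostrophe: false }) reports one shared output, and the same call with apostrophe: true reports none.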

Fix PR → #hepburn-katakana-n-apostrophe

Kana / romaji Python open

pykakasi: missing っでぃ (ddi) sokuon in Hepburn/Kunrei romaji

pykakasi · miurahr/pykakasi

Symptom

The geminated d + small i sequence っでぃ (ddi) has no Hepburn or Kunrei entry, so loanword spellings that use it romanize incorrectly.

Minimal repro

Convert a word containing っでぃ; the doubled d before the small i is not produced.

Fix

Add the missing っでぃ (ddi) sokuon entry to both the Hepburn and Kunrei tables.

Fix PR → #pykakasi-ddi-sokuon

Kana / romaji JS open

Historical kana ゐ/ゑ (wi/we) missing from romaji conversion

romaji-conv · koozaki/romaji-conv

Symptom

The historical kana ゐ (wi) and ゑ (we) are absent from the mapping, so text containing them is dropped or left unconverted.

Minimal repro

Convert text containing ゐ or ゑ (or katakana ヰ/ヱ); the characters pass through unmapped.

Fix

Add ゐ/ゑ and their katakana counterparts to the kana mapping.

Fix PR → #romaji-conv-wi-we-kana

Kana / romaji Python open

jaconv kana2alphabet does not romanize small katakana ヵ/ヶ

jaconv · ikegami-yukino/jaconv

Symptom

kana2alphabet does not handle the small katakana ヵ/ヶ (small ka/ke), so counters like 一ヶ月 are mis-romanized.

Minimal repro

import jaconv
jaconv.kana2alphabet('ヶ')  # small ke not handled

Fix

Add the small ka/ke mappings to kana2alphabet.

Fix PR → #jaconv-small-ka-ke

Kana / romaji JS open

Reversed ヲ/ヺ dakuten mapping when adding voiced marks (jaco-js)

jaco-js · YusukeHirao/jaco-js

Symptom

The ヲ to ヺ voiced-mark mapping is reversed, so adding or stripping a dakuten on ヲ produces the wrong character.

Minimal repro

Add a voiced mark to ヲ; expected ヺ, got the wrong glyph because the mapping is reversed.

Fix

Correct the reversed ヲ/ヺ entry in both addVoicedMarks and combineSoundMarks.

Fix PR → #jaco-js-wo-voiced-mark

Kana / romaji JS open

Romaji conversion drops the z in づ (outputs u instead of zu)

kana-romaji · mtomim/kana-romaji

Symptom

kana-romaji library drops the 'z' consonant when romanizing づ, outputting 'u' instead of 'zu'.

Minimal repro

romanize('づ') returns 'u' instead of 'zu'.

Fix

づ should romanize as 'zu' (Hepburn). Map づ→zu in the kana table.

Fix PR → #kana-romaji-zu-dropped

Kana / romaji Python cited closed

pykakasi fails to romanize half-width katakana with voiced marks

pykakasi · miurahr/pykakasi

Symptom

pykakasi does not romanize half-width katakana correctly, particularly when a half-width voiced or semi-voiced mark (U+FF9E / U+FF9F) follows the base kana.

Minimal repro

Convert half-width katakana such as a half-width ka followed by a half-width dakuten; the output is not the expected 'ga'.

Fix

NFKC-normalize half-width katakana and combining voiced marks to their full-width equivalents before romanization.
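
The normalization step itself, sketched in JavaScript to match the other examples on this page (pykakasi is Python, where unicodedata.normalize('NFKC', text) plays the same role). normalizeHalfWidthKana is an illustrative name; romanization would run on its result.

```javascript
// Half-width katakana plus a half-width (semi-)voiced mark composes to a
// single full-width kana under NFKC.
function normalizeHalfWidthKana(text) {
  return text.normalize('NFKC');
}

const ga = normalizeHalfWidthKana('\uFF76\uFF9E'); // half-width KA + half-width dakuten
const pa = normalizeHalfWidthKana('\uFF8A\uFF9F'); // half-width HA + half-width handakuten
console.log(ga === '\u30AC', pa === '\u30D1');     // U+30AC is ガ, U+30D1 is パ
```

NFKC also folds other compatibility characters (for example full-width Latin letters), so it is best applied only where that wider folding is acceptable.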

Upstream issue → #pykakasi-halfwidth-katakana

Kana / romaji Python cited closed

Python unidecode mangles half-width katakana with dakuten/handakuten

unidecode · avian2/unidecode

Symptom

Python unidecode transliterates half-width katakana carrying dakuten/handakuten incorrectly, producing artifacts, while hiragana and full-width katakana romanize correctly.

Minimal repro

Calling unidecode on half-width ba bi bu be bo does not return the expected 'babibubebo'.

Fix

Pre-compose half-width katakana plus combining voiced marks (NFKC) before the transliteration lookup.

Upstream issue → #unidecode-halfwidth-dakuten

Width / normalization

5 entries

Full-width to half-width, long-vowel marks, and kana range boundaries. Off-by-one range tables and missing digraphs silently corrupt text.

Width / normalization JS open

Long-vowel mark ー expands with the wrong vowel after katakana ヒ/ビ

normal-jp · birchill/normal-jp

Symptom

During normalization, the chōonpu (ー) is expanded with the wrong vowel after katakana ヒ and ビ.

Minimal repro

Normalize ヒー / ビー; the long-vowel expansion produces the wrong vowel for these two kana.

Fix

Fix the ー-expansion table entries for ヒ and ビ.

Fix PR → #normal-jp-choonpu-hi-bi

Width / normalization JS open

Full-width/kana conversion drops the first and last char of each range (moji)

moji · niwaringo/moji

Symptom

Full-width/half-width and kana range conversions skip the boundary characters of each range (！～ぁゖ ...), so edge code points like ！ (U+FF01) are not converted.

Minimal repro

moji('！～').convert('ZE', 'HE').toString()  // boundary chars at range edges are skipped

Fix

Convert the first and last character of each range, not just the interior.
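
A sketch of an inclusive range check for the full-width ASCII block. It does not use moji's API; toHalfWidthAscii and the constants are illustrative, and only the U+FF01 to U+FF5E range is handled.

```javascript
// Full-width ASCII block: U+FF01 (！) through U+FF5E (～), inclusive.
const FULL_START = 0xff01;
const FULL_END = 0xff5e;
const OFFSET = 0xfee0; // distance to the ASCII range U+0021..U+007E

function toHalfWidthAscii(text) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    // >= and <= keep both boundary characters inside the range
    return cp >= FULL_START && cp <= FULL_END
      ? String.fromCodePoint(cp - OFFSET)
      : ch;
  }).join('');
}
```

Testing the two boundary characters explicitly, as in toHalfWidthAscii('！～'), is a cheap regression check for this class of off-by-one.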

Fix PR → #moji-range-boundaries

Width / normalization Rust open

Meilisearch charabia mis-detects half-width katakana script (Japanese search)

charabia · meilisearch/charabia

Symptom

Meilisearch's charabia tokenizer assigns halfwidth katakana (U+FF65-U+FF9F) and some fullwidth forms to the wrong script, so they are handled by the wrong tokenizer and Japanese searches fail.

Minimal repro

Index halfwidth katakana text in Meilisearch; search for the same terms; results missing because script detection classifies halfwidth katakana as non-Japanese.

Fix

Extend script detection to recognize U+FF65–U+FF9F (halfwidth katakana) as Japanese script.

Fix PR → #charabia-halfwidth-katakana-script

Width / normalization Rust open

tabled splits combining marks from their base grapheme when wrapping width

tabled · zhiburt/tabled

Symptom

tabled's text-wrapping splits combining marks (diacritics, dakuten, etc.) away from their base grapheme when calculating width for terminal table cells.

Minimal repro

Create a tabled table with text containing combining marks (e.g., 'が' as 'か' + combining voiced mark U+3099); wrapping separates the base and combining mark onto different lines.

Fix

Use a grapheme cluster iterator (Unicode UAX #29) when wrapping, never splitting within a grapheme cluster.
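
tabled is a Rust crate; the sketch below uses JavaScript's Intl.Segmenter only to illustrate the wrapping rule, consistent with the other examples on this page. wrapByGrapheme is a hypothetical helper, and it counts clusters rather than display columns, which is a simplification for wide CJK text.

```javascript
// Splits text into lines of at most `width` grapheme clusters, so a base
// character and its combining marks always stay on the same line.
function wrapByGrapheme(text, width) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  const lines = [];
  let current = [];
  for (const { segment } of segmenter.segment(text)) {
    if (current.length === width) {
      lines.push(current.join(''));
      current = [];
    }
    current.push(segment);
  }
  if (current.length > 0) lines.push(current.join(''));
  return lines;
}
```

With the input 'か' + U+3099 followed by 'き' and a width of 1, the first line keeps the base and the combining mark together.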

Fix PR → #tabled-combining-mark-wrap

Width / normalization Rust open

Zed block cursor is misaligned over ambiguous-width Unicode characters

zed · zed-industries/zed

Symptom

In the Zed editor, the block cursor over ambiguous-width Unicode characters (East Asian Width category 'A', e.g. some symbols and box-drawing characters) is misaligned: the cursor glyph is not centered in the cell.

Minimal repro

Open Zed with a CJK font; position cursor on an ambiguous-width character; block cursor appears shifted or misaligned.

Fix

Center the block cursor glyph using the rendered cell width rather than the glyph's intrinsic width for ambiguous-width characters.

Fix PR → #zed-block-cursor-ambiguous-width

Surrogate & grapheme

11 entries

Code that walks text by UTF-16 code unit or bare code point instead of by grapheme cluster. Surrogate pairs and non-BMP characters get split, ZWJ emoji and variation selectors are mis-detected, and combining marks or conjunct clusters drift away from their base.

Surrogate & grapheme JS open

cli-table3 splits surrogate pairs (emoji / CJK) when truncating wide text

cli-table3 · cli-table/cli-table3

Symptom

cli-table3 truncates text by byte/code-unit count rather than code-point count, splitting surrogate pairs in emoji or supplementary CJK characters, producing mojibake in terminal table cells.

Minimal repro

Create a cli-table3 table with a column containing emoji (e.g., 🎉) or supplementary CJK characters; set a column width that truncates mid-emoji; output shows garbled characters.

Fix

Use a Unicode-aware splitter (spread operator or Array.from) to iterate code points rather than code units when truncating.
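
A minimal sketch of code-point truncation. truncateCodePoints is an illustrative name, not a cli-table3 function, and it ignores display width, which a table renderer would still need to account for.

```javascript
// Truncates to at most `max` code points, so a surrogate pair is never cut in half.
function truncateCodePoints(text, max) {
  return Array.from(text).slice(0, max).join('');
}

const sample = 'a\u{1F389}b';                 // 'a', party popper emoji, 'b'
const naive = sample.slice(0, 2);             // cuts between the two surrogates
const safe = truncateCodePoints(sample, 2);   // keeps the emoji whole
console.log(naive.length, safe.length);
```

Code-point iteration keeps surrogate pairs intact but can still separate combining marks or ZWJ sequences; those need grapheme segmentation.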

Fix PR → #cli-table3-surrogate-truncate

Surrogate & grapheme TS open

Clerk truncate splits surrogate pairs in emoji / non-BMP characters

clerk · clerk/javascript

Symptom

Clerk UI's truncateWithEndVisible function uses substring/slice on raw code units in its short-width fallback, splitting surrogate pairs in emoji or non-BMP characters.

Minimal repro

A Clerk UI component displaying an email/name containing emoji in the short-width fallback path; the truncated string ends mid-surrogate-pair, showing '?' or garbled chars.

Fix

Use Array.from() or spread to split by code points, or use Intl.Segmenter, before truncating.

Fix PR → #clerk-truncate-surrogate

Surrogate & grapheme JS open

opentype.js does not clamp cmap format 12/13 codes to U+10FFFF

opentype.js · opentypejs/opentype.js

Symptom

opentype.js does not clamp cmap format 12/13 character codes to U+10FFFF; malformed fonts with out-of-range codes cause incorrect glyph lookups for supplementary characters.

Minimal repro

Load a font with a cmap subtable containing entries beyond U+10FFFF; glyph lookup for supplementary characters (emoji, CJK Extension B+) returns wrong glyph.

Fix

Clamp all format 12/13 startCharCode/endCharCode values to 0x10FFFF during parsing.
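
A sketch of the clamp, assuming the subtable has already been parsed into group objects. The groups shape and clampCmapGroups are hypothetical and not opentype.js internals; dropping groups that start beyond U+10FFFF is an extra assumption layered on top of the clamp.

```javascript
const MAX_CODE_POINT = 0x10ffff;

// `groups` is a hypothetical parsed form of the format 12/13 map groups:
// [{ startCharCode, endCharCode, startGlyphID }, ...]
function clampCmapGroups(groups) {
  return groups
    // a group that starts past the Unicode range has nothing valid to keep
    .filter((g) => g.startCharCode <= MAX_CODE_POINT && g.startCharCode <= g.endCharCode)
    // cap the end of partially valid groups at U+10FFFF
    .map((g) => ({ ...g, endCharCode: Math.min(g.endCharCode, MAX_CODE_POINT) }));
}
```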

Fix PR → #opentype-cmap-clamp

Surrogate & grapheme JS closed

markdown-it smart quotes break around non-BMP (U+10000+) characters

markdown-it · markdown-it/markdown-it

Symptom

markdown-it's smart-quotes replacement does not handle non-BMP punctuation and symbols (U+10000 and above); quotes adjacent to supplementary characters are paired incorrectly or left unconverted.

Minimal repro

markdown-it smartquotes on text adjacent to emoji or supplementary Unicode symbols (e.g., '𝕳ello'); smart quote pairing is incorrect.

Fix

Use a regex that is Unicode-aware for the 'whitespace' and 'punctuation' character class checks, or use Array.from for code-point iteration.

Closed PR → #markdown-it-nonbmp-smartquotes

Surrogate & grapheme JS open

Slate splits Indic conjunct clusters (UAX #29 GB9c) across graphemes

slate · ianstormtaylor/slate

Symptom

The Slate rich-text editor does not implement rule GB9c of Unicode UAX #29, so Indic conjunct clusters (consonant + virama + consonant sequences) are split across grapheme boundaries. This causes incorrect cursor positioning and deletion in Hindi, Bengali, Tamil, etc.

Minimal repro

Type a conjunct consonant in Hindi (e.g., 'क्ष') in Slate; press Backspace; only one codepoint is deleted instead of the full cluster.

Fix

Apply rule GB9c of UAX #29: do not break between a consonant followed by a linker (Indic_Conjunct_Break=Linker, i.e. a virama) and the next consonant, so the conjunct stays a single grapheme cluster.

Fix PR → #slate-indic-conjunct-grapheme

Surrogate & grapheme merged

Combining marks/ZWJ wrongly treated as punctuation in emphasis (wenmode)

wenmode · lepture/wenmode

Symptom

wenmode's is_punctuation function treats Unicode combining marks (Mn category) and format characters (Cf/ZWJ) as punctuation per the CommonMark spec, incorrectly suppressing valid emphasis around CJK text with diacritics or ZWJ sequences.

Minimal repro

Markdown text using emphasis (* or _) adjacent to a combining mark or ZWJ character; emphasis fails to render because combining marks are classified as punctuation.

Fix

Exclude Unicode General Categories Mn, Mc, Cf from the punctuation classification; add ASCII fast-path to avoid performance regression.

Merged PR → #wenmode-emphasis-flanking-marks

Surrogate & grapheme JS cited closed

grapheme-splitter breaks ZWJ emoji (flags, skin tones) into pieces

grapheme-splitter · orling/grapheme-splitter

Symptom

grapheme-splitter breaks ZWJ-joined emoji into parts instead of one grapheme cluster: the rainbow flag splits into its component glyphs, and skin-tone sequences come apart.

Minimal repro

new GraphemeSplitter().splitGraphemes('🏳️‍🌈') returns two elements instead of one.

Fix

Implement the Unicode emoji ZWJ sequence rules (UTS #51) so a ZWJ-joined emoji stays a single cluster.

Upstream issue → #grapheme-splitter-zwj-emoji

Surrogate & grapheme JS cited closed

lodash _.toArray splits a tag-sequence flag emoji into code points

lodash · lodash/lodash

Symptom

_.toArray splits an emoji built from a tag sequence (a subdivision flag) into its component code points instead of returning it as one element.

Minimal repro

_.toArray for the England flag emoji (a base flag plus tag characters) returns seven pieces instead of one.

Fix

Use a Unicode-aware iterator such as Intl.Segmenter that handles tag sequences when converting a string to an array.
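
To show why the flag comes apart, here is a small Python sketch that counts its code points and then keeps tag characters attached to the preceding base. The grouping function is a hypothetical illustration of the tag-sequence rule only, not lodash's code and not a complete grapheme segmenter:

```python
# England flag: WAVING BLACK FLAG + tag letters g, b, e, n, g + CANCEL TAG
ENGLAND = "\U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F"


def is_tag(ch):
    # Tag characters occupy U+E0020..U+E007F.
    return 0xE0020 <= ord(ch) <= 0xE007F


def group_tag_sequences(text):
    """Split into code points, but glue tag characters onto the previous item."""
    out = []
    for ch in text:
        if out and is_tag(ch):
            out[-1] += ch
        else:
            out.append(ch)
    return out


print(len(list(ENGLAND)), len(group_tag_sequences(ENGLAND)))
```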

Upstream issue → #lodash-toarray-tag-sequence

Surrogate & grapheme Windows Terminal cited closed

Windows Terminal ignores VS15 (U+FE0E) and forces the emoji style

Windows Terminal · microsoft/terminal

Symptom

Windows Terminal ignores the text-presentation variation selector U+FE0E, rendering the color-emoji form even when text style is explicitly requested.

Minimal repro

Print a text-style sequence such as U+23CF followed by U+FE0E; it renders as a color emoji instead of the text glyph.

Fix

Honor VS15 (U+FE0E) for text presentation and VS16 (U+FE0F) for emoji presentation, per the Unicode emoji variation sequences.

Upstream issue → #windows-terminal-vs15

Surrogate & grapheme JS cited open

emoji-regex matches a text-presentation char followed by U+FE0E (VS15)

emoji-regex · mathiasbynens/emoji-regex

Symptom

emoji-regex matches a base character even when it is followed by U+FE0E (the text variation selector), so text-presentation characters are wrongly classified as emoji.

Minimal repro

emojiRegex().test('\u2757\uFE0E') returns true even though U+FE0E requests text presentation.

Fix

Exclude a match followed by VS15 (U+FE0E); treat only a trailing VS16 (U+FE0F) or no selector as emoji.
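
The exclusion can be expressed with a negative lookahead. The pattern below is a hypothetical one-character stand-in for the library's much larger expression, written in Python for illustration:

```python
import re

# U+2757 stands in for "any emoji base"; the real pattern covers far more.
# (?!\ufe0e) rejects a base followed by the text selector; \ufe0f? accepts
# an optional emoji selector.
EMOJI = re.compile("\u2757(?!\ufe0e)\ufe0f?")

for sample in ("\u2757", "\u2757\ufe0f", "\u2757\ufe0e"):
    print(ascii(sample), bool(EMOJI.search(sample)))
```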

Upstream issue → #emoji-regex-text-vs15

Surrogate & grapheme TS open

kaplay styled-text styles desync after emoji / astral CJK (grapheme vs UTF-16)

kaplay · kaplayjs/kaplay

Symptom

compileStyledText builds charStyleMap keyed by UTF-16 code-unit offset, but formatText later applies the styles by grapheme index (via runes()). The two indexings match for ASCII but drift apart after any character longer than one code unit: an emoji, a ZWJ sequence, or an astral-plane CJK ideograph (CJK Extension B, e.g. names written with 𠮷). Every style after such a character lands on the wrong grapheme or is dropped.

Minimal repro

In styled text, "😀[c]x[/c]" keys the colour style at code unit 2, but runes("😀x") puts x at grapheme index 1, so the style is lost.

Fix

Make compileStyledText walk grapheme clusters with the same runes() helper formatText already uses, keying charStyleMap by grapheme index. Normalize the input to NFC up front and consume a whole grapheme per escape so the slice lengths stay consistent.
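
The drift between the two indexings can be reproduced outside kaplay. This Python sketch uses code-point index as a stand-in for grapheme index (they coincide for the single-code-point characters used here), and the helper name is hypothetical:

```python
def utf16_offset(text, index):
    """UTF-16 code-unit offset of the character at code-point position `index`."""
    return len(text[:index].encode("utf-16-le")) // 2


for text in ("ax", "\U0001F600x", "\U00020BB7x"):  # ASCII, emoji, CJK Extension B
    i = text.index("x")
    print(ascii(text), "index:", i, "utf16 offset:", utf16_offset(text, i))
```

For the ASCII string both numbers agree; after an astral character the code-unit offset is one larger, which is the gap that makes a style keyed by one scheme miss when looked up by the other.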

Fix PR → #kaplay-styled-text-grapheme

Segmentation / word count

2 entries open category page →

Word counts and reading-time estimates that split on spaces. CJK scripts put no spaces between words, so an entire Japanese or Chinese paragraph counts as one word: content gates reject valid answers and "min read" labels read as 1. The fix counts CJK characters separately from space-delimited words.

Segmentation / word count Python open

split() word count treats a spaceless CJK answer as one word (omi)

omi · BasedHardware/omi

Symptom

Onboarding decides whether a spoken answer has enough content with len(transcript.split()) >= 2. str.split() returns a one-element list for CJK text that has no spaces, so a full answer like 東京に住んでいます is counted as a single word, never reaches the LLM check, and the question stays marked unanswered for Japanese, Chinese, and Korean speakers.

Minimal repro

len('東京に住んでいます'.split())  # 1, so word_count >= 2 is False and the answer is rejected

Fix

Use the existing CJK-aware _word_count helper (already used by should_discard_conversation) instead of plain split(); it falls back to split() for non-CJK text, so English input is unchanged.
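
The source does not show the _word_count helper, so the following is only a sketch of what a CJK-aware count might look like; the function name and the character ranges are assumptions, and the ranges are illustrative rather than exhaustive:

```python
import re

# Hiragana, Katakana, CJK Extension A, CJK Unified Ideographs, Hangul syllables.
CJK_CHAR = re.compile("[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7a3]")


def cjk_aware_word_count(text):
    """Count each CJK character as one word, plus space-delimited non-CJK words."""
    cjk = len(CJK_CHAR.findall(text))
    rest = CJK_CHAR.sub(" ", text)  # blank out CJK so split() sees only the rest
    return cjk + len(rest.split())


print(len("東京に住んでいます".split()), cjk_aware_word_count("東京に住んでいます"))
```

Text with no CJK characters goes through split() unchanged, so English input produces the same count as before.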

Fix PR → #omi-onboarding-cjk-wordcount

Segmentation / word count TipTap open

TipTap word count treats a whole CJK paragraph as 1 word / 1 min read

emdash · emdash-cms/emdash

Symptom

The editor footer's word count and reading time come from TipTap's CharacterCount, whose default wordCounter splits on spaces. CJK scripts have no spaces between words, so a long Japanese or Chinese draft shows "1 word" and "1 min read" even though the published page, which already counts CJK separately, reports the correct time.

Minimal repro

Type a CJK paragraph into the editor; CharacterCount's default wordCounter, which counts the pieces of text.split(' '), returns 1, so the footer reports 1 word and 1 min read.

Fix

Configure CharacterCount with a wordCounter that counts CJK characters individually, and derive reading time from the text using the same word/CJK split and rates (200 words/min, 500 CJK chars/min) as the published reading-time util, so the editor and the rendered page agree.
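
The reading-time arithmetic with the two rates quoted above can be sketched as follows. The function name, the round-up, and the one-minute floor are assumptions; the published util may round differently:

```python
import math


def reading_minutes(words, cjk_chars, wpm=200, cjk_cpm=500):
    """Combine space-delimited words and CJK characters at their own rates."""
    minutes = words / wpm + cjk_chars / cjk_cpm
    return max(1, math.ceil(minutes))  # assumption: round up, never below 1


print(reading_minutes(words=1, cjk_chars=0))     # what a space-split count reports
print(reading_minutes(words=0, cjk_chars=1500))  # the same draft counted per character
```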

Fix PR → #emdash-readingtime-cjk-wordcount

Numerals

2 entries open category page →

Kanji numerals, including the daiji (traditional) forms used in legal and financial documents.

Numerals Python open

kanji2number cannot parse 萬, the daiji form of 万 (Kanjize)

Kanjize · nagataaaas/Kanjize

Symptom

kanji2number cannot parse 萬, the daiji (大字) traditional form of 万 (10,000), so legal and financial documents that use 大字 numerals fail to convert.

Minimal repro

from kanjize import kanji2number
kanji2number('萬')  # not recognized as 10000

Fix

Map 萬 to 万 (10,000) in kanji2number.
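
One possible shape for such a mapping is a normalization pass that runs before parsing. This sketch does not call Kanjize and is not its internal code; the table name and the extra daiji entries beyond 萬 are illustrative additions:

```python
# Daiji (formal) numerals mapped to their everyday forms.
DAIJI_TO_COMMON = str.maketrans({
    "萬": "万",  # 10,000
    "壱": "一",  # 1
    "弐": "二",  # 2
    "参": "三",  # 3
    "拾": "十",  # 10
})


def normalize_daiji(text):
    """Rewrite daiji numerals so a parser that knows only common forms can read them."""
    return text.translate(DAIJI_TO_COMMON)


print(normalize_daiji("壱萬"))
```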

Fix PR → #kanjize-daiji-man

Numerals JS open

formatjs relativetimeformat ignores numberingSystem (always Latin digits)

formatjs · formatjs/formatjs

Symptom

formatjs intl-relativetimeformat ignores the numberingSystem locale option (e.g., 'jpan', 'arab'), always producing Latin numerals in relative time strings.

Minimal repro

new Intl.RelativeTimeFormat('ja-JP-u-nu-jpan', {}).format(-1, 'day') returns '1日前' with the Latin digit '1' instead of '一日前' with the Japanese numeral.

Fix

Apply the numberingSystem extension from the locale tag when formatting relative time numbers.
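
Reading the extension out of the tag is the first step. A rough Python sketch of that lookup, with a hypothetical function name and without the validation a real BCP 47 parser performs:

```python
def numbering_system(tag):
    """Return the value of the -u-nu- key in a locale tag, or None if absent."""
    parts = tag.lower().split("-")
    if "u" not in parts:
        return None
    ext = parts[parts.index("u") + 1:]
    for i, part in enumerate(ext):
        if len(part) == 1:  # another singleton ends the -u- extension
            break
        if part == "nu" and i + 1 < len(ext):
            return ext[i + 1]
    return None


print(numbering_system("ja-JP-u-nu-jpan"))
```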

Fix PR → #formatjs-numbering-system

Locale data

24 entries open category page →

Missing or wrong locale data: untranslated placeholders, mistranslations that flip meaning, labels left in English, and parse tables that drop a locale's diacritics so a formatted month will not parse back.

Locale data React merged

Wrong Japanese expand/collapse labels in Ant Design Typography

ant-design · ant-design/ant-design

Symptom

The ja-JP labels for Typography's expand/collapse control were incorrect, so Japanese users saw the wrong 展開/折りたたみ text.

Minimal repro

1. Set ConfigProvider locale to ja_JP.
2. Render <Typography.Paragraph ellipsis={{ expandable: true }}>.
3. The expand/collapse label is wrong.

Fix

Correct the ja-JP expand/collapse label strings.

Merged PR → #ant-design-typography-ja-labels

Locale data Vue merged

naive-ui week date-picker placeholder left untranslated in Japanese

naive-ui · tusen-ai/naive-ui

Symptom

weekPlaceholder was missing from the Japanese locale, so the week date-picker showed an English placeholder.

Minimal repro

Use the date picker in week mode with the ja locale; the placeholder renders in English.

Fix

Add the Japanese weekPlaceholder translation.

Merged PR → #naive-ui-week-placeholder-ja

Locale data JS merged

timeago.js Japanese future times say 以内 (within) instead of 後 (later)

timeago.js · hustcc/timeago.js

Symptom

Future timestamps in the ja locale used 以内 (within) instead of 後 (later), so '3 minutes from now' rendered as 3分以内 (within 3 minutes), which changes the meaning.

Minimal repro

Format a future timestamp with timeago.js under the ja locale; it renders '3分以内' instead of '3分後'.

Fix

Use 後 for future time strings in the ja locale.

Merged PR → #timeago-ja-future-go

Locale data JS merged

cronstrue Japanese day-of-month step description is mistranslated

cronstrue · bradymholt/cRonstrue

Symptom

The Japanese description for a day-of-month step was not scoped to a month, producing a mistranslated cron description.

Minimal repro

cronstrue.toString('0 0 1-31/3 * *', { locale: 'ja' }); the description is not scoped to the month.

Fix

Scope the ja day-of-month step description to a month.

Merged PR → #cronstrue-ja-day-of-month-step

Locale data i18n merged

PrimeLocale Japanese: filterConstraint mistranslated 成約 → 制約

primelocale · primefaces/primelocale

Symptom

The PrimeLocale Japanese (ja) locale uses 成約 (meaning 'conclusion of a contract') for the 'filterConstraint' aria label, which should be 制約 ('constraint/restriction').

Minimal repro

PrimeFaces component with aria-label for filter constraint in Japanese reads '成約' (contract) instead of '制約' (constraint).

Fix

Change filterConstraint value from '成約' to '制約' in the ja.json locale file.

Merged PR → #primelocale-ja-filterconstraint

Locale data Vue open

Vant Japanese (ja-JP) locale translation errors

vant · youzan/vant

Symptom

Vant component library Japanese (ja-JP) locale contains incorrect translations for multiple UI strings.

Minimal repro

Vant components with ja-JP locale display mistranslated strings for picker, datetime, and other components.

Fix

Correct multiple mistranslated strings in vant/src/locale/lang/ja-JP.ts.

Fix PR → #vant-ja-jp-locale

Locale data React open

MUI X Japanese (ja-JP) locale is incomplete / has wrong translations

mui-x · mui/mui-x

Symptom

MUI X Japanese (ja-JP) locale has incomplete or incorrect translations for data grid, date picker, and time picker components.

Minimal repro

MUI X components with jaJP locale show untranslated English strings or incorrect Japanese.

Fix

Improve Japanese translations across multiple MUI X locale strings.

Fix PR → #mui-x-ja-jp-locale

Locale data React open

Semi Design is missing Japanese text in the Upload crop modal

semi-design · DouyinFE/semi-design

Symptom

Semi Design's Japanese locale is missing translation strings for the Upload component's crop modal UI.

Minimal repro

Semi Design Upload component with crop modal in Japanese locale shows untranslated (English) strings in the crop modal.

Fix

Add missing Japanese translation keys for Upload crop modal to ja-JP locale.

Fix PR → #semi-design-ja-upload-crop

Locale data Angular open

NG-ZORRO Japanese (ja_JP) missing quarter placeholders break the date picker

ng-zorro-antd · NG-ZORRO/ng-zorro-antd

Symptom

NG-ZORRO Angular Japanese (ja_JP) locale is missing quarter placeholder strings, causing runtime errors or blank UI for date range pickers.

Minimal repro

NG-ZORRO DatePicker quarter mode with ja_JP locale throws error or shows blank quarter placeholders.

Fix

Add missing quarter i18n keys to ja_JP locale file.

Fix PR → #ng-zorro-ja-quarter-placeholders

Locale data JS open

FilePond Japanese label: 読込中 should be アップロード中 (uploading)

filepond · pqina/filepond

Symptom

FilePond Japanese locale uses '読込中' (loading/reading) for the file processing label, which should be 'アップロード中' (uploading) to accurately describe the action.

Minimal repro

FilePond with Japanese locale shows '読込中' during file upload, misleading users into thinking a read is occurring.

Fix

Change labelFileProcessing from '読込中' to 'アップロード中' in the Japanese locale.

Fix PR → #filepond-ja-file-processing-label

Locale data React open

Arco Design Japanese (ja-JP) locale translation errors

arco-design · arco-design/arco-design

Symptom

Arco Design Japanese (ja-JP) locale contains multiple translation errors across component strings.

Minimal repro

Arco Design components with ja-JP locale show incorrect Japanese translations.

Fix

Correct translation errors in arco-design/arco-design/components/_utils/locale/ja-JP.ts.

Fix PR → #arco-design-ja-jp-locale

Locale data JS open

jsoneditor Japanese locale is incomplete (strings fall back to English)

jsoneditor · josdejong/jsoneditor

Symptom

jsoneditor's Japanese locale is incomplete; many UI strings remain in English when ja locale is selected.

Minimal repro

jsoneditor with locale:'ja' shows English strings for missing Japanese translations.

Fix

Fill all missing Japanese translation keys in the ja locale file.

Fix PR → #jsoneditor-ja-incomplete

Locale data Vue open

Quasar Japanese locale mistranslations in editor/tree labels

quasar · quasarframework/quasar

Symptom

Quasar Framework Japanese locale has mistranslations in editor toolbar labels and tree component strings.

Minimal repro

Quasar editor toolbar or tree component in Japanese shows incorrect label text.

Fix

Correct mistranslated Japanese strings in quasar/lang/ja.js.

Fix PR → #quasar-ja-editor-tree-labels

Locale data JS open

Uppy Japanese folderAdded smart_count plural placeholder is broken

uppy · transloadit/uppy

Symptom

Uppy's Japanese (ja_JP) locale has a broken smart_count plural placeholder in the 'folderAdded' string, causing pluralization to fail and show a raw placeholder.

Minimal repro

Uppy file picker in Japanese displays 'folderAdded' with a visible placeholder token instead of correct plural text.

Fix

Fix the smart_count placeholder syntax in ja_JP's folderAdded locale string.

Fix PR → #uppy-ja-smartcount-folderadded

Locale data JS open

Video.js is missing the Japanese label for Picture-in-Picture

video.js · videojs/video.js

Symptom

Video.js Japanese (ja) locale is missing the translation for the 'Playing in Picture-in-Picture' accessibility string, falling back to English.

Minimal repro

Video.js in Japanese locale; PiP mode accessible label shows 'Playing in Picture-in-Picture' in English.

Fix

Add 'ピクチャーインピクチャーで再生中' or equivalent to the ja.json locale.

Fix PR → #videojs-ja-pip-label

Locale data React merged

Memos Japanese locale: 24 missing translation keys (fall back to English)

memos · usememos/memos

Symptom

Memos Japanese (ja) locale has 24 missing translation keys; these fall back to English in the UI.

Minimal repro

Memos with Japanese locale selected; 24 UI strings display in English.

Fix

Fill all 24 missing Japanese translation keys in locales/ja.json.

Merged PR → #memos-ja-missing-keys

Locale data i18n open

Appwrite Japanese emails: 48 missing translation keys (sent in English)

appwrite · appwrite/appwrite

Symptom

Appwrite Japanese email templates have 48 missing translation keys, causing emails to be sent in English to Japanese users.

Minimal repro

Appwrite with Japanese locale; email notifications contain 48 English fallback strings.

Fix

Translate all 48 missing keys in app/config/locale/translations/ja.json email templates.

Fix PR → #appwrite-ja-email-keys

Locale data React merged

Medusa Japanese dashboard: 511 missing translation keys

medusa · medusajs/medusa

Symptom

Medusa dashboard Japanese (ja) locale has 511 missing translation keys, leaving most of the admin UI in English for Japanese users.

Minimal repro

Medusa admin dashboard with Japanese locale; majority of UI strings display in English.

Fix

Translate all 511 missing keys in packages/admin-ui/ui/src/i18n/translations/ja.json.

Merged PR → #medusa-ja-dashboard-keys

Locale data JS open

jp-prefectures.js: Aichi (愛知県) English name wrongly set to 'ehime'

jp-prefectures.js · hatsu38/jp-prefectures.js

Symptom

jp-prefectures.js has the English name of Aichi prefecture (愛知県) set to 'ehime' (which is Ehime prefecture / 愛媛県), causing incorrect prefecture mapping.

Minimal repro

jpPrefectures.findByCode(23).enName returns 'ehime' instead of 'aichi'.

Fix

Change enName for 愛知県 (code 23) from 'ehime' to 'aichi'.

Fix PR → #jp-prefectures-aichi-enname

Locale data JS open

date-fns Galician formats June as xuño but cannot parse it back

date-fns · date-fns/date-fns

Symptom

In the gl (Galician) locale the June parse pattern is /^xun/i. It matches the abbreviation "xun" but not the wide form "xuño", because the third character is ñ, not n. So format then parse round-trips fail for June; the locale's own snapshot already records Invalid Date for June while the other eleven months parse.

Minimal repro

import { format, parse } from 'date-fns';
import { gl } from 'date-fns/locale';
const s = format(new Date(2021, 5, 1), 'MMMM', { locale: gl }); // 'xuño'
parse(s, 'MMMM', new Date(), { locale: gl });               // Invalid Date

Fix

Widen the June pattern to /^xu[nñ]/i so it matches both "xun" and "xuño" while staying distinct from July ("xul"), mirroring locales such as Catalan that already fold diacritics into their patterns (e.g. /^març/i).
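
The two patterns can be compared directly. This Python sketch uses the re module rather than date-fns, so it shows only the regex behaviour, not the locale's parsing pipeline:

```python
import re

OLD_JUNE = re.compile("^xun", re.IGNORECASE)
NEW_JUNE = re.compile("^xu[n\u00f1]", re.IGNORECASE)  # \u00f1 is ñ

for name in ("xun", "xu\u00f1o", "xullo"):  # June abbreviation, June wide form, July
    print(name, bool(OLD_JUNE.match(name)), bool(NEW_JUNE.match(name)))
```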

Fix PR → #date-fns-gl-xuno-month-parse

Locale data React open

MUI Pagination aria-labels fall back to English in the zh-CN locale

material-ui · mui/material-ui

Symptom

The zh-CN locale is missing the MuiPagination block, so the pagination aria-label and getItemAriaLabel text fall back to English even though every other component in the file is localized. Among all locales only the three Chinese ones omit it; ja-JP and ko-KR already localize this block.

Minimal repro

Render a MUI <Pagination> under the zhCN locale and inspect it with a screen reader; the navigation aria-label and per-page item labels are announced in English.

Fix

Add the MuiPagination block to zhCN, reusing the noun phrases already in this file's MuiTablePagination.getItemAriaLabel (第一页 / 最后一页 / 下一页 / 上一页) plus 转到 ("Go to") to match the English source.

Fix PR → #material-ui-zhcn-pagination

Locale data JS open

Select2 Japanese locale is missing the removeItem / search ARIA labels

select2 · select2/select2

Symptom

The Japanese (ja) locale is missing the removeItem and search keys, so Japanese users get the English fallback for two ARIA labels: the per-item remove button (used in selection/multiple.js) and the search field (used in selection/search.js and dropdown/search.js). Both are part of the canonical set in en.js.

Minimal repro

Use a multi-select Select2 under the ja locale with a screen reader; the remove-item button and the search field announce their English fallback labels.

Fix

Add removeItem (アイテムを削除) and search (検索) to the ja locale, following the existing removeAllItems wording (すべてのアイテムを削除).

Fix PR → #select2-ja-removeitem-search

Locale data Vue open

naive-ui weekPlaceholder untranslated in Korean and Traditional Chinese (Select Week)

naive-ui · tusen-ai/naive-ui

Symptom

weekPlaceholder was left as the English fallback "Select Week" in the Korean (koKR) and Traditional Chinese (zhTW) DatePicker locales, so the week-mode date picker showed an English placeholder. zhCN and jaJP were already translated; these were the remaining CJK locales still showing the fallback.

Minimal repro

Use the date picker in week mode under the koKR or zhTW locale; the placeholder renders "Select Week" in English.

Fix

Translate weekPlaceholder to 주 선택 (koKR, following the existing 월 선택 / 년 선택 pattern) and 選擇週 (zhTW, the Traditional form of zhCN's 选择周), propagating the jaJP fix from #8114 to the remaining CJK locales.

Fix PR → #naive-ui-week-placeholder-cjk

Unicode range

1 entry open category page →

Character-class ranges that drift out of sync, so valid non-ASCII letters fall just outside the accepted set. An alpha rule accepts a letter that the matching alphanumeric rule rejects, and common accented characters fail validation even though their unaccented neighbours pass.

Unicode range JS open

validator.js isAlphanumeric el-GR rejects accented Greek that isAlpha accepts

validator.js · validatorjs/validator.js

Symptom

isAlpha('el-GR') uses the Greek range [Α-ώ] (U+0391–U+03CE) but isAlphanumeric('el-GR') still ends at ω ([0-9Α-ω], U+03C9). So isAlphanumeric rejects ό, ύ, ώ and the uppercase Ό/Ύ/Ώ even though isAlpha accepts them, and common words like νερό or πρώτα pass isAlpha but fail isAlphanumeric.

Minimal repro

isAlpha('νερό', 'el-GR')        // true
isAlphanumeric('νερό', 'el-GR') // false  <- ό (U+03CC) sits past the range end ω

Fix

Bump the alphanumeric Greek range to [0-9Α-ώ] so it matches the alpha class.
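
The gap between the two range ends is easy to check in isolation. This sketch rebuilds the character classes in Python with explicit code points. It mirrors the ranges quoted above but is not validator.js itself, and it tests lowercase input only:

```python
import re

ALPHA = re.compile("[\u0391-\u03ce]+")            # Α..ώ
ALNUM_OLD = re.compile("[0-9\u0391-\u03c9]+")     # Α..ω, stops short of ό ύ ώ
ALNUM_FIXED = re.compile("[0-9\u0391-\u03ce]+")   # Α..ώ, same end as ALPHA

word = "\u03bd\u03b5\u03c1\u03cc"  # νερό, ending in ό (U+03CC)
print(bool(ALPHA.fullmatch(word)),
      bool(ALNUM_OLD.fullmatch(word)),
      bool(ALNUM_FIXED.fullmatch(word)))
```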

Fix PR → #validatorjs-el-gr-alphanumeric-range

Regex roundtrip

1 entry open category page →

Parse to AST and back. A character that is special in one position (a leading caret, a hyphen) can be emitted unescaped and change what the pattern matches.

Regex roundtrip JS open

regexp-tree emits a leading ^ unescaped, turning [a^] into [^a]

regexp-tree · DmitrySoshnikov/regexp-tree

Symptom

Optimizing or regenerating a character class can move a literal ^ to the front and emit it unescaped, flipping the meaning: [a^] round-trips to [^a] (a negated class).

Minimal repro

const rt = require('regexp-tree')
rt.optimize('/[a^]/').toString() // '/[^a]/'  <- now matches the negation

Fix

In the generator, escape a leading literal ^ in a non-negated character class so generate(parse(x)) preserves meaning.
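
A toy generator makes the rule concrete. The function below is hypothetical, handles only the leading-caret case (a real generator must also escape ], \ and -), and uses Python's re module to check the result:

```python
import re


def emit_class(chars, negated=False):
    """Serialize a character class, escaping a literal ^ that lands in first position."""
    body = "".join(chars)
    if not negated and body.startswith("^"):
        body = "\\" + body  # [^a] would negate; [\^a] keeps ^ literal
    return "[" + ("^" if negated else "") + body + "]"


pattern = emit_class(["^", "a"])
print(pattern, bool(re.fullmatch(pattern, "a")), bool(re.fullmatch("[^a]", "a")))
```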

Fix PR → #regexp-tree-leading-caret

Codegen escape

2 entries open category page →

Generators that interpolate text without escaping it. Special characters (including CJK and non-ASCII identifiers) leak into the output and break it.

Codegen escape TS open

json-schema-to-typescript: enum names with special chars produce invalid TypeScript

json-schema-to-typescript · bcherny/json-schema-to-typescript

Symptom

Enum member names containing special characters (including non-ASCII / CJK) are emitted unescaped, so the generated enum does not compile.

Minimal repro

Generate types from a schema whose enum values contain quotes or special characters; the emitted enum has invalid member identifiers and tsc errors.

Fix

Escape enum member names so special characters produce valid output.

Fix PR → #json-schema-to-typescript-enum-escape

Codegen escape JS open

Markdoc formatter over-escapes a mid-line # (C# becomes C\#)

markdoc · markdoc/markdoc

Symptom

The formatter over-escapes a # in the middle of a line because the heading branch of the escape regex is not anchored to line start: 'C# is a language' becomes 'C\# is a language'.

Minimal repro

format(parse('C# is a language')) // 'C\# is a language' - the '#' is escaped mid-sentence

Fix

Anchor the heading alternative: change #+\s to ^#+\s, mirroring the already-anchored ^* and ^> cases.
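
The effect of the anchor can be seen with the heading branch alone. This Python sketch isolates that one alternative and is not Markdoc's full escape expression; the helper name is hypothetical:

```python
import re

UNANCHORED = re.compile(r"(#+\s)")
ANCHORED = re.compile(r"^(#+\s)", re.MULTILINE)  # only at the start of a line


def escape_headings(pattern, text):
    # Prefix each match with a backslash.
    return pattern.sub(r"\\\1", text)


print(escape_headings(UNANCHORED, "C# is a language"))
print(escape_headings(ANCHORED, "C# is a language"))
print(escape_headings(ANCHORED, "# Title"))
```

The anchored form still escapes a # that would start a heading, which is the only position where the escape is needed.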

Fix PR → #markdoc-mid-line-hash-escape

Encoding & BOM

1 entry open category page →

Byte-order marks and charset edges. One code path strips a leading U+FEFF and a sibling does not, so the first field name or CSV header key arrives with an invisible BOM glued to it and lookups by that key silently miss.

Encoding & BOM DenoTS cited open

Deno @std/csv CsvParseStream leaves a BOM glued to the first header key

@std/csv · denoland/std

Symptom

The synchronous parse() strips a leading UTF-8 byte-order mark (U+FEFF) but CsvParseStream does not. When a CSV begins with a BOM, as is common in output from Excel and other Windows tools, the first field name arrives as "\uFEFFname" instead of "name", silently corrupting header-based lookups.

Minimal repro

import { CsvParseStream } from "jsr:@std/csv";
const src = ReadableStream.from(['\uFEFFname,age\n', 'Alice,34\n']);
await Array.fromAsync(src.pipeThrough(new CsvParseStream({ skipFirstRow: true })));
// [{ '\uFEFFname': 'Alice', age: '34' }]  <- BOM leaked into the key

Fix

Strip the BOM from the first line read by StreamLineReader, matching what parse() already does via its BYTE_ORDER_MARK constant.
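
The same pattern, stripping the mark from the first chunk only before the parser sees it, can be sketched in Python. Here csv.DictReader stands in for CsvParseStream and the generator name is hypothetical:

```python
import csv

BOM = "\ufeff"


def strip_bom_from_first(chunks):
    """Yield chunks unchanged, except drop a leading BOM from the very first one."""
    first = True
    for chunk in chunks:
        if first:
            first = False
            if chunk.startswith(BOM):
                chunk = chunk[len(BOM):]
        yield chunk


lines = ["\ufeffname,age\n", "Alice,34\n"]
leaky = list(csv.DictReader(lines))
clean = list(csv.DictReader(strip_bom_from_first(lines)))
print(list(leaky[0].keys()), list(clean[0].keys()))
```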

Upstream issue → #deno-std-csv-bom-stream
