-
cli-table3 splits surrogate pairs (emoji / CJK) when truncating wide text
cli-table3 truncates text by byte/code-unit count rather than by code point, so a cut can land inside a surrogate pair in an emoji or supplementary-plane CJK character. The orphaned half shows up as garbled output (mojibake) in terminal table cells.
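The difference between the two counts can be sketched as follows. This is a minimal illustration, not cli-table3's code: `truncateByCodePoint` and its parameters are hypothetical names, and the sketch counts code points only (it ignores display width and grapheme clusters).

```typescript
// Hypothetical helper, not part of cli-table3's API.
// Truncates to at most `maxCodePoints` code points, ellipsis included.
function truncateByCodePoint(
  text: string,
  maxCodePoints: number,
  ellipsis: string = "…"
): string {
  // Array.from iterates a string by code point, so surrogate pairs stay whole.
  const codePoints = Array.from(text);
  if (codePoints.length <= maxCodePoints) return text;
  const keep = Math.max(0, maxCodePoints - Array.from(ellipsis).length);
  return codePoints.slice(0, keep).join("") + ellipsis;
}

// A code-unit cut, by contrast, can stop halfway through the emoji:
const naive = "ab\u{1F600}cd".slice(0, 3); // ends with a lone high surrogate
```

A cell renderer would still need width-aware logic on top of this, since wide characters occupy two terminal columns.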
-
Clerk truncate splits surrogate pairs in emoji / non-BMP characters
Clerk UI's truncateWithEndVisible function uses substring/slice on raw code units in its short-width fallback, splitting surrogate pairs in emoji or non-BMP characters.
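One way to keep a code-unit based fallback while avoiding split pairs is to nudge each cut index off a surrogate boundary. The sketch below assumes a head-plus-tail truncation shape; `safeCutIndex` and `truncateKeepingEnd` are hypothetical names and do not reflect Clerk's actual signature.

```typescript
function isHighSurrogate(unit: number): boolean {
  return unit >= 0xd800 && unit <= 0xdbff;
}

function isLowSurrogate(unit: number): boolean {
  return unit >= 0xdc00 && unit <= 0xdfff;
}

// Returns `index`, moved back by one code unit if it would fall
// between the two halves of a surrogate pair.
function safeCutIndex(text: string, index: number): number {
  if (index <= 0) return 0;
  if (index >= text.length) return text.length;
  const before = text.charCodeAt(index - 1);
  const after = text.charCodeAt(index);
  return isHighSurrogate(before) && isLowSurrogate(after) ? index - 1 : index;
}

// Hypothetical stand-in for a "keep the end visible" truncation.
function truncateKeepingEnd(text: string, headUnits: number, tailUnits: number): string {
  if (text.length <= headUnits + tailUnits) return text;
  const headEnd = safeCutIndex(text, headUnits);
  // For the tail, move forward instead so the orphaned low surrogate is dropped.
  let tailStart = text.length - tailUnits;
  if (safeCutIndex(text, tailStart) !== tailStart) tailStart += 1;
  return text.slice(0, headEnd) + "…" + text.slice(tailStart);
}
```

The result may be one unit shorter than requested on either side, which is usually preferable to emitting a lone surrogate.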
-
opentype.js does not clamp cmap format 12/13 codes to U+10FFFF
opentype.js does not clamp cmap format 12/13 character codes to U+10FFFF; malformed fonts with out-of-range codes cause incorrect glyph lookups for supplementary characters.
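The clamping described above could look roughly like this. It is a sketch under stated assumptions, not opentype.js code: `CmapGroup`, `clampGroup`, and `buildGlyphIndexMap` are hypothetical names, with the group fields modelled on the cmap group record layout (format 13 stores a single glyph ID in the same slot).

```typescript
const MAX_CODE_POINT = 0x10ffff;

// Hypothetical shape of one cmap format 12/13 group record.
interface CmapGroup {
  startCharCode: number;
  endCharCode: number;
  startGlyphID: number; // in format 13 this slot holds one glyph ID for the whole range
}

// Drops groups that lie entirely outside Unicode and trims the rest.
function clampGroup(group: CmapGroup): CmapGroup | null {
  if (group.startCharCode > MAX_CODE_POINT) return null;
  if (group.startCharCode > group.endCharCode) return null;
  return { ...group, endCharCode: Math.min(group.endCharCode, MAX_CODE_POINT) };
}

function buildGlyphIndexMap(groups: CmapGroup[], format: 12 | 13): Map<number, number> {
  const map = new Map<number, number>();
  for (const raw of groups) {
    const group = clampGroup(raw);
    if (group === null) continue;
    for (let code = group.startCharCode; code <= group.endCharCode; code++) {
      // Format 12 maps sequentially; format 13 maps the whole range to one glyph.
      const glyph =
        format === 12 ? group.startGlyphID + (code - group.startCharCode) : group.startGlyphID;
      map.set(code, glyph);
    }
  }
  return map;
}
```

Without the clamp, a malformed `endCharCode` such as 0xFFFFFFFF would also make the loop run far longer than any valid font requires.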
-
markdown-it smart quotes break around non-BMP (U+10000+) characters
markdown-it's smart quotes replacement does not handle non-BMP punctuation and symbols (U+10000+); when a quote is adjacent to a supplementary character, quotes are paired incorrectly or left unconverted.
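The underlying problem is that reading a single code unit next to a quote yields a lone surrogate, which no punctuation or symbol test recognizes. A minimal sketch of reading the whole code point instead follows; `codePointBefore` and `isPunctOrSymbol` are hypothetical helpers, not markdown-it functions.

```typescript
// Code point that ends just before `index` (a UTF-16 offset),
// or undefined at the start of the string.
function codePointBefore(text: string, index: number): number | undefined {
  if (index <= 0) return undefined;
  const unit = text.charCodeAt(index - 1);
  if (unit >= 0xdc00 && unit <= 0xdfff && index >= 2) {
    const lead = text.charCodeAt(index - 2);
    // A low surrogate preceded by a high surrogate forms one code point.
    if (lead >= 0xd800 && lead <= 0xdbff) return text.codePointAt(index - 2);
  }
  return unit;
}

// General-category check on a full code point.
function isPunctOrSymbol(codePoint: number | undefined): boolean {
  if (codePoint === undefined) return false;
  return /[\p{P}\p{S}]/u.test(String.fromCodePoint(codePoint));
}
```

The character after a quote needs no extra helper, since `codePointAt` already reads a full pair when given the index of the high surrogate.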
-
Slate splits Indic conjunct clusters (UAX #29 GB9c) across graphemes
The Slate rich-text editor does not implement rule GB9c of Unicode UAX #29, so it splits Indic conjunct clusters (consonant + virama + consonant sequences) across grapheme boundaries. This causes incorrect cursor positioning and deletion in Hindi, Bengali, Tamil, etc.
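To illustrate what GB9c changes, the sketch below post-processes grapheme segments and rejoins a cluster ending in consonant + virama with a following consonant. It is a deliberately simplified approximation: it hard-codes a Devanagari subset (consonants U+0915 to U+0939, nukta U+093C, virama U+094D) rather than the full InCB property data, and `devanagariClusters` is a hypothetical name, not Slate code.

```typescript
// Simplified Devanagari-only stand-ins for the InCB=Consonant and InCB=Linker classes.
const ENDS_WITH_LINKED_CONSONANT = /[\u0915-\u0939]\u093C?\u094D$/;
const STARTS_WITH_CONSONANT = /^[\u0915-\u0939]/;

function devanagariClusters(text: string): string[] {
  const segmenter = new Intl.Segmenter("hi", { granularity: "grapheme" });
  const clusters: string[] = [];
  for (const { segment } of segmenter.segment(text)) {
    const last = clusters.length - 1;
    if (
      last >= 0 &&
      ENDS_WITH_LINKED_CONSONANT.test(clusters[last]) &&
      STARTS_WITH_CONSONANT.test(segment)
    ) {
      // No break between "consonant + virama" and the next consonant.
      clusters[last] += segment;
    } else {
      clusters.push(segment);
    }
  }
  return clusters;
}
```

On runtimes whose segmenter already applies GB9c the merge step is a no-op, so the function gives the same conjunct grouping either way.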
-
Combining marks/ZWJ wrongly treated as punctuation in emphasis (wenmode)
wenmode's is_punctuation function classifies Unicode combining marks (category Mn) and format characters (category Cf, such as ZWJ) as punctuation when applying the CommonMark emphasis rules, which incorrectly suppresses valid emphasis around CJK text with diacritics or ZWJ sequences.
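For comparison, a classifier restricted to the P (punctuation) and S (symbol) general categories, which is how recent versions of the CommonMark spec define Unicode punctuation, leaves combining marks and ZWJ alone. The sketch below is TypeScript for illustration only and is not wenmode's implementation; `isUnicodePunctuation` is a hypothetical name.

```typescript
// True only for a single code point in general category P or S.
// Combining marks (Mn) and format characters (Cf, e.g. ZWJ) are excluded.
function isUnicodePunctuation(ch: string): boolean {
  return /^[\p{P}\p{S}]$/u.test(ch);
}

// Sample inputs covering each category mentioned above.
const samples: Array<[string, boolean]> = [
  ["!", isUnicodePunctuation("!")],           // Po
  ["\u3002", isUnicodePunctuation("\u3002")], // Po, ideographic full stop
  ["$", isUnicodePunctuation("$")],           // Sc
  ["\u0301", isUnicodePunctuation("\u0301")], // Mn, combining acute accent
  ["\u200D", isUnicodePunctuation("\u200D")], // Cf, zero width joiner
];
```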
-
grapheme-splitter breaks ZWJ emoji (flags, skin tones) into pieces
grapheme-splitter breaks ZWJ-joined emoji into parts instead of one grapheme cluster: the rainbow flag splits into its component glyphs, and skin-tone sequences come apart.
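As a point of comparison, the built-in `Intl.Segmenter` applies the extended grapheme cluster rules and keeps these sequences together. This is a sketch of an alternative, not a patch for grapheme-splitter; `splitGraphemes` is a hypothetical wrapper name.

```typescript
// Splits text into extended grapheme clusters using the runtime's segmenter.
function splitGraphemes(text: string): string[] {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

// Rainbow flag: white flag + VS16 + ZWJ + rainbow.
const rainbowFlag = "\u{1F3F3}\uFE0F\u200D\u{1F308}";
// Thumbs up + medium skin tone modifier.
const thumbsUpMedium = "\u{1F44D}\u{1F3FD}";
```

Each of the two constants is several code points long but comes back from `splitGraphemes` as a single element.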
-
lodash _.toArray splits a tag-sequence flag emoji into code points
_.toArray splits an emoji built from a tag sequence (a subdivision flag) into its component code points instead of returning it as one element.
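A regex-based splitter can keep a tag sequence whole by matching it before falling back to single code points. The pattern below is a simplified illustration covering only tag sequences (base U+1F3F4, tag characters U+E0020 to U+E007E, terminator U+E007F); it is not lodash's internal pattern, and `toSymbols` is a hypothetical name.

```typescript
// Tag sequence first, then any single code point (the `u` flag makes
// [\s\S] match a whole surrogate pair).
const SYMBOL = /\u{1F3F4}[\u{E0020}-\u{E007E}]+\u{E007F}|[\s\S]/gu;

function toSymbols(text: string): string[] {
  return text.match(SYMBOL) || [];
}

// Flag of Scotland: black flag + tag letters "gbsct" + cancel tag.
const scotland = "\u{1F3F4}\u{E0067}\u{E0062}\u{E0073}\u{E0063}\u{E0074}\u{E007F}";
```

Splitting `scotland` by code point with `Array.from` yields seven elements, while `toSymbols` returns it as one.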
-
Windows Terminal ignores VS15 (U+FE0E) and forces the emoji style
Windows Terminal ignores the text-presentation variation selector U+FE0E, rendering the color-emoji form even when text style is explicitly requested.
-
emoji-regex matches a text-presentation char followed by U+FE0E (VS15)
emoji-regex matches a base character even when it is followed by U+FE0E (the text variation selector), so text-presentation characters are wrongly classified as emoji.
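The intended behavior can be sketched with a negative lookahead that rejects a base character followed by U+FE0E. The pattern below is a much-simplified stand-in built on the `Extended_Pictographic` property, not emoji-regex's generated pattern, and `findEmoji` is a hypothetical name.

```typescript
// Match a pictographic base unless the text variation selector follows it;
// an emoji variation selector (U+FE0F), if present, is kept with the match.
const EMOJI_NOT_TEXT_STYLE = /\p{Extended_Pictographic}(?!\uFE0E)\uFE0F?/gu;

function findEmoji(text: string): string[] {
  return text.match(EMOJI_NOT_TEXT_STYLE) || [];
}

const heartAsText = "\u2764\uFE0E";  // text presentation requested
const heartAsEmoji = "\u2764\uFE0F"; // emoji presentation requested
```

With this pattern `heartAsText` produces no match, while `heartAsEmoji` is returned together with its selector.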
-
kaplay styled-text styles desync after emoji / astral CJK (grapheme vs UTF-16)
compileStyledText builds charStyleMap keyed by UTF-16 code-unit length, but formatText later applies the styles by grapheme index (via runes()). The two indexings match for ASCII, but drift apart after any character longer than one code unit: an emoji, a ZWJ sequence, or an astral-plane CJK ideograph (CJK Extension B, e.g. names written with 𠮷). Every style after such a character lands on the wrong grapheme or is dropped.
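One way to reconcile the two indexings is to translate code-unit offsets into grapheme indices before styles are looked up. The sketch below uses `Intl.Segmenter` as the grapheme source, which may not split exactly as kaplay's `runes()` does; `codeUnitToGraphemeIndex` is a hypothetical helper, not kaplay code.

```typescript
// For every UTF-16 code-unit offset in `text`, records the index of the
// grapheme that contains it.
function codeUnitToGraphemeIndex(text: string): number[] {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  const map = new Array<number>(text.length).fill(0);
  let graphemeIndex = 0;
  for (const { segment, index } of segmenter.segment(text)) {
    // `index` is the segment's starting code-unit offset.
    for (let unit = index; unit < index + segment.length; unit++) {
      map[unit] = graphemeIndex;
    }
    graphemeIndex++;
  }
  return map;
}

// "a", an emoji, "b", a CJK Extension B ideograph, "c"
const offsets = codeUnitToGraphemeIndex("a\u{1F600}b\u{20BB7}c");
```

For this input the seven code units map to five graphemes, so a style recorded at code-unit offset 6 belongs to grapheme 4 rather than grapheme 6.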