Re: tips on writing character proposal

From: Christoph Päper
Date: Thu, 10 Nov 2011 08:50:09 +0100

Mark E. Shoulson:

> But Unicode isn't about encoding what would be neat to encode. It's about encoding _text_, (including things that have been encoded before).

It’s sometimes non-obvious, though, what one should consider as text and what one should not, e.g. mathematical formulae.

I would assume that the following usually indicate good candidates for encoding:

— A glyph composed with (La)TeX commands and frequently asked for in forums.
— Inline pictures about 1em high that are included in the HTML of websites. Pictures added through CSS, not so much.
— Letter-sized glyphs seen in print and manuscripts amongst common characters, e.g. the logographic heart in “I ♥ NY”.
— Characters or character sequences used for their glyphic appearance in short messages (SMS, IM, IRC, …), tweets, status updates or the like.

Concerning the last category: does Unicode need to encode a character BUTTERFLY, because character sequences like ‘Ƹ̵̡Ӝ̵̨̄Ʒ’ (a Roman/IPA–Cyrillic mix) are quite common and popular in certain social groups?
Of course, inline (rotated) ASCII art existed before the advent of Unicode, most notably in the form of emoticons or smileys, but also, for instance, flowers ‘-<-@’ → ⚘ U+2698, hearts ‘<3’ → ♡/♥ U+2661/U+2665, or scissors ‘8<’/‘>8’ → ✂ U+2702.
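
To make the mapping from ASCII art to encoded symbols concrete, here is a minimal Python sketch using the standard `unicodedata` module. The `ASCII_ART` table and the `describe` helper are names made up for this illustration, and picking the black heart U+2665 over the white heart U+2661 for ‘<3’ is an arbitrary choice on my part.

```python
import unicodedata

# Hypothetical lookup table: the ASCII-art sequences mentioned above,
# mapped to the already encoded symbols they stand in for.
ASCII_ART = {
    "-<-@": "\u2698",  # flower
    "<3": "\u2665",    # heart (U+2661 would be the white variant)
    "8<": "\u2702",    # scissors
}

def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

for art, symbol in ASCII_ART.items():
    print(art, "->", symbol, describe(symbol))
```

The same helper can be run over a sequence such as the butterfly above to list which scripts and combining marks it actually draws on.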

Especially in comics, curse words are sometimes not written out explicitly but replaced by a sequence of more or less arbitrary symbols. Should one encode this with existing symbol characters for their glyphic value, e.g. “$#*!”, or would it be better to have a logical character DISGUISED CURSE that might render as an asterisk or bullet point or a random sequence thereof, e.g. “f***”, or that, in smart fonts, selects non-phonographic symbols in an (almost) random manner, e.g. “f♨⚔⚡”?
Received on Thu Nov 10 2011 - 01:55:27 CST
