Re: Definition of character: Exegesis of SC2 nomenclature

From: Martin Kochanski (unicode@cardbox.net)
Date: Wed Jul 10 2002 - 19:33:21 EDT


At 16:16 09/07/02 -0700, Kenneth Whistler wrote:
>
>The *reason* why SC2 chose such a strange and seemingly open-ended
>definition was *not* to invite arbitrarily strange collections of
>data control elements to be encoded as characters, but rather an
>attempt, in a procrustean way, to get the definition to fit the
>reality.

No, this is not Procrustean, it is Lesbian. Procrustes made you fit his bed by stretching you or cutting your feet off; it was the Lesbians who measured straight lines with a rule made of lead so that if the line wavered they could bend the rule to fit the line.

I mention this because Unicode is the opposite of Procrustean. If the members of this mailing list had any sense, they would rejoice in this fact daily. The rest of the world grows ever more rigid and coerces reality into ever stranger shapes - fill in any form and you'll see - but Unicode goes on moulding itself to fit every oddness and protuberance of human writing systems. We could have expected that computers would impose a dreadful ASCII uniformity on everyone - the people would grumble but eventually knuckle under - but in fact the opposite has happened.

There is no finer antidote to gloom and cynicism than leafing through the Unicode Standard. One can celebrate the endless variety of alphabets and the sheer beauty of some of their characters (my current favourite is Sinhala U+0DA3) - but also, and even more, the sheer scholarly effort of codifying some of the scripts (the epic of Tibetan, the still only half-reinvented Mongolian).

All over the world, crumbling scholars have lifted their heads from crumbling tomes long enough to codify alphabets for languages that no-one alive now speaks and no-one may ever understand; and the decisions they have taken will last for as long as the world endures. That, too, is cause for celebration. In what other computing book could you find a phrase such as "In good Latvian typography"?

Next time you want to add some noise to the signal, have a poll for Funniest Cartoon Character (the runner at U+006F U+0F79), Warmest Character (togetherness U+1024), Most Needed Character (all computer users need U+02AD), and Character Most Resembling a Frog (this is left as an exercise for the reader). You could also try Ugliest Alphabet; but perhaps that wouldn't be tactful.

Martin Kochanski.


