Tim Partridge wrote:
> I agree with his point of view that the tags
> should be at the character level and not just
> in the UTF-8 format.
> How about using Escape sequences?

Ugh. The relatively small number of escape sequences at the character
level is what makes Unicode so ATTRACTIVE, esp. to those who currently
use escape sequence based character sets. (Tools to repair broken escape
codes in JIS are almost standard equipment on most Japanese computer
systems.) Not to mention the complexity escape codes add to otherwise
simple and elegant string manipulation functions... processing them can
sometimes bump the algorithmic complexity up by one O() level.
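
To make the O() point concrete, here is a rough Python sketch of "give me
the n-th character" in an escape-based encoding versus a fixed-width one.
The function names are made up for illustration. The first function only
handles the ASCII, JIS X 0201 Roman and JIS X 0208 designations of
ISO-2022-JP, and the second assumes fixed 16-bit big-endian code units
with no surrogate pairs. It is a sketch under those assumptions, not a
real decoder.

```python
ESC = 0x1B  # first byte of every ISO-2022-JP escape sequence


def nth_char_iso2022jp(data: bytes, n: int) -> str:
    """Return the n-th character (0-based) of ISO-2022-JP bytes.

    The byte width of each character depends on the most recent escape
    sequence, so the scan has to start at byte 0 every time: O(n).
    """
    width = 1      # bytes per character in the current mode
    prefix = b""   # escape sequence that selected the current mode
    i = 0
    count = 0
    while i < len(data):
        if data[i] == ESC:
            seq = data[i:i + 3]
            if seq in (b"\x1b(B", b"\x1b(J"):      # ASCII / JIS X 0201 Roman
                width = 1
            elif seq in (b"\x1b$@", b"\x1b$B"):    # JIS X 0208 (1978 / 1983)
                width = 2
            else:
                raise ValueError("unsupported escape sequence")
            prefix = seq
            i += 3
            continue
        if count == n:
            # Re-attach the mode prefix and switch back to ASCII at the end
            # so the chunk is a valid ISO-2022-JP string on its own.
            chunk = prefix + data[i:i + width] + b"\x1b(B"
            return chunk.decode("iso2022_jp")
        i += width
        count += 1
    raise IndexError(n)


def nth_char_fixed16(data: bytes, n: int) -> str:
    """Return the n-th character of fixed-width 16-bit big-endian data.

    Assumes no surrogate pairs, so the offset is plain arithmetic: O(1).
    """
    if n < 0 or 2 * n + 2 > len(data):
        raise IndexError(n)
    return data[2 * n:2 * n + 2].decode("utf-16-be")
```

The escape-based version cannot jump straight to an offset, because the
meaning of a byte depends on every escape sequence that came before it.
That is the extra O() level: a constant-time lookup turns into a linear
scan.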

Put escape codes in at the character level, and Unicode begins to lose
its simplicity and becomes just another mammoth character set that
nobody can or will implement. If I wanted escape sequences, I could
choose from plenty of other character sets that are already out there.

If you want a complicated character system that does tags and
everything, there are plenty to choose from-- Unicode basher Prof. Ken
Sakamura (U. of Tokyo) and Co. would be more than happy to tout the
virtues of TRON, which is loaded with escape sequences galore. The TRON
project has made a religion out of bad-mouthing Unicode, much like the
computer industry has made a religion out of bad-mouthing a certain
software firm in Redmond, Washington (who make a darn fine Unicode based
OS, I might add). The TRON folks have to-- they have to show that the
years of blood, sweat, tears (and most importantly, money) they've spent
making -their- worldwide standard character set have not merely repeated
work that's already here, in use, and better.


Granted, Unicode is complicated, and it will get more complicated. That
is a fact of life, because representing languages is complicated. But I'd hope the
character level stays as simple as possible, for those that need

I do NOT agree that tags should be at the character level.

