Re: Unicode in source code. WHY?

From: G. Adam Stanislav (adam@whizkidtech.net)
Date: Wed Jul 21 1999 - 16:34:29 EDT


On Wed, Jul 21, 1999 at 10:08:14AM -0700, Addison Phillips wrote:
> Clearly text editors (which means programming environments) should support
> all of Unicode.
>
> Identifiers are a thornier issue. Combining marks versus precomposed clearly
> presents a problem in this area.

Why is it a problem? As long as each identifier is represented by the same
sequence of bytes every time it is used, why should a compiler care whether
combining marks or precomposed characters were used? As far as the compiler
is concerned, it is just a unique sequence of bytes.
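
To make that concrete, here is a minimal sketch in Python, assuming the
source file is UTF-8 and the compiler compares identifiers byte for byte.
The identifier name and the toy symbol table are made up for illustration;
no real compiler is being described. It shows that either spelling works
as long as it is used consistently, and that the two spellings only become
interchangeable if something normalizes them first.

```python
import unicodedata

# Two spellings of what looks like the same identifier (hypothetical name).
precomposed = "\u00e9_count"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
combining = "e\u0301_count"    # "e" followed by U+0301 COMBINING ACUTE ACCENT

# A byte-oriented compiler only ever sees the encoded bytes.
pre_bytes = precomposed.encode("utf-8")
comb_bytes = combining.encode("utf-8")

# Toy symbol table keyed on raw bytes (an assumption, for illustration only).
symbols = {pre_bytes: "int"}

def lookup(name_bytes):
    """Return the declared type, or None if this byte sequence is unknown."""
    return symbols.get(name_bytes)

# Same byte sequence every time: the identifier is found.
same_spelling_found = lookup(precomposed.encode("utf-8")) == "int"

# The other spelling is a different byte sequence, so it is a different name.
other_spelling_found = lookup(comb_bytes) is not None

# Only after normalization (here NFC) do the two spellings compare equal.
normalized_equal = unicodedata.normalize("NFC", combining) == precomposed
```

Keying on raw bytes keeps the compiler simple; whether it should also
normalize is a separate design decision.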

It is also not necessary for all text editors to support all of Unicode.
I program with editors that run under a FreeBSD console. By the design of
the underlying hardware (the VGA), they are restricted to a maximum of
256 characters.

I happen to use my console in ISO-8859-2 mode. The editor does not know
that. When I type a Central European character on my keyboard, the VGA
displays it in the editor properly, even though the editor has no idea
what charset I am using. I can easily convert the file into Unicode,
or UTF-8, and back. It would be *nice* if the editor could support all
of Unicode, but the editor is fully useful for my programming needs
as is. The editor is doing the best it can given the limitations of the
environment it is running under.
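
As a rough illustration of the conversion, here is a sketch in Python using
its standard codecs. The sample bytes are an assumption chosen for the
example (the Slovak word "čaj" as an ISO-8859-2 editor would store it), not
taken from any real file.

```python
# Bytes as an 8-bit editor on an ISO-8859-2 console would save them.
# 0xE8 is "č" in ISO-8859-2; the editor itself just sees one opaque byte.
latin2_bytes = b"\xe8aj"

# Convert to UTF-8: decode with the charset actually in use, then re-encode.
text = latin2_bytes.decode("iso-8859-2")
utf8_bytes = text.encode("utf-8")

# And back again, for editing on the console.
round_trip = utf8_bytes.decode("utf-8").encode("iso-8859-2")
```

The round trip is lossless only while the text stays within the characters
ISO-8859-2 can represent.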

Adam


