On Tue, 1 Jul 1997, Markus Kuhn wrote:
> I see already with horror today, that programming language standards
> allow *all* ISO 10646 characters in identifier names. Imagine variable
> names with bidi and combining character content: interoperability
> will be doomed, I will not even be able to print some of the
> procedures. Are linkers supposed to resolve precomposed and
> combining characters in identifiers?
This is definitely a problem. I'm working on an internet-draft
to define a set of normalizations and recommendations so that
for example the precomposed/decomposed problem can be eliminated.
The main target of this is URLs, but ideally, it should be
adopted by all kinds of other identifiers. I hope I can send
out the first version of that draft soon, but I'm really busy.
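
To illustrate the precomposed/decomposed problem, here is a minimal
sketch in Python. It assumes Unicode Normalization Form C as the
normalization; that choice and the function name normalize_identifier
are assumptions for illustration only, not the rules of the draft
mentioned above.

```python
import unicodedata

def normalize_identifier(name: str) -> str:
    """Map canonically equivalent spellings to one form (NFC, an assumed choice)."""
    return unicodedata.normalize("NFC", name)

# Two spellings that render identically but differ code point by code point.
precomposed = "caf\u00e9"   # 'e with acute' as a single precomposed character
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

# A linker or URL comparison working on raw code points sees two names.
print(precomposed == decomposed)

# After normalization both spellings resolve to the same identifier.
print(normalize_identifier(precomposed) == normalize_identifier(decomposed))
```

Anything that compares identifiers (a linker, a URL resolver) would
apply such a step to both sides before comparing.
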
> Few people who just naively
> reference ISO 10646 in their specification have a real clue of
> what problems they might create.
Yes, this is a problem. On the other hand, there are people
who see lots of problems with ISO 10646 where there are none,
or not that many, or where the problems lie elsewhere. For an
example, see the recent language tagging discussion.
> ISO 15646¹ would be a standard
> that can be used the same way as ASCII without creating these
> additional semantic interoperability hazards that ISO 10646
> promises today.
No. For the applications you intend it for (fixed-cell
display systems), there are several other scripts that are
well suited (e.g. Georgian, Armenian, ...). With only a very small
extension (i.e. having both single-width and double-width
display cells), you can integrate all of CJK. Mule, which
is unfortunately still too much hooked to a fixed-cell display
system, even managed to do Arabic and Devanagari, in readable
quality. Every limitation like the one you propose will lead
to cutting off such things even where they could easily be done.
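
The single-width/double-width idea can be sketched as follows. This is
only an illustration: it uses the East Asian Width property from
Python's unicodedata module as the width rule, and the names cell_width
and display_width are hypothetical, not taken from Mule or any standard.

```python
import unicodedata

def cell_width(ch: str) -> int:
    """Number of fixed-width display cells one character occupies (assumed rule)."""
    if unicodedata.combining(ch):
        return 0  # combining marks are drawn in the cell of their base character
    # Wide (W) and Fullwidth (F) characters take two cells, everything else one.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def display_width(text: str) -> int:
    """Total number of cells needed to show text on a fixed-cell display."""
    return sum(cell_width(ch) for ch in text)

print(display_width("abc"))                         # Latin: one cell each
print(display_width("\u0570\u0561\u0575"))          # Armenian: one cell each
print(display_width("\u6f22\u5b57"))                # CJK: two cells each
```

Scripts that need contextual shaping, such as Arabic and Devanagari,
are not covered by this simple per-character rule.
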