Re: Official ISO 3166 country codes online

From: John Cowan
Date: Thu Dec 02 1999 - 17:26:10 EST

Disclaimer: I am not Chinese.

Mark Crispin wrote:

> I know that Cantonese is written differently than Mandarin, but I'm not sure
> how many of the other dialects have unique written languages.

It's not a matter of unique written languages; it's a matter of
a small number of language-specific non-standard characters.
These come up mostly in places like transcriptions of song lyrics:
most written texts are in Standard Chinese (Mandarin) only.

> It may also depend upon the script. For example, maybe a Chinese dialect is
> identical to Mandarin using hanzi, but is different in romanization.

For sure.

> Certainly "zh-cn" vs. "zh-tw" is commonly used to distinguish between
> simplified and traditional; an obvious misuse but a convenient one. I guess
> though that with Unicode, this crutch isn't needed since I doubt that there's
> much (if any) ambiguity between simplified and traditional in Unicode text.

There is no unification of simplified characters with traditional ones.
Of course, if a character has no simplified form, it has only one Unicode codepoint.
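
A minimal sketch of the point above, assuming Python 3. The sample
characters are my own illustrative picks, not taken from this thread:
two traditional/simplified pairs and one character that has no
separate simplified form. The helper names are hypothetical.

```python
# Illustrative pairs chosen for this sketch (an assumption, not from the thread).
PAIRS = [
    ("國", "国"),  # "country": traditional vs. simplified form
    ("語", "语"),  # "language": traditional vs. simplified form
    ("人", "人"),  # "person": no separate simplified form
]


def codepoint(ch: str) -> str:
    """Format a single character as U+XXXX."""
    return f"U+{ord(ch):04X}"


def describe(traditional: str, simplified: str) -> str:
    """Report whether the two forms share one code point or have two."""
    if traditional == simplified:
        return f"{traditional}: single code point {codepoint(traditional)}"
    return (
        f"{traditional} {codepoint(traditional)} / "
        f"{simplified} {codepoint(simplified)}: distinct code points"
    )


for trad, simp in PAIRS:
    print(describe(trad, simp))
```

The first two pairs print two different code points each; the last
prints one, since the same character serves both writing conventions.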

> Is it the case that this document, in the absence of language tags,
> can be converted to Unicode unambiguously and display identically?


> Can the
> document, in the absence of language tags, be converted back to ISO-2022-CN
> and still display identically (I'm not saying that file would be the same)?



Schlingt dreifach einen Kreis vom dies! || John Cowan
Schliesst euer Aug vor heiliger Schau,  ||
Denn er genoss vom Honig-Tau,           ||
Und trank die Milch vom Paradies.       -- Coleridge (tr. Politzer)
