Re: Official ISO 3166 country codes online

From: John Cowan (jcowan@reutershealth.com)
Date: Thu Dec 02 1999 - 17:26:10 EST


Disclaimer: I am not Chinese.

Mark Crispin wrote:

> I know that Cantonese is written differently than Mandarin, but I'm not sure
> how many of the other dialects have unique written languages.

It's not a matter of unique written languages; it's a matter of
a small number of language-specific non-standard characters.
These come up mostly in places like transcriptions of song lyrics;
most written texts are in Standard Chinese (Mandarin) only.

> It may also depend upon the script. For example, maybe a Chinese dialect is
> identical to Mandarin using hanzi, but is different in romanization.

For sure.

> Certainly "zh-cn" vs. "zh-tw" is commonly used to distinguish between
> simplified and traditional; an obvious misuse but a convenient one. I guess
> though that with Unicode, this crutch isn't needed since I doubt that there's
> much (if any) ambiguity between simplified and traditional in Unicode text.

There is no unification of simplified characters with traditional ones.
Of course, if a character has no simplified form, it has only one Unicode codepoint.
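
As a quick illustration, here is a minimal Python 3 sketch. The sample
pairs and the helper name `codepoint` are chosen here purely for
illustration; any simplified/traditional pair would do.

```python
# Each simplified character and its traditional counterpart occupy
# separate Unicode code points (sample pairs picked for illustration).
pairs = [("国", "國"), ("汉", "漢"), ("马", "馬")]


def codepoint(ch):
    """Format a single character as U+XXXX (hypothetical helper)."""
    return f"U+{ord(ch):04X}"


for simplified, traditional in pairs:
    print(simplified, codepoint(simplified), traditional, codepoint(traditional))

# A character with no distinct simplified form has just one code point.
print("人", codepoint("人"))
```

So plain Unicode text already records which form was written, without
needing a "zh-cn" or "zh-tw" tag to tell the two apart.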

> Is it the case that this document, in the absence of language tags,
> can be converted to Unicode unambiguously and display identically?

Yes.

> Can the
> document, in the absence of language tags, be converted back to ISO-2022-CN
> and still display identically (I'm not saying that file would be the same)?

Yes.
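
The round trip can be sketched in Python 3, with one loud assumption:
as far as I know the standard library ships no ISO-2022-CN codec, so
this uses the `gb2312` and `big5` codecs as stand-ins for legacy
simplified and traditional encodings. The sample strings and the
`round_trip` helper are likewise illustrative only.

```python
# Stand-in codecs (assumption): GB2312 for simplified text,
# Big5 for traditional text, in place of ISO-2022-CN.
samples = {"gb2312": "汉字", "big5": "漢字"}


def round_trip(text, codec):
    """Legacy bytes -> Unicode -> legacy bytes; return both byte strings."""
    original = text.encode(codec)
    unicode_text = original.decode(codec)   # convert to Unicode
    return original, unicode_text.encode(codec)  # and back again


for codec, text in samples.items():
    before, after = round_trip(text, codec)
    print(codec, "round trip preserved bytes:", before == after)
```

No language tag is consulted at any step; the characters alone carry
the information.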

-- 

Schlingt dreifach einen Kreis vom dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)


