Re: Transcriptions of "Unicode"

From: Erik van der Poel
Date: Wed Dec 06 2000 - 19:44:27 EST

James Kass wrote:
> Erik van der Poel wrote:
> >
> > The font selection is indeed somewhat haphazard for CJK when there are
> > no LANG attributes and the charset doesn't tell us anything either, but
> > then, what do you expect in that situation anyway? I suppose we could
> > deduce that the language is Japanese for Hiragana and Katakana, but what
> > should we do about ideographs? Don't tell me the browser has to start
> > guessing the language for those characters. I've had enough of the
> > guessing game. We have been doing it for charsets for years, and it has
> > led to trouble that we can't back out of now. I think we need to draw
> > the line here, and tell Web page authors to mark their pages with LANG
> > attributes or with particular fonts, preferably in style sheets.
>
> A Universal Character Set should not require mark-up/tags.
> If the Japanese version of a Chinese character looks different
> than the Chinese character, it *is* different. In many cases,
> "variant" does not mean "same".

I was referring to the CJK Unified Ideographs in the range U+4E00 to
U+9FA5. I agree that those codes do not *require* mark-up/tags, but if
the author wishes to have them displayed with a "Japanese font", then
they must indicate the language or specify the font directly. The latter
may be problematic. I don't think it's reasonable to expect a browser to
apply various heuristics to determine the language.
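
To make that position concrete, here is a minimal Python sketch of the
selection rule described in this thread. It is an illustration only, not
any browser's actual code: the function name `font_language`, the
`None`-means-undecided convention, and the exact kana block boundaries
are assumptions made for the example. An explicit LANG value wins,
Hiragana and Katakana are taken to imply Japanese, and the unified
ideographs in U+4E00 to U+9FA5 are deliberately left undecided rather
than guessed.

```python
from typing import Optional

# Code point ranges (half-open). The kana block boundaries are an
# assumption for illustration; the ideograph range is the one named above.
HIRAGANA = range(0x3040, 0x30A0)
KATAKANA = range(0x30A0, 0x3100)
UNIFIED_IDEOGRAPHS = range(0x4E00, 0x9FA6)  # U+4E00..U+9FA5 inclusive


def font_language(ch: str, lang: Optional[str] = None) -> Optional[str]:
    """Return a language tag to drive font choice, or None if undecided.

    Hypothetical helper: `lang` stands for the value of a LANG attribute,
    if the author supplied one.
    """
    if lang:
        # The author marked the text, so no deduction is needed.
        return lang
    cp = ord(ch)
    if cp in HIRAGANA or cp in KATAKANA:
        # Kana strongly imply Japanese, so this deduction is fairly safe.
        return "ja"
    if cp in UNIFIED_IDEOGRAPHS:
        # Shared by Chinese, Japanese and Korean: refuse to guess.
        return None
    # Anything else is outside the scope of this sketch.
    return None
```

With this rule, `font_language("漢")` stays undecided, while
`font_language("漢", "ja")` yields `"ja"`, which is exactly why the
author has to supply the language.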

> When limited to BMP code points, CJK unification kind of made
> sense. In light of the new additional planes...
> The IRG seems to be doing a fine job.

Somehow I get the impression that you have more to say, but you just
aren't saying it. Cough it up already. :-)


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:17 EDT