Re: [unicode] CJK variation modifier

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon May 21 2007 - 16:42:35 CDT

Next message: Philippe Verdy: "RE: [unicode] CJK variation modifier"

Previous message: Peter Constable: "RE: Order of Infrequent Combining Marks in Thai"
Maybe in reply to: mpsuzuki@hiroshima-u.ac.jp: "Re: [unicode] CJK variation modifier"

Garrit Sangel said:

> I don't really have as much background knowledge as you have, I just made up
> my mind about some problems which I encountered using plain text files. So I
> can't really compete with your knowledge, I hope this won't be too bad. Maybe
> just see this as a perspective from an ordinary user.

Sure. All questions welcome. :-)

mpsuzuki suggested:

>
> On Monday 21 May 2007 11:50 you wrote:
> > I guess what you want was once proposed as
> > "language tagging".
> > http://unicode.org/faq/languagetagging.html
> > http://www.unicode.org/reports/tr7/tr7-4.html
> > It was obsoleted, because the language specification in
> > plain Unicode text will conflict with higher level
> > language specifications in XML, HTML etc. ISO-2022
> > encoding may be a better solution for such a requirement.

I concur with Suzuki-san's assessment that language tag
characters in Unicode plain text are not appropriate. In
part that is because of the conflict with language tagging
in structured text formats, but also because the language
tag mechanism was intended from the first for short string
tagging in limited protocols, rather than being
a replacement for general language tagging in structured text.
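
For readers who have not seen the mechanism, here is a minimal
sketch of what language tag characters look like in a string. It
assumes only the published code point assignments: U+E0001
LANGUAGE TAG, followed by tag characters that mirror printable
ASCII at an offset of 0xE0000, with U+E007F as CANCEL TAG. The
helper names make_language_tag and strip_tags are hypothetical and
exist only for this illustration; they are not part of any library.

```python
# Sketch only: illustrates the invisible language tag characters
# discussed above. Helper names are made up for this example.

LANGUAGE_TAG = "\U000E0001"              # U+E0001 LANGUAGE TAG
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007F   # tag characters, up to CANCEL TAG


def make_language_tag(lang: str) -> str:
    """Spell a printable-ASCII language code using tag characters."""
    if not all(0x20 <= ord(c) <= 0x7E for c in lang):
        raise ValueError("language code must be printable ASCII")
    # Each tag character is its ASCII counterpart shifted by 0xE0000.
    return LANGUAGE_TAG + "".join(chr(0xE0000 + ord(c)) for c in lang)


def strip_tags(text: str) -> str:
    """Remove LANGUAGE TAG and every tag character from the text."""
    return "".join(
        c for c in text
        if c != LANGUAGE_TAG and not (TAG_FIRST <= ord(c) <= TAG_LAST)
    )


# One visible ideograph (U+9AA8) preceded by a hidden "ja" tag.
tagged = make_language_tag("ja") + "\u9aa8"
print(len(tagged))                           # 4 code points, 1 visible
print([f"U+{ord(c):04X}" for c in tagged])
print(strip_tags(tagged) == "\u9aa8")        # True
```

The point of the sketch is that three of the four code points are
invisible to a reader, and any process that does not know about
them will either pass them along blindly or drop them silently.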

I disagree, however, with the suggestion that ISO-2022
encoding would be a better solution. ISO-2022 encoding is
very little supported these days outside of some specific
mail contexts, and is not a realistic alternative to use
of Unicode for multilingual text.

> Hm, yes, but, as I said, you don't always use XML or HTML.

And even if you are, language tagging is not always accurate,
nor is it a failsafe guide to font choice in any case.

> For example, if
> you are using mixed characters in plain text files, you have a bit of a
> problem to let the editor know, when to use which font.

If you are making explicit font choices, you are already
making use of structured text. And it seems to me that if
an author has particular font choices in mind, they should
just indicate them directly. Why would one think that embedding
a hidden, invisible control code would do a better job of
that than an explicit editing instruction of the sort:

{Please set the second Chinese example in the following
paragraph in XXXX traditional Chinese font, as I am making
a point about difference in display from the YYYY Japanese
font otherwise used for the Chinese examples.}

And so on. As an editor myself, I know that I far prefer
explicit, *visible* instructions to hidden tricks that
tend not to work when moving from one platform to another.

Also keep in mind John Jenkins' advice earlier in this thread.
Typography in Japan or China tends not to switch back and forth
between fonts when switching languages, except in very specialized
works. Chinese names, places, phrases are printed with Japanese
fonts when cited in Japanese text. Japanese names, places, phrases
are printed with Chinese fonts when cited in Chinese text. Basically
you don't *want* to be switching back and forth between faces
in such cases, as it makes the typography look inconsistent
and unpleasant.

--Ken


This archive was generated by hypermail 2.1.5 : Mon May 21 2007 - 16:45:36 CDT