Re: [unicode] CJK variation modifier

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon May 21 2007 - 16:42:35 CDT

    Garrit Sangel said:

    > I don't really have as much background knowledge as you have, I just made up
    > my mind about some problems which I encountered using plain text files. So I
    > can't really compete with your knowledge, I hope this won't be too bad. Maybe
    > just see this as a perspective from an ordinary user.

    Sure. All questions welcome. :-)

    mpsuzuki suggested:

    >
    > On Monday, 21 May 2007 at 11:50, you wrote:
    > > I guess what you want was once proposed as
    > > "language tagging".
    > > http://unicode.org/faq/languagetagging.html
    > > http://www.unicode.org/reports/tr7/tr7-4.html
    > > It was obsoleted, because the language specification in
    > > plain Unicode text will conflict with higher level
    > > language specifications in XML, HTML etc. ISO-2022
    > > encoding may be a better solution for such a requirement.

    I concur with Suzuki-san's assessment that language tag
    characters in Unicode plain text are not appropriate. In
    part that is because of the conflict with language tagging
    in structured text formats, but also because the language
    tag mechanism was intended from the first for short string
    tagging in limited protocols, rather than being
    a replacement for general language tagging in structured text.
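
    For readers who have not met the mechanism, here is a minimal
    sketch of what a language tag looks like in plain text. It assumes
    only the Tags block layout (U+E0001 LANGUAGE TAG followed by tag
    characters that mirror printable ASCII at U+E0020..U+E007E). The
    helper names make_language_tag and strip_tags are made up for
    illustration and come from no library.

```python
# Sketch only: helper names are illustrative, not from any standard API.
LANGUAGE_TAG = "\U000E0001"          # U+E0001 LANGUAGE TAG
TAG_BLOCK = range(0xE0000, 0xE0080)  # the Tags block, U+E0000..U+E007F


def make_language_tag(code: str) -> str:
    """Spell an ASCII language code such as 'ja' in invisible tag characters."""
    if not code.isascii() or not code.isprintable():
        raise ValueError("language code must be printable ASCII")
    return LANGUAGE_TAG + "".join(chr(0xE0000 + ord(c)) for c in code)


def strip_tags(text: str) -> str:
    """Remove every character from the Tags block, leaving the visible text."""
    return "".join(c for c in text if ord(c) not in TAG_BLOCK)


tagged = make_language_tag("ja") + "\u9AA8"  # U+9AA8, a unified CJK ideograph
print(len(tagged))         # 4 code points: introducer, 'j', 'a', the ideograph
print(strip_tags(tagged))  # only the ideograph remains
```

    The three extra code points are invisible in most editors, which
    is why a process that does not know about them can silently carry
    them along or drop them.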

    I disagree, however, with the suggestion that ISO-2022
    encoding would be a better solution. ISO-2022 encoding is
    very little supported these days outside of some specific
    mail contexts, and is not a realistic alternative to use
    of Unicode for multilingual text.

    > Hm, yes, but, as I said, you don't always use XML or HTML.

    And even if you do, language tagging is not always accurate,
    nor is it a failsafe guide to font choice in any case.

    > For example, if
    > you are using mixed characters in plain text files, you have a bit of a
    > problem letting the editor know when to use which font.

    If you are making explicit font choices, you are already
    making use of structured text. And it seems to me that if
    an author has particular font choices in mind, they should
    just indicate them directly. Why would one think that embedding
    a hidden, invisible control code would do a better job of
    that than an explicit editing instruction of the sort:

    {Please set the second Chinese example in the following
    paragraph in XXXX traditional Chinese font, as I am making
    a point about difference in display from the YYYY Japanese
    font otherwise used for the Chinese examples.}

    And so on. As an editor myself, I know that I far prefer
    explicit, *visible* instructions to hidden tricks that
    tend not to work when moving from one platform to another.

    Also keep in mind John Jenkins' advice earlier in this thread.
    Typography in Japan or China tends not to switch back and forth
    between fonts when switching languages, except in very specialized
    works. Chinese names, places, phrases are printed with Japanese
    fonts when cited in Japanese text. Japanese names, places, phrases
    are printed with Chinese fonts when cited in Chinese text. Basically
    you don't *want* to be switching back and forth between faces
    in such cases, as it makes the typography look inconsistent
    and unpleasant.

    --Ken


