Re: ISO 15924: zh-Hani for general Chinese (was: Different Arabic scripts?)

From: Philippe Verdy
Date: Sun Nov 27 2005 - 10:01:59 CST

    From: "Tom Emerson"
    > Philippe Verdy writes:
    >> It is not written, it is however a Chinese standard, and the most likely
    >> to occur. It does not change my argument however, whichever romanization
    >> system is used, it is still a distinction from the Han (any script)
    >> writing system, and "Latn" indicates such romanization.
    > I agree with your core argument, but still disagree that any
    > particular romanization system can or should be inferred from the
    > language/script indication.

    Fundamentally yes, but in practice no: the romanization system is almost
    never stated explicitly. Instead, writers assume that readers know of a
    widely used common system for a particular language-script combination.
    The same assumption is made every day for Chinese written in Hant or Hans,
    and even for English written in Latin: when someone deviates from the
    widely used orthography to create a "phonetic" or "simplified" system, the
    result is badly received, as if it were a distinct language and not the
    "standard" language.

    See for example the debates about SMS-like abbreviations: even if they are
    often used on mobile phones and in chats on the net, this is mostly because
    of technical constraints. A large majority of users would be happier with
    a fast input method that let them compose or receive those abbreviations
    while still reading the normal language. Voice recognition systems might
    help here on mobile phones, where it is quite awkward to compose words and
    punctuation with only 9 keys, plus one to switch between various and
    really non-intuitive input modes. (This issue in English looks very
    similar to the one Chinese or Japanese users have to deal with when
    composing ideographs, even on their extended PC keyboards.)

    So a standard does exist and, in my opinion, when the romanization system
    in use is not specified, this means *most of the time* the standard system
    adapted to a target language, based on an agreed common set of Latin
    characters used in that target language.
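
    To make that defaulting rule concrete, here is a rough sketch and not a
    definitive implementation. It assumes that the Chinese standard discussed
    above is Hanyu Pinyin, and that an explicit system, when given, appears as
    a variant subtag such as "pinyin" or "wadegile" (both registered for
    zh-Latn in the IANA Language Subtag Registry). The table
    DEFAULT_ROMANIZATION and both function names are hypothetical; the
    defaults are a matter of convention, not of any standard.

```python
# Hypothetical table of conventional defaults per (language, script) pair.
# Only one entry is given; it assumes the Chinese standard meant is Pinyin.
DEFAULT_ROMANIZATION = {
    ("zh", "Latn"): "pinyin",
}


def split_tag(tag):
    """Split a tag such as 'zh-Latn-pinyin' into (language, script, variants).

    Simplified on purpose: the script is a 4-letter subtag right after the
    language, and variants are any later subtags of 5 to 8 characters.
    Region subtags are ignored, as are 4-character digit-led variants.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    rest = parts[1:]
    script = None
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        script = rest[0].title()
        rest = rest[1:]
    variants = [p.lower() for p in rest if 5 <= len(p) <= 8]
    return language, script, variants


def implied_romanization(tag):
    """Return the explicit romanization if tagged, else the assumed default.

    Returns None when the tag is not in Latin script or when no
    conventional default is known for the language.
    """
    language, script, variants = split_tag(tag)
    if script != "Latn":
        return None
    if variants:
        return variants[0]
    return DEFAULT_ROMANIZATION.get((language, script))
```

    With this sketch, "zh-Latn" falls back to the assumed default, while
    "zh-Latn-wadegile" keeps the system it names explicitly. A tag with no
    entry in the table, such as "az-Latn", yields no romanization name at all.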

    If native speakers use the romanization system to write their own language
    directly (not in a translation), it converges rapidly to a common, widely
    used standard form (see for example the rapid standardization of Azeri in
    Latin).

    Most romanization systems only diverge for *imports* of some words from a
    foreign language. There's no international standard here because the system
    results from a best match between the source and target language-script
    combinations, and not only from the source one.
