Re: Digit/letter variants in the "same" unified script (was: stability policy on numeric type = decimal)

From: Mark Davis ☕ (mark@macchiato.com)
Date: Thu Jul 29 2010 - 13:19:28 CDT

    That really isn't a script issue; it is more a question of which language
    orthographies use which characters, and CLDR already has provision for
    that information.

    Mark

    *— Il meglio è l’inimico del bene —*

    On Thu, Jul 29, 2010 at 09:07, Philippe Verdy <verdy_p@wanadoo.fr> wrote:

    > "Mark Davis ☕" <mark@macchiato.com>
    > > It is not so strange. Read
    > > http://www.unicode.org/reports/tr24/proposed.html#Multiple_Script_Values
    > > and other parts of #24 describing Common.
    >
    > It is exactly because I had read this proposed update for UAX #24 that
    > I made my argument (otherwise I would not have mentioned the
    > ExtendedScript property in my report: isn't it meant to allow more
    > precise mappings to ISO 15924, including script variants?).
    >
    > Nothing would be special about "Common": "sc=Arabic" (alias "sc=Arab")
    > could use the same formalism (also used for "Hani" and "Jpan", which
    > are defined as multiple scripts or script variants) to subdivide it
    > with the new "extended script" property.
    >
    > It's true that, for now, Unicode cannot distinguish "Hans" from "Hant"
    > on the basis of the encoded abstract characters alone (so for them
    > we have only "sc=Hani", but an "extended script" property could make
    > more precise mappings without being completely bound by the stability
    > policy).
    >
    > But that does not mean that texts and localization resources can't make
    > such distinctions through external tagging, stylesheets, or
    > romanization schemes. Librarians (and book readers) already
    > distinguish between the Eastern and Western versions of the
    > unified Arabic script.
    >
    > It could even be of benefit within IDNA, to help diagnose those digits
    > that have confusable forms in the two variants (even though work is
    > in progress on defining the confusables needed for IDNA), and adding
    > the extra ISO 15924 codes (for Arabic variants) won't break Unicode
    > (after all, there are already variants for Latin and sinograms, exactly
    > because of these "font variants").
    >


