Re: Changing UCA primary weights (bad idea)

From: Michael Everson (everson@evertype.com)
Date: Fri Jul 09 2004 - 15:25:25 CDT

    Mark, your examples are all of the
    run-of-the-mill Scandinavian variety. Trotting
    out Polish and Danish doesn't address the issue.
    The issue is all the phonetic characters, and
    all the African ones (for instance).

    > > 1) it destabilizes the default tailorable template of ISO/IEC 14651
    > > and the UCA which has been published for some time. Anyone who *has*
    > > tailored it would have to do all that work all over again.
    >
    >You are certainly right that this is not a slam-dunk;

    This noun must have been on TV a lot in the US
    recently; I have seen it a lot but it remains
    obscure, apart from being a basketball reference.
    What does it mean? That I am right that the
    proposal is not a shoo-in? Or, indeed, that I am
    right that it is not a foregone conclusion that
    the proposal will be accepted?

    >there are reasons for
    >and against it. And it may well be that the committee decides against it.

    There are two templates, which are synchronized,
    and are decided on by two committees.

    >What we actually did was to put similar letters
    >near other letters, *and if their decompositions
    >were the same* we interfiled them.

    I remember. I was on the committee that helped to decide these things.

    >There is, however, little principled difference
    >between , , , , , ?, and that would
    >cause a user to think that some should be
    >interfiled and some should not. In some
    >languages these would be seen as "separate
    >letters" (e.g. with different primary weights)
    >and in others not; but that does not line up in
    >any particular way with what is in the UCA. (see
    >also comment below).

    Those aren't the ones I'm worried about, and they
    are not much of a problem. We had principles for
    determining "basic letters" and those are what we
    used; what I see now is a proposal to change that.

    >See http://www.unicode.org/charts/collation/chart_Latin.html for many other
    >cases.

    Please do. Do you really want all those letters
    between "e" and "f" interfiled with "e"? I surely
    do not.

    > > 3) in discussions elsewhere, Mark has talked about what "most users"
    > > "expect" and I found his suggestion to be anglocentric and
    > > unsubstantiated.
    >
    >And I will refrain from saying what I think of your reasoning ability in
    >general, although circularity seems to be a particular specialty.

    Sweet of you to say.

    >I suggest that we stick to the facts instead of ad hominem attacks.

    Calling a thing "ad hominem" doesn't make it ad
    hominem. It is your suggestion which I
    criticized, because it seems very A-to-Z and
    alien to the principles which have been in the
    template until now.

    >For user expectations, check out how foreign words with unusual accents are
    >sorted in a variety of languages. I have seen no reason to believe that
    >Germans or French or others behave much differently when faced with a letter
    >like ø that is not one that they use. The key is whether they would expect
    >to see:
    >
    >a) Interleaved:
    >..oa..
    >..øb..
    >..oz..

    You can tailor for this now.

    >b) Separate but near:
    >..oz..
    >..øb..
    >..pa..

    This is what we have now.

    >c) Like a particular language (Danish)
    >..yb..
    >..øb..

    You can tailor for this now.

    My point is made here. It is really only in
    initial position where this is likely to be
    noticed. What I want is the status quo, however.
    Leave the template and its principles alone.
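
The three orderings quoted above can be illustrated with a toy two-level sort key. This is only a sketch under stated assumptions: the weights are invented for illustration and are not taken from the UCA or ISO/IEC 14651 tables, the letter ø is assumed as the example non-A-Z letter, and the names `make_table`, `sort_key` and `collate` are hypothetical helpers rather than any library's API.

```python
import string

# Toy base table (an assumption, not real UCA data): primary weights
# 10, 20, ... for a-z, with gaps so extra letters can be slotted in.
# Each entry is (primary weight, secondary weight).
BASE = {ch: (10 * (i + 1), 0) for i, ch in enumerate(string.ascii_lowercase)}

O_PRIMARY = BASE["o"][0]
Z_PRIMARY = BASE["z"][0]


def make_table(extra):
    """Return a copy of the base table with extra letters added."""
    table = dict(BASE)
    table.update(extra)
    return table


def sort_key(word, table):
    """Compare all primary weights first, then all secondary weights."""
    primaries = tuple(table[ch][0] for ch in word)
    secondaries = tuple(table[ch][1] for ch in word)
    return (primaries, secondaries)


def collate(words, table):
    return sorted(words, key=lambda w: sort_key(w, table))


# a) Interleaved: same primary weight as "o", different secondary weight.
interleaved = make_table({"ø": (O_PRIMARY, 1)})

# b) Separate but near: its own primary weight between "o" and "p".
separate_near = make_table({"ø": (O_PRIMARY + 5, 0)})

# c) Language-specific (Danish-like): its own primary weight after "z".
danish_like = make_table({"ø": (Z_PRIMARY + 10, 0)})

WORDS = ["pa", "øb", "oz", "yb", "oa", "za"]

if __name__ == "__main__":
    for name, table in [("interleaved", interleaved),
                        ("separate but near", separate_near),
                        ("Danish-like", danish_like)]:
        print(name, collate(WORDS, table))
```

In this sketch the only thing that changes between the three behaviours is the weight assigned to one letter, which is the sense in which a default template can be tailored without the template itself changing.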

    >a) Interleaved:
    >..oa..
    >..öb..
    >..oz..

    This is what we have now.

    >b) Separate but near:
    >..oz..
    >..öb..
    >..pa..

    You can tailor for this now.

    >c) Like a particular language (Swedish or Phonebook German)
    >..yb..
    >..öb..
    >
    >..od..
    >..öz..
    >..of..

    You can tailor for this now.

    >More accurately, you believe that the correct behavior occurs.

    It is correct for most of the letters which would
    be affected by the change you propose. The
    overwhelming majority of the
    letters-without-diacritics which occur between
    the "main A-Z letters" are correctly filed that
    way, and would be incorrectly filed if interfiled
    with the "main" letters. Is there a discomfort in
    what happens between /? Well, that's an
    anomaly, right enough, but it is well-known and
    can easily be tailored for anyone worried about
    it. Lumping all the Engs with N or all the Schwas
    with E, however, would have only the effect of
    making a working template cease to work for the
    people who really need those letters: linguists,
    speakers of African languages, and so on. The
    only people who use the sideways "o" and the top-
    and bottom-half "o" are Uralic linguists, and the
    template works correctly for them, at least for
    those letters.

    > > 5) if Mark wants to make a tailoring to interfile all these letters
    > > (which can only result in what I describe as "visual seasickness" to
    > > any poor users who have to actually read such wordlists.
    >
    >Again, no evidence.

    It was argued years ago in TC304 and WG20. I'm
    disheartened to have to reopen the arguments now,
    particularly as it affects stability and you
    yourself have been a champion for stability.

    >Let's look at a particular example, letters based on
    >"O". UCA *already* interleaves the list below (UCA O List). Adding John's
    >list to that would add only the two elements:

    John's list?

    > > 6) the Latin alphabet has a lot more than 26 letters in it. In this
    > > age of the Universal Character Set, "most users" would do better to
    > > get used to this than to be hobbled by older concepts.
    >
    >I agree with the general principle, but it has
    >no bearing on the topic at hand.

    It is the key to the principles which are in the template now.

    -- 
    Michael Everson * Everson Typography * http://www.evertype.com
    

