Re: Medievalist ligature character in the PUA

From: Julian Bradfield (jcb+unicode@inf.ed.ac.uk)
Date: Tue Dec 15 2009 - 04:31:56 CST

    On 2009-12-14, Michael Everson <everson@evertype.com> wrote:
    > On 14 Dec 2009, at 20:56, Julian Bradfield wrote:
    >>[...]
    > Evidently I was not using [identify] in a technical sense.

    The technical sense is also the normal English sense. Things are
    "identical" if they're exactly the same.

    >> What you presumably mean is "the space in which filenames live
    >> *ought* to be the set of utf-8 strings quotiented by canonical
    >> equivalence" (so that two canonically equivalent strings are
    >> representatives of one and the same filename).
    >
    > No, that's not what I meant.
    >
    > I meant that é 00E9 and é 0065 0301 are the same platonic entity (acute
    > e) in an intrinsic sense, whereas both are different from a Cyrillic
    > lookalike, е́ 0435 0301.
    >
    > *That* kind of identity.

    How does what you said differ from what I said, except that I said it
    precisely? Your "platonic entity" is my "equivalence
    class of UTF-8 strings under canonical equivalence". That defines an
    identity on the "platonic entities", NOT on the UTF-8 strings.
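
    To make the distinction concrete, here is a minimal Python sketch,
    assuming only the standard unicodedata module. The helper name
    canonically_equivalent is hypothetical, chosen for illustration. It
    compares strings by their NFD forms, so it tests membership of the
    same equivalence class rather than identity of the strings themselves:

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Return True if a and b are in the same canonical equivalence class.

    Hypothetical helper: two strings are canonically equivalent exactly
    when their NFD (canonical decomposition) forms are equal.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


precomposed = "\u00e9"        # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"        # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT
cyrillic = "\u0435\u0301"     # CYRILLIC SMALL LETTER IE + COMBINING ACUTE ACCENT

# As strings (and as UTF-8 byte sequences) the first two are different.
print(precomposed == decomposed)
print(precomposed.encode("utf-8") == decomposed.encode("utf-8"))

# As representatives of an equivalence class they are the same,
# and both differ from the Cyrillic lookalike.
print(canonically_equivalent(precomposed, decomposed))
print(canonically_equivalent(precomposed, cyrillic))
```

    The plain == comparison is the identity on strings; the helper is the
    identity on the classes.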

    As Asmus has pointed out, the question then is, do you ask users to
    understand this, and magically know that two apparently different
    strings are actually the same?
    If they're Windows users, they're used to this, because of the mess
    with case of filenames in FAT, but if they're Unix users, they're not
    at all used to it.
    On the other hand, the complexities of dealing with Unicode
    equivalence are in a whole different league from dealing with simple
    case collapsing.
    I don't know what the right answer is - except to agree that it ought
    to be possible for a file system to be marked as only allowing UTF-8
    filenames, in some normalized form.
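
    As a rough sketch of what such a check might look like, assuming NFC
    as the chosen form (the choice of form and the function name
    is_acceptable_filename are my assumptions, not anything a real file
    system is known to implement):

```python
import unicodedata


def is_acceptable_filename(raw: bytes, form: str = "NFC") -> bool:
    """Hypothetical check: accept only well-formed UTF-8 already in `form`.

    `form` is one of the names accepted by unicodedata.normalize
    ("NFC", "NFD", "NFKC", "NFKD"); NFC is an assumed default.
    """
    try:
        name = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not UTF-8 at all, e.g. a Latin-1 encoded name.
        return False
    # Normalized strings are left unchanged by normalization.
    return unicodedata.normalize(form, name) == name


print(is_acceptable_filename(b"caf\xc3\xa9"))          # precomposed e-acute
print(is_acceptable_filename(b"cafe\xcc\x81"))         # e + combining acute
print(is_acceptable_filename(b"cafe\xcc\x81", "NFD"))  # same bytes, NFD policy
print(is_acceptable_filename(b"caf\xe9"))              # Latin-1 bytes
```

    Rejecting unnormalized names, rather than silently rewriting them,
    keeps the bytes the user wrote and the bytes stored the same.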

    -- 
    The University of Edinburgh is a charitable body, registered in
    Scotland, with registration number SC005336.