Re: What is the principle?

From: jameskass@att.net
Date: Sun Mar 28 2004 - 21:12:14 EST


    Asmus Freytag wrote,

    > While applications predating VSs have no choice but to treat them as what
    > they are (in that context) i.e. unassigned characters, applications of later
    > date have no business treating unapproved VS sequences as unassigned
    > *characters*.
    >
    > The intent of VSs is to mark a difference that falls below the distinction
    > between separately encoded characters. Therefore I would expect that by
    > default all VS characters are ignored in a full-blown collation
    > implementation, leaving open the choice of supporting, say, a fourth-level
    > difference between specific known variation sequences.
    >
    > They are also best ignored in any kind of identifier or name matching, as
    > otherwise the presence of invisible characters can change the lookup--with
    > all the consequences for spoofing and security.

    What you're saying makes perfect sense for purposes of forward
    compatibility. Thanks to both you and Ernest Cline for pointing
    this out.

    I'd prefer to see some kind of toggle for ignoring VS characters in
    file/archive searching, but I can't argue with ignoring them where
    security and spoofing are concerned. Otherwise, the spam problem might
    well become even worse.
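
    To make the toggle idea concrete, here is a rough sketch in Python.
    The names strip_vs, find and ignore_vs are made up for illustration
    and come from no existing tool. It covers only VS1..VS16
    (U+FE00..U+FE0F) and VS17..VS256 (U+E0100..U+E01EF), and leaves the
    Mongolian free variation selectors out.

```python
import re

# Variation selectors VS1..VS16 and VS17..VS256.
# Assumption: Mongolian free variation selectors are not handled here.
_VS = re.compile("[\uFE00-\uFE0F\U000E0100-\U000E01EF]")


def strip_vs(text: str) -> str:
    """Return text with all variation selectors removed."""
    return _VS.sub("", text)


def find(haystack: str, needle: str, ignore_vs: bool = True) -> bool:
    """Substring search with a hypothetical toggle for ignoring VS characters."""
    if ignore_vs:
        haystack, needle = strip_vs(haystack), strip_vs(needle)
    return needle in haystack
```

    With ignore_vs left on, a search for U+2260 followed by "b" would
    match text in which a VS1 sits between the two; with it off, the
    same search would not match.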

    Good collations are tailorable, so if collation ignores VS characters
    by default, that shouldn't cause problems for anyone.
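
    As a toy illustration of such a tailoring (not the actual collation
    algorithm), the Python sketch below builds a sort key that ignores
    VS characters by default and can optionally use the original string
    as a final tie-breaker. Plain code point order stands in for real
    collation weights, and is_vs, sort_key and distinguish_vs are
    hypothetical names.

```python
def is_vs(ch: str) -> bool:
    """True for VS1..VS16 and VS17..VS256 (Mongolian FVS not covered)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def sort_key(s: str, distinguish_vs: bool = False) -> tuple:
    """Sort key that ignores VS characters unless tailored otherwise.

    Assumption: code point order replaces real collation weights here.
    """
    base = "".join(ch for ch in s if not is_vs(ch))
    if distinguish_vs:
        # The original string acts as a lowest-level tie-breaker only.
        return (base, s)
    return (base,)


words = ["b", "a\uFE00", "a"]
default_order = sorted(words, key=sort_key)
tailored_order = sorted(words, key=lambda w: sort_key(w, distinguish_vs=True))
```

    In default_order the two "a" strings compare equal and keep their
    input order; in tailored_order the plain "a" sorts ahead of the one
    carrying a VS.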

    Best regards,

    James Kass

    > At 07:53 PM 3/27/2004, jameskass@att.net wrote:
    >
    >
    > > > >What does the collation standard say to do with unassigned codepoints
    > > > >anyhow?
    > > >
    > > > Variation selectors are not unassigned characters.
    > >
    > >But, they might be regarded as such by any application predating VSs. And,
    > >likewise for any VS sequences approved after the application was created.
    >
    > While applications predating VSs have no choice but to treat them as what
    > they are (in that context) i.e. unassigned characters, applications of later
    > date have no business treating unapproved VS sequences as unassigned
    > *characters*.
    >
    > The intent of VSs is to mark a difference that falls below the distinction
    > between separately encoded characters. Therefore I would expect that by
    > default all VS characters are ignored in a full-blown collation
    > implementation, leaving open the choice of supporting, say, a fourth-level
    > difference between specific known variation sequences.
    >
    > They are also best ignored in any kind of identifier or name matching, as
    > otherwise the presence of invisible characters can change the lookup--with
    > all the consequences for spoofing and security.
    >
    > A./



    This archive was generated by hypermail 2.1.5 : Sun Mar 28 2004 - 21:57:58 EST