Re: Private-use agreements (was: Re: Emoji: emoticons vs. literacy)

From: Doug Ewell (doug@ewellic.org)
Date: Sun Jan 04 2009 - 17:40:47 CST

    Hans Aberg <haberg at math dot su dot se> wrote:

    > The idea would be that the hash code is for a file containing all
    > information needed for its use, including typesetting - perhaps
    > including some default glyph. Then the hash code should be rich enough
    > making it unlikely that independently made private files have the
    > same - they need then not check if it already exists. One then need
    > some URL to search for it, but the URL need not be fixed - any one
    > with search capabilities will suffice.

    I understand the purpose of hash functions. They are valuable for
    verifying data integrity, to make sure that a local file is genuine and
    has not been corrupted or maliciously altered.

    For a Unicode private-use agreement, however, a much more likely use
    case is that the user (1) needs to know which agreement is in place and
    (2) needs access to it. In this case, the hash value is useless without
    the file itself. Verification against malicious tampering is unlikely
    to be an issue, and there may be a legitimate reason for the private
    agreement to be updated or otherwise changed, so that a mismatched hash
    value does not indicate a genuine problem.

    For the Ewellic alphabet, the "private agreement" is the relevant page
    on the ConScript Unicode Registry Web site. This page contains
    additional things such as links, copyright notices, and CSS styling that
    can be, and have been, changed without affecting the substance of the
    private agreement. In this case, a hash value would be neither
    necessary nor sufficient to identify the agreement.
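
    To illustrate the point, here is a minimal Python sketch. The page
    contents, the EXAMPLE LETTER names, and the substance() helper are
    all hypothetical stand-ins, not the actual ConScript page. It shows
    that a hash of the whole page changes when only styling and
    copyright text change, while a hash over just the code point
    assignments (one assumed way to define "substance") stays the same.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text (UTF-8 encoded) as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical stand-ins for two revisions of a registry page. Only the
# styling and the copyright line differ; the code point assignments do not.
ASSIGNMENTS = "U+E000 EXAMPLE LETTER A\nU+E001 EXAMPLE LETTER B\n"
page_old = "<style>body { color: black }</style>\n" + ASSIGNMENTS + "Copyright 2008\n"
page_new = "<style>body { color: navy }</style>\n" + ASSIGNMENTS + "Copyright 2009\n"


def substance(page: str) -> str:
    """Keep only lines that assign code points (an assumed convention)."""
    return "\n".join(line for line in page.splitlines() if line.startswith("U+"))


# Hashing the whole page treats cosmetic edits as a different agreement.
whole_page_match = sha256_hex(page_old) == sha256_hex(page_new)

# Hashing only the assignments ignores the cosmetic edits.
substance_match = sha256_hex(substance(page_old)) == sha256_hex(substance(page_new))

print("whole-page hashes match:", whole_page_match)
print("substance hashes match: ", substance_match)
```

    Even the second comparison only works if everyone agrees on what
    counts as substance, which is itself a kind of private agreement.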

    > I am aware that it is sort of against the current Unicode
    > principles. But the PUA characters, except for temporary private use,
    > will otherwise be quite unusable. Especially if one gets a file a few
    > years old, and it uses some private characters, it may be quite
    > impossible to read it. By contrast, archiving is for these practical
    > purposes unlimited. So if there is a convenient search method, it
    > will be easy to read such a file.

    Well, that is one of the risks associated with private use. There is no
    standard format for private agreements, and documents on the Web are not
    exactly guaranteed to last forever. MUFI has a pretty well defined
    private agreement, and ConScript has another, and some of the mapping
    tables from Apple on the Unicode site constitute another. But there is
    no standard index to these and no standard way to cite which one is in
    use for a given document, and it seems unlikely that Unicode will get
    involved in this.

    The URL of a private agreement could be embedded directly within the
    PUA-using document, or it could be stored in a short accompanying file
    or made available from the Web site from which the document can be
    downloaded. URL shorteners such as tinyurl.com and is.gd could be used
    to reduce the overhead of storing these links, although not all programs
    and users accept such URLs, because a shortened link hides its real
    destination.
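
    As a rough sketch of the "short accompanying file" option, the
    Python below writes a small JSON file next to a document and reads
    the agreement URL back. The ".pua.json" suffix, the key names, and
    the example.org URL are all invented for illustration; no such
    convention is standardized anywhere.

```python
import json
import tempfile
from pathlib import Path

SIDECAR_SUFFIX = ".pua.json"  # hypothetical naming convention


def sidecar_path(document: Path) -> Path:
    """Return the path of the accompanying file for a document."""
    return document.with_name(document.name + SIDECAR_SUFFIX)


def write_sidecar(document: Path, agreement_url: str) -> Path:
    """Record which private-use agreement a document relies on."""
    record = {"document": document.name, "private_use_agreement": agreement_url}
    path = sidecar_path(document)
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def read_agreement_url(document: Path):
    """Return the recorded agreement URL, or None if no sidecar exists."""
    path = sidecar_path(document)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))["private_use_agreement"]


def demo() -> str:
    """Round-trip a placeholder URL through a sidecar in a temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        doc = Path(tmp) / "sample.txt"
        doc.write_text("text with a private-use character: \ue000", encoding="utf-8")
        write_sidecar(doc, "https://example.org/agreements/sample.html")
        return read_agreement_url(doc)


print(demo())
```

    A file like this only helps for as long as it travels with the
    document and the URL it names stays reachable.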

    --
    Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
    http://www.ewellic.org
    http://www1.ietf.org/html.charters/ltru-charter.html
    http://www.alvestrand.no/mailman/listinfo/ietf-languages
    

