Re: Regulating PUA.

From: Richard Wordingham
Date: Fri Jan 26 2007 - 15:51:20 CST

    Philippe Verdy wrote on Friday, January 26, 2007 2:10 AM

    > Hmmm... Is all textual data on the web (or even just in an HTML page)
    > really associated with <font> elements or a font style?
    > We have been told by the W3C that style was designed to be clearly
    > separated from the textual content.

    It would indeed be pleasant if there were some way of defining the meaning
    and properties of PUA characters. Unfortunately, there doesn't seem to be.
    In this case, font seems to be the most practical way of identifying the
    meaning of a PUA character. After all, a recipient may in principle freely
    switch between conventions, depending on whom he is communicating with.

    Don't forget that we live in a world where many HTML documents don't even
    specify their distinctly non-default encoding.

    > But then, it should also work in other invisible parts. For example, in
    > Javascript: how can a string internally keep the reference to the
    > convention associated with the PUA characters it contains? Is this
    > association mutable for each PUA character (it should not be, because it
    > is part of the character identity!)? How can we compose strings
    > containing PUAs from different sources?

    The same way we string words from different languages together? In the PUA,
    character identity depends on the 'agreed' convention, and it is not
    impossible, though undesirable, that one may switch conventions within a
    single document.

    Naturally, composing strings using different PUA conventions will generally
    yield nonsense - you do need an analogue of ISO 2022, for which you have
    suggested tag characters. That's not unreasonable. For example, you could
    have one tag value identifying the Conscript Registry and another
    identifying SIL's conventions, whose documentation I've stumbled over. In
    the meantime, I suppose *you* could always use a character from the PUA as
    the PUA convention tag character, with a suitable escaping mechanism, for
    strings *you* compose using multiple PUA conventions.
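
    To make that last suggestion concrete, here is a minimal sketch in Python
    of one way such a private scheme might look. Everything in it is an
    assumption made for illustration: the choice of U+F8FF as the tag
    character, the ";" terminator, the convention labels "conscript" and
    "sil", and the function names compose and parse. None of it is an
    established mechanism. A doubled tag character stands for a literal one,
    which is the escaping step.

```python
# Hypothetical private scheme: none of these choices come from any standard.
TAG = "\uF8FF"  # assumed PUA code point reserved as the convention tag
END = ";"       # assumed terminator for the convention label


def compose(segments):
    """Join (convention, text) pairs into a single tagged string."""
    out = []
    for convention, text in segments:
        if TAG in convention or END in convention:
            raise ValueError("convention label may not contain TAG or END")
        # Escape literal TAG characters in the text by doubling them.
        out.append(TAG + convention + END + text.replace(TAG, TAG + TAG))
    return "".join(out)


def parse(tagged):
    """Split a tagged string back into (convention, text) pairs."""
    segments = []
    convention, buf = None, []
    i = 0
    while i < len(tagged):
        ch = tagged[i]
        if ch != TAG:
            buf.append(ch)
            i += 1
        elif tagged.startswith(TAG, i + 1):
            buf.append(TAG)  # doubled TAG is an escaped literal
            i += 2
        else:
            # A single TAG opens a new convention: flush the current run.
            if buf or convention is not None:
                segments.append((convention, "".join(buf)))
            end = tagged.index(END, i + 1)  # ValueError if unterminated
            convention, buf = tagged[i + 1:end], []
            i = end + 1
    if buf or convention is not None:
        segments.append((convention, "".join(buf)))
    return segments


# Arbitrary PUA code points, used only as sample data.
sample = [("conscript", "\uE000\uE001"), ("sil", "\uF130" + TAG + "x")]
assert parse(compose(sample)) == sample
```

    Text that appears before any tag comes back with a convention of None,
    which matches the untagged case where the recipient has to guess the
    convention by other means, such as the font.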

