RE: Encoding the fish symbol

From: Philippe Verdy (
Date: Sun Sep 30 2007 - 16:51:32 CST


    John Hudson wrote:
    > So what I'll be looking for, to make a case for encoding the Christian
    > fish symbol, is its use in contexts where people may wish to perform
    > text-specific operations relative to the symbol, e.g. searching a corpus
    > of documents for occurrences.


    > * But then I question whether quite a few of the Miscellaneous Symbol
    > characters really need to exist.

    They exist only because they are needed for interoperability with countless
    documents that reference them solely through legacy encodings. Those
    documents reasonably need a standard mapping to the various separate fonts
    required to render them correctly, because they do not embed any glyph
    outlines for these symbols.

    If some industry standard now allows documents to encode additional symbols
    directly, without their associated glyphs (outlines or bitmaps, it does not
    matter), and such data starts being disseminated, then applications that
    must support multiple standards with different encodings, or that support
    only Unicode, will need to interoperate. That need militates in favour of
    encoding these symbols with a unique interoperable mapping for these
    applications.

    For example, the proposed additional symbols in the JIS standard, for use in
    broadcasting traffic information to be displayed on various kinds of
    devices (electronic street indicators and panels, mobile devices in cars
    such as satellite GPS systems, RDS info on FM radios...), are good
    candidates for encoding, as such broadcast data will need to be prepared and
    processed on standard computers using standard software supporting, for
    example, only

    Having to rely only on PUAs (or on ambiguous mappings to several possible
    existing characters, mappings not really intended in the Unicode standard,
    where these symbols were not initially unified) would not make this industry
    standard very interoperable, notably if the industry standard defining these
    symbols is not directly supported by the other existing applications used to
    prepare this data.
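
    As a rough illustration of why PUA code points interoperate poorly, the
    sketch below compares an encoded symbol with a private-use code point using
    Python's standard unicodedata module. The helper name describe and the
    choice of U+2620 and U+E000 as samples are assumptions made only for this
    example.

```python
import unicodedata

def describe(ch):
    """Return (code point, general category, name or None) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch, None))

encoded = "\u2620"  # SKULL AND CROSSBONES, an encoded Miscellaneous Symbols character
private = "\uE000"  # first code point of the BMP Private Use Area

# The encoded symbol has a standard name and the category "So" (Symbol, other).
print(describe(encoded))
# The private-use code point has category "Co" and no name: its meaning exists
# only by private agreement between sender and receiver.
print(describe(private))
```

    Nothing in the data itself tells a receiving application which symbol a
    private-use code point was meant to be.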

    For this reason, if a fish symbol is needed in modern applications (for
    example in the newspaper industry to interchange data about classified ads
    referencing these symbols, using existing data formats not designed to
    transport glyph outlines as they only support plain-text and no upper-layer
    protocol, for example in the table fields of a relational database), then
    the need for interoperability (not relying on PUAs) will create the need for
    the encoding.
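
    A minimal sketch of the kind of plain-text operation meant here: searching
    text fields for a symbol by its code point. Since the fish symbol is not
    encoded, the already-encoded U+2627 CHI RHO is used as a stand-in; the
    sample records and the function name find_symbol are hypothetical.

```python
CHI_RHO = "\u2627"  # stand-in only: an already-encoded symbol, not the fish symbol

# Hypothetical plain-text fields, e.g. rows read from a database column.
ads = [
    "Plumber, 20 years experience " + CHI_RHO,
    "Used car for sale, low mileage",
    CHI_RHO + " Family bookshop, open Saturdays",
]

def find_symbol(records, symbol):
    """Return the indexes of the records whose text contains the symbol."""
    return [i for i, text in enumerate(records) if symbol in text]

print(find_symbol(ads, CHI_RHO))
```

    The search works only because every producer and consumer of the data
    agrees on one code point for the symbol.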

    One could say that newer applications should always be designed so that
    their supported data format allows an upper-layer protocol to be
    implemented, but this is rarely the case, except for the file formats used
    for paged documents (created in word processors or web page editors). But
    textual content is not restricted to these kinds of documents intended for
    immediate rendering: there is much more textual data within lots of
    protocols where only plain text is reliably supported.

    Suppose you want the symbol to be transported to a mobile phone using the
    existing and convenient short messaging infrastructure: it will probably
    support Unicode, but no embedded transport of glyphs, so the mobile phone
    would not be able to display the symbol through that infrastructure without
    completely changing its technology and rebuilding the transport
    infrastructure and its conversion gateways. In addition, you would have to
    convince all the existing customers to replace their terminals with new ones
    compatible with the newer standard: they won't be easy to convince if that's
    just for supporting a few additional symbols.
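
    To make the short-message case concrete, here is a small sketch assuming
    the usual single-message payload of 140 octets and UCS-2 text (two octets
    per BMP character, so 70 characters per message). The function name and the
    sample message are hypothetical, and U+2627 CHI RHO again stands in for an
    encoded symbol.

```python
SMS_PAYLOAD_OCTETS = 140  # assumed user-data size of a single short message

def fits_single_sms(text):
    """True if text fits in one message when sent as UCS-2."""
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("UCS-2 cannot represent characters outside the BMP")
    # For BMP-only text, UTF-16BE produces the same octets as UCS-2.
    return len(text.encode("utf-16-be")) <= SMS_PAYLOAD_OCTETS

message = "Meet at the harbour \u2627"
print(len(message.encode("utf-16-be")), fits_single_sms(message))
```

    An encoded symbol costs two octets like any other BMP character, with no
    change to the transport; a glyph outline has no place in this payload at
    all.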

    This archive was generated by hypermail 2.1.5 : Sun Sep 30 2007 - 16:55:04 CST