Re: Character identities

From: William Overington
Date: Tue Oct 29 2002 - 01:32:15 EST

    John Hudson commented.

    >At 02:46 10/26/2002, William Overington wrote:
    >>I don't know whether you might be interested in the use of a small letter a
    >>with an e as an accent codified within the Private Use Area, but in case you
    >>might be interested, the web page is as follows.
    >>I have encoded the a with an e as an accent as U+E7B4 so that both
    >>may coexist in a document encoded in a plain text format and displayed using
    >>an ordinary TrueType font.
    >If anyone were interested, he could do this himself and use any codepoint
    >in the Private Use Area.

    The meaning which I intended to convey was as follows.

    I don't know whether you might be interested in having a look at a
    particular example of the use of a small letter a with an e as an accent
    codified within the Private Use Area by an individual with an interest in
    applying Unicode, but in case you might be interested in having a look at
    that particular example, the web page is as follows.

    If, following from your response to the way that you read my sentence,
    someone were interested in defining a codepoint in the Private Use Area then
    certainly he or she could do that himself or herself and use any codepoint
    in the Private Use Area.

    However, exercising that freedom is something which could benefit from some

    If someone wishes to encode an a with an e as an accent in the Private Use
    Area, he or she may wish to be able to apply that code point allocation in a
    document. If he or she looks at which Private Use Area codepoints are
    already in use within some existing fonts, then selecting a code point which
    is at present unused in those fonts might give a greater chance of his or
    her new character assignment being implemented than choosing a code point
    for which those fonts already have a glyph in use.

    Searching through such fonts takes time and requires some skill.
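
    As a rough illustration of the search described above, here is a minimal
    Java sketch that walks the Basic Multilingual Plane Private Use Area
    (U+E000 to U+F8FF) and reports the first code point that a given test says
    is not in use. The class name PuaFinder, the method firstUnused and the
    inUse test are hypothetical names made up for this example; how one
    decides that a code point is "in use" in a font is left to the caller.

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;

// Hypothetical helper for illustration; not part of any published tool.
final class PuaFinder {
    // Private Use Area of the Basic Multilingual Plane: U+E000..U+F8FF.
    static final int PUA_START = 0xE000;
    static final int PUA_END = 0xF8FF;

    private PuaFinder() {}

    /**
     * Returns the first PUA code point at or after 'from' for which
     * 'inUse' answers false, or an empty result if every remaining
     * PUA code point is reported as in use.
     */
    static OptionalInt firstUnused(int from, IntPredicate inUse) {
        for (int cp = Math.max(from, PUA_START); cp <= PUA_END; cp++) {
            if (!inUse.test(cp)) {
                return OptionalInt.of(cp);
            }
        }
        return OptionalInt.empty();
    }
}
```

    One possible test, assuming a java.awt.Font object named font is already
    loaded, would be cp -> font.canDisplay(cp); combining the tests for
    several fonts with a logical "or" would look for a code point that none
    of them uses.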

    If someone does wish to use a Private Use Area code point for an a with an e
    accent, then using U+E7B4 does give a possible slight advantage in that
    the code point is already part of a published set of code points available
    on the web, for, even though that set of code points is not a standard, it
    is a consistent set and other people might well use those codepoints as
    well. However, anyone may produce and publish such a set of code point
    allocations of his or her own if he or she so wishes, or indeed keep them to
    himself or herself.

    Yet I was not seeking to make any such point in my posting. I simply added
    to a thread on a specialised topic what I thought might be a short
    interesting note with a link to a web page at which some readers might like
    to look. The web page indeed provides two external links to interesting
    documents on the web.

    >Maybe it is time to include a note in the Unicode
    >Standard to suggest that 'Private' Use Area means that one should keep it
    >to oneself ....

    Well, at the moment the Unicode Standard does include the word publish in
    the text about the Private Use Area.

    I have published details of various uses of the Private Use Area on the web
    yet not mentioned them in this forum. For example, readers might perhaps
    like to have a look at the following.

    Anyone who chooses to do so might like to have a look at the following file
    as well, which introduces the application area.

    This is an application of the Unicode Private Use Area so as to produce a
    set of soft buttons for a Java calculator so that the twenty hard button
    minimum configuration of a hand held infra-red control device for a DVB-MHP
    (Digital Video Broadcasting - Multimedia Home Platform) television can be
    used in a consistent manner to signal information from the end user to the
    computer in the television set. I am very pleased with the result. The
    encoding achieves a useful effect while being consistent for information
    handling purposes with the Unicode specification, so that an input stream of
    characters may be processed by a Java program without any ambiguity over
    whether a particular code point is a printing character or a calculator
    button (or indeed mouse event or simulated mouse event as mouse events are
    also encoded using the Private Use Area in my research).
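
    To give a flavour of how such an input stream might be separated, here is
    a minimal Java sketch. The code points U+F000 and U+F001 and the button
    names are hypothetical placeholders chosen for this example and are not
    the allocations used in the research described above; the class name
    ButtonStream is likewise invented. Any code point not listed in the table,
    including a Private Use Area printing character such as U+E7B4, is treated
    as text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: separates button codes from printing characters.
final class ButtonStream {
    // Placeholder assignments for illustration only; the real
    // allocations are not given in this message.
    private static final Map<Integer, String> BUTTONS = new HashMap<>();
    static {
        BUTTONS.put(0xF000, "BUTTON_ADD");
        BUTTONS.put(0xF001, "BUTTON_EQUALS");
    }

    private ButtonStream() {}

    /** Labels each code point of 'input' as a button or as text. */
    static List<String> classify(String input) {
        List<String> out = new ArrayList<>();
        input.codePoints().forEach(cp -> {
            String button = BUTTONS.get(cp);
            String kind = (button != null) ? button : "TEXT";
            out.add(String.format("%s U+%04X", kind, cp));
        });
        return out;
    }
}
```

    Because every button has its own code point, the lookup is a plain table
    match and no escape sequences or mode switches are needed to tell a
    button from a printing character.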

    William Overington

    29 October 2002
