Re: Characters for Cakchiquel

From: William Overington (
Date: Sat Mar 29 2003 - 05:22:33 EST


    David Starner wrote as follows.


    What good would a private use character do here? The private use area is
    good for Tengwar, Cirth and Shavian (all of which have multiple fonts
    using the same private use area encoding.) But there's no huge demand to
    interchange data with these characters, and the few users are probably
    going to use something less complex than the private use area. Assuming
    I scan this book in for Project Gutenberg, I'll probably use something
    like [3], [4], [4,] and [4,h] for the characters, at least in the ASCII
    version (and there'd be no reason to post a Unicode version if these
    characters aren't in Unicode.) It's simple, readable and precise,
    something your solution only has one of.

    end quote

    The size of the "demand" does not affect my suggestion. It is a matter of
    research in encoding scripts.

    Using the Private Use Area is not at all complex. If one has a font
    containing some Private Use Area characters the font can be used quite
    straightforwardly from within a program such as Microsoft Word using the
    Insert | Symbol facility which that program provides. Characters can also
    be accessed fairly straightforwardly from a program such as some versions of
    Microsoft WordPad using an Alt sequence: hold down the Alt key, enter the
    decimal code point value on the numeric keypad at the right of a PC
    keyboard, then release the Alt key.
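
    Since the Alt sequence takes a decimal value while code points are usually
    quoted in hexadecimal, a small conversion helps. The following is only a
    minimal sketch in Python; the function name alt_code_digits is hypothetical,
    and the range check covers just the Private Use Area of the Basic
    Multilingual Plane (U+E000 to U+F8FF). Whether a given program accepts the
    resulting digits is an assumption that depends on the program.

```python
def alt_code_digits(code_point: int) -> str:
    """Return the decimal digits to type for a BMP Private Use Area code point.

    alt_code_digits is a hypothetical helper name used only for illustration.
    """
    # The Private Use Area of the Basic Multilingual Plane is U+E000..U+F8FF.
    if not 0xE000 <= code_point <= 0xF8FF:
        raise ValueError("code point is not in the BMP Private Use Area")
    return str(code_point)


if __name__ == "__main__":
    # U+E000, the first Private Use Area code point, in decimal.
    print(alt_code_digits(0xE000))  # 57344
```

    So, for a font that places a glyph at U+E000, the digits to enter while
    holding down the Alt key would be 57344.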

    Project Gutenberg is a very valuable project. I have recently started
    reading the notebooks of Leonardo da Vinci, which are available at Project
    Gutenberg. For a file using ASCII text, using [3], [4], [4,] and [4,h] for
    the characters is probably the only type of method available for the work
    and probably quite suitable as it gets the job done within the limits of the
    available technology.

    However, consider that that file could be processed using a short Pascal
    program using a eutocode typography file. The Pascal program would read in
    ASCII text and output a Unicode text file, converting certain character
    sequences or individual characters.

    The particular eutocode typography file for the conversion of the format
    which you suggest would only need five lines of a few characters each,
    provided that the figures 3 and 4 were only used within the file for those
    symbols and not as digits, unless you intend using the [ and ] characters as
    well as the digits. However, a Private Use Area encoding of the special
    symbols would be needed and a font to display them.
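
    The kind of conversion described above can be sketched as follows. This is
    an illustration in Python rather than Pascal, and it does not use the
    eutocode typography file format: the table is a plain dictionary, and the
    code points U+E000 to U+E003 are hypothetical placeholders, since no
    Private Use Area encoding for these symbols has been agreed. The sketch
    assumes the bracketed forms [3], [4], [4,] and [4,h] are used, so that
    ordinary digits in the text are left alone.

```python
# Hypothetical Private Use Area assignments, chosen only for illustration.
# No agreed encoding for these symbols exists in the source discussion.
MAPPING = {
    "[3]": "\uE000",
    "[4]": "\uE001",
    "[4,]": "\uE002",
    "[4,h]": "\uE003",
}


def convert(text: str) -> str:
    """Replace each bracketed ASCII notation with its Private Use Area character."""
    # Try longer notations first. The four bracketed forms do not overlap,
    # but this keeps the sketch safe if the table grows.
    keys = sorted(MAPPING, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(MAPPING[key])
                i += len(key)
                break
        else:
            # No notation starts here, so copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Show the code points produced for one notation.
    print([hex(ord(c)) for c in convert("[4,h]")])  # ['0xe003']
```

    Writing the result to a file in a Unicode encoding such as UTF-8 would then
    give a text that displays the special symbols, given a font that places
    glyphs at the same code points.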

    If a Private Use Area encoding is produced for the special symbols used by
    missionaries in the fifteenth and sixteenth centuries, then a few fonts,
    from various fontmakers, might include the special symbols. Thus your
    suggested ASCII file for Project Gutenberg could be used to produce print
    outs using the correct special symbols if desired.

    The eutocode typography file format is described in the following document.

    You suggest that using a Private Use Area encoding has only one of the three
    attributes of simplicity, readability and precision.

    I feel that a Private Use Area encoding could be reasonably simple. Care
    could be taken to make it as logically structured as possible within the
    limits of an ongoing, bit-by-bit activity of adding code point
    allocations as symbols are found in the literature. The
    experience gained could be useful when promotion to regular Unicode is
    considered formally, when the order of encoding used in the Private Use Area
    character set could be changed around as desired so as to produce a formal

    Certainly, without a suitable font readability would be a great problem.
    Yet once a Private Use Area encoding is published, font support may follow.

    A Private Use Area encoding can be precise provided that the originator of
    a document using the encoding and the user of that document both know what
    the encoding is and both have suitable facilities for applying the file
    which contains the document. In such circumstances the Private Use
    Area can be of great precision and very useful. For example, readers might
    like to have a look at the font COURTCOL.TTF which is described in, and
    downloadable from, the following web page.

    Readers might like to have a look at the way in which I have expressed the
    colours in monochrome and then perhaps search, using a search engine, for
    the two words Petra Sancta together.

    I have tried some offline experiments with a Java applet and the results are
    good. I have also produced a font with 51 glyphs which includes those 19
    glyphs and others for four sizes of type, various object replacement
    characters, various wait for push button push codes and various markers for
    producing a programmed learning package encoded within a text file using
    WordPad. Precision is essential for such an activity and the Private Use
    Area is used to provide that precision.

    Actually, I was rather hoping that the start of a Private Use Area encoding
    might be produced by a few interested people fairly quickly, perhaps in this
    thread or in some email correspondence. Once that is done, then font
    support could gradually be produced.

    William Overington

    29 March 2003
