Re: List of ligatures for languages of the Indian subcontinent.

From: William Overington (
Date: Tue Mar 18 2003 - 06:04:02 EST


    Thank you for your comments.

    I am not going to attempt to produce the list of ligatures myself.

    I am writing the paper to draw attention to the problem which exists in
    applying the DVB-MHP (Digital Video Broadcasting - Multimedia Home
    Platform) system of interactive broadcasting to the languages of the
    Indian subcontinent and, I hope, to provide a software format for
    resolving it.

    It appears that the software requirement is essentially as follows, if one
    wishes to use a font-based method of display with an ordinary font.

    Receive a stream of input characters encoded in regular Unicode UTF-16
    format suitable for processing as Java char items.
    Output a local stream of Java char suitable to be used in a Java drawString
    method with an ordinary font.
    As far as I can tell at present, the eutocode typography file format could
    be used to produce char codes for conjunct forms and for dealing with matras
    by scanning whole words, in that the changes needed seem always to be within
    a word and that there is no carry over to a following word.
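
    As a rough illustration of the word-by-word scanning described above, here
    is a minimal Java sketch.  It is not the eutocode typography file format
    itself: the class name ConjunctMapper, its methods, the single rule in the
    usage note and the Private Use Area code U+E900 are all assumptions made
    only for this example, and the greedy longest-match strategy is one
    possible choice rather than a defined part of the format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: maps character sequences to single char codes,
// one word at a time, so that no substitution crosses a word boundary.
class ConjunctMapper {
    private final Map<String, String> table = new LinkedHashMap<>();
    private int longestKey = 0;

    // Registers one sequence and the char code(s) that should replace it.
    void addRule(String sequence, String replacement) {
        table.put(sequence, replacement);
        longestKey = Math.max(longestKey, sequence.length());
    }

    // Rewrites a single word, trying the longest candidate sequence first.
    private String mapWord(String word) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < word.length()) {
            boolean matched = false;
            int max = Math.min(longestKey, word.length() - i);
            for (int len = max; len >= 1 && !matched; len--) {
                String replacement = table.get(word.substring(i, i + len));
                if (replacement != null) {
                    out.append(replacement);
                    i += len;
                    matched = true;
                }
            }
            if (!matched) {
                // No rule starts here, so the char is passed through unchanged.
                out.append(word.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    // Splits the input at spaces, maps each word, and copies the spaces through.
    String map(String input) {
        StringBuilder out = new StringBuilder();
        int start = 0;
        for (int i = 0; i <= input.length(); i++) {
            if (i == input.length() || input.charAt(i) == ' ') {
                out.append(mapWord(input.substring(start, i)));
                if (i < input.length()) {
                    out.append(' ');
                }
                start = i + 1;
            }
        }
        return out.toString();
    }
}
```

    For example, after addRule("\u0915\u094D\u0937", "\uE900") the result of
    map could be passed directly to drawString with a font that has a glyph at
    the assumed code U+E900.  Only the space character is treated as a word
    separator in this sketch; a fuller version would need a proper definition
    of word boundaries.
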
    The discussion has led me to believe that it would be helpful for me to add
    a further possibility to a eutocode typography file, using two presently
    unallocated codes.  I have not yet decided which particular two, nor
    finalized their definitions, as I am open to any suggestions for
    improvement, yet here is the idea.  For the moment I refer to them as U+EBEX
    and U+EBEY.
    A eutocode typography file could contain a line as follows.
    sequence1 U+EBEX sequence2 U+EBEY sequence3
    The spaces in the above line are only for setting out the line clearly here;
    in use the spaces would not be there.
    Such a line would have the following meaning.
    Carry out the replacement
    sequence2 U+EBEF sequence3
    if and only if sequence1 matches the sequence stored in the language choice
    The sequence1 sequence is expressed using zero or more characters from the
    range U+0020 to U+007E and is the decoded result of the latest use of a
    sequence of plane 14 language tags.  The idea is that the plane 14 tags
    would be used to signal particular languages, represented as in
    international standards, though the eutocode typography file will only
    define "a sequence" as such, not compliance with any list of languages.
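
    To make the idea a little more concrete, here is a hedged Java sketch of a
    language-conditional replacement.  The class LanguageRule and its method
    names are hypothetical.  Because U+EBEX and U+EBEY are not yet allocated,
    the sketch does not parse a file line; it takes the three sequences already
    separated.  The decoding relies only on the Unicode definitions of U+E0001
    LANGUAGE TAG, the tag characters U+E0020 to U+E007E and U+E007F CANCEL TAG.
    The Devanagari sequence and the code U+E901 in the usage note are purely
    illustrative.

```java
// Sketch only: a rule is applied if and only if its language sequence
// matches the most recently decoded plane 14 language tag sequence.
class LanguageRule {
    final String language;     // sequence1: zero or more chars U+0020..U+007E
    final String match;        // sequence2: the characters to look for
    final String replacement;  // sequence3: what to output instead

    LanguageRule(String language, String match, String replacement) {
        this.language = language;
        this.match = match;
        this.replacement = replacement;
    }

    // Decodes the latest plane 14 language tag sequence in the input.
    // Tag characters mirror ASCII, offset by 0xE0000.
    static String latestLanguage(String input) {
        String latest = "";
        StringBuilder current = null;   // non-null while inside a tag sequence
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == 0xE0001) {
                // LANGUAGE TAG starts a new tag sequence.
                current = new StringBuilder();
            } else if (cp == 0xE007F) {
                // CANCEL TAG clears the language choice.
                latest = "";
                current = null;
            } else if (current != null) {
                if (cp >= 0xE0020 && cp <= 0xE007E) {
                    current.append((char) (cp - 0xE0000));
                } else {
                    // First ordinary character ends the tag sequence.
                    latest = current.toString();
                    current = null;
                }
            }
        }
        if (current != null) {
            latest = current.toString();
        }
        return latest;
    }

    // Carries out the replacement only when the language sequences match.
    String apply(String word, String currentLanguage) {
        return language.equals(currentLanguage)
                ? word.replace(match, replacement)
                : word;
    }
}
```

    With a rule such as new LanguageRule("hi", "\u0936\u094D\u0930", "\uE901"),
    apply changes a word only while the decoded language sequence is "hi" and
    leaves it untouched otherwise.  A rule with an empty language sequence
    would match only when no language tag is in force, which is one possible
    reading of "zero or more characters".
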
    Would this be sufficient to provide a way to guide a Java program to produce
    an output stream of Java char to use to access an ordinary font in order to
    render languages of the Indian subcontinent, provided that a eutocode
    typography file and a font were supplied?
    I recognize that preparing the eutocode typography file and the
    ordinary font containing the glyphs is a large task, and I am not going to
    try to do it myself.  However, I hope to do three things: publish a software
    format which has the capability to solve the problem; draw attention to the
    need to prepare the list, and to prepare fonts which implement the list in
    part or in full, together with eutocode typography files so that the fonts
    can be used in applications; and encourage a wish for the list to be a
    published open resource, with a view to helping interoperability.  I feel
    that that is about as far as I can go in this topic at the moment.  However,
    I do feel that acting now may well be beneficial, as a well-known
    infrastructural method will then be available for consideration when people
    want to produce such displays on interactive television.
    This is but one of a number of ideas for techniques to use in content
    authorship for the DVB-MHP platform.
    In relation to the font of colour codes downloadable from the following
    I have now produced a test version which includes those colour codes and
    also four for point size and 28 others for various aspects of access level
    multimedia authoring.  This includes codes for variations of object
    replacement character defined within the Private Use Area.  One is OBJECT
    REPLACEMENT CHARACTER SYNONYM because trying to place a U+FFFC into some
    word processors can cause problems if the word processor also accepts
    graphics and uses U+FFFC for that.  The others are OBJECT REPLACEMENT
    CHARACTER with
    left, centre and right alignment.  The rest are mostly to do with producing
    a basic programmed learning capability within a plain text file, including
    such items as GREEN MARKER and so on so that when a push button is pushed
    all input characters are skipped until a marker of the corresponding colour
    is reached.  There are also a SKIP UNTIL CONTINUE and a CONTINUE MARKER so
    that programmed learning layouts following simple flow charts may be
    expressed in a sequential manner within a file.
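
    The skipping behaviour can be sketched in a few lines of Java.  This is
    only an illustration: the class MarkerSkipper, the method skipTo and the
    two Private Use Area values shown are assumptions for the example and are
    not the codes used in the test font.

```java
// Sketch of "skip all input characters until a marker of the given colour".
class MarkerSkipper {
    // Hypothetical Private Use Area assignments, chosen only for this sketch.
    static final char GREEN_MARKER = '\uE700';
    static final char RED_MARKER = '\uE701';

    // Returns the index just past the first occurrence of the marker at or
    // after 'from', or text.length() if the marker never appears.
    static int skipTo(String text, int from, char marker) {
        int at = text.indexOf(marker, from);
        return at < 0 ? text.length() : at + 1;
    }
}
```

    When the green push button is pushed, a reader program would call
    skipTo(text, position, MarkerSkipper.GREEN_MARKER) and resume display from
    the returned index.
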
    Thank you for your interest in reading through all of this posting.  I have
    recently produced an ornaments font, which I am hoping to write up
    for the web, and wonder if you might like a copy.  It is not a Unicode font,
    so that it can be easily used with the Paint program, though I am
    considering a Unicode version yet wondering quite how best to encode it.
    William Overington
    18 March 2003
