Re: Arabic Implementation

From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Wed Aug 18 2004 - 04:01:53 CDT

    > After a character changes its display form into one listed in
    > Arabic Presentation Forms-B, does it still belong to a joining type?

    Nope. All the Arabic presentation forms implicitly have the joining
    type U (non-joining) [and the joining metagroup <no shaping>].

    > For example: let's say Unicode character 0x0622, which is of the
    > right-joining type. When this changes its display form into the
    > ISOLATED FORM, its Unicode code point becomes 0xfe81.

    You can base a *partial* implementation of DISPLAY of the Arabic
    script on that approach. But note that many of the more "exotic"
    Arabic letters do not have any corresponding presentation form
    characters.
    What one is supposed to do is to look up the presentation form
    *glyphs* in the ("smart") font. That does not rely on any of the
    presentation form characters. Nor should text be stored using the
    presentation form characters.
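
    As a rough illustration of why a lookup through the presentation
    form characters is only partial, the sketch below uses Python's
    standard unicodedata module to collect the presentation form
    characters whose compatibility decomposition is a single nominal
    letter. The helper name presentation_forms and the scanned ranges
    (Arabic Presentation Forms-A and Forms-B) are choices made for this
    example, and U+0750 is picked here only as one letter with no
    presentation form characters.

```python
import unicodedata

# Decomposition tags used by the Arabic presentation form characters.
FORM_TAGS = ("<isolated>", "<final>", "<initial>", "<medial>")

def presentation_forms(letter):
    """Return {form name: presentation character} for one nominal letter.

    Hypothetical helper for this sketch: it scans Arabic Presentation
    Forms-A (U+FB50..U+FDFF) and Forms-B (U+FE70..U+FEFF) and keeps the
    characters that decompose to exactly the given letter.
    """
    forms = {}
    for cp in list(range(0xFB50, 0xFE00)) + list(range(0xFE70, 0xFF00)):
        parts = unicodedata.decomposition(chr(cp)).split()
        if (len(parts) == 2 and parts[0] in FORM_TAGS
                and int(parts[1], 16) == ord(letter)):
            forms[parts[0].strip("<>")] = chr(cp)
    return forms

# U+0622 is right-joining, so only isolated and final forms exist.
alef_madda = presentation_forms("\u0622")

# U+0750 has no presentation form characters at all; only a smart
# font can supply its contextual glyphs.
exotic = presentation_forms("\u0750")

# For storage, compatibility normalization folds a presentation form
# character back to its nominal letter.
stored = unicodedata.normalize("NFKC", "\ufe81")
```

    Here alef_madda maps "isolated" to U+FE81 and "final" to U+FE82,
    exotic is empty, and stored is the nominal U+0622 again.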

    > I personally feel that a particular character belonging to a
    > particular joining type will have all its different display forms
    > also belonging to that joining type.

    No. It is only the "nominal" Arabic letters that are "shaped".
    The preshaped ones already have their shape, and do not affect
    the shape of neighbouring characters. Note that a "shaper"
    based on the presentation form characters should also interpret
    ZWJ and ZWNJ, but may remove them after interpretation.
    (You should not store the resulting text beyond what is needed
    for display/print.)
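
    A minimal sketch of such a presentation-form "shaper" follows, under
    loud assumptions: the JOINING_TYPE table is a small hand-written
    subset covering only a few letters (a real shaper would take the
    values from the Unicode Character Database), transparent characters
    such as combining marks are not handled, and the names shape, FORMS
    and JOINING_TYPE are made up for this example. Anything not listed,
    including the preshaped presentation forms, is treated as
    non-joining (U).

```python
import unicodedata

ZWNJ, ZWJ = "\u200c", "\u200d"

# Assumption: illustrative subset only. R = right-joining,
# D = dual-joining, C = join-causing, U = non-joining.
JOINING_TYPE = {
    "\u0622": "R",  # ALEF WITH MADDA ABOVE
    "\u0627": "R",  # ALEF
    "\u0628": "D",  # BEH
    "\u062f": "R",  # DAL
    "\u0644": "D",  # LAM
    "\u0645": "D",  # MEEM
    ZWJ: "C",
    ZWNJ: "U",
}

# (nominal letter, form name) -> presentation form character, read
# from the compatibility decompositions in unicodedata.
FORMS = {}
for cp in list(range(0xFB50, 0xFE00)) + list(range(0xFE70, 0xFF00)):
    parts = unicodedata.decomposition(chr(cp)).split()
    if len(parts) == 2 and parts[0] in (
            "<isolated>", "<final>", "<initial>", "<medial>"):
        FORMS[(chr(int(parts[1], 16)), parts[0].strip("<>"))] = chr(cp)

def joining_type(ch):
    # Unlisted characters, including preshaped forms, do not join.
    return JOINING_TYPE.get(ch, "U")

def shape(text):
    """Map nominal letters to presentation forms, for display only."""
    out = []
    for i, ch in enumerate(text):
        if ch in (ZWJ, ZWNJ):
            continue  # already interpreted via its neighbours; drop it
        me = joining_type(ch)
        if me == "U":
            out.append(ch)  # not shaped: pass through unchanged
            continue
        prev = joining_type(text[i - 1]) if i > 0 else "U"
        nxt = joining_type(text[i + 1]) if i + 1 < len(text) else "U"
        joins_prev = prev in ("D", "C")
        joins_next = me == "D" and nxt in ("D", "R", "C")
        if joins_prev and joins_next:
            form = "medial"
        elif joins_prev:
            form = "final"
        elif joins_next:
            form = "initial"
        else:
            form = "isolated"
        # Fall back to the nominal letter if no such character exists.
        out.append(FORMS.get((ch, form), ch))
    return "".join(out)
```

    With this sketch, shape("\u0628\u0628") gives the initial and final
    forms of BEH, while shape("\u0628\u200c\u0628") gives two isolated
    forms with the ZWNJ removed. A preshaped U+FE8F in the input is
    passed through as-is and leaves a following BEH in isolated form.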

                    /kent k


