From: Philippe Verdy (
Date: Fri Nov 26 2004 - 16:12:24 CST

    From: "Doug Ewell" <>
    > Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
    >> If I want to encode explicit ligatures for the "ffi" cluster, if it is
    >> not hyphenated, I need to add ZWJ:
    >> "ef"+ZWJ+SHY+"f"+ZWJ+"i"+SHY+"ca"+SHY+"ce" (1)
    > Great Scott! You can use ZWJ to suggest a ligation opportunity, and SHY
    > to suggest a hyphenation opportunity, but if you need to suggest both
    > within the same word, let alone *between the same pair of letters*, you
    > have probably stepped over the plain-text line.

    If encoding a ligation opportunity is not plain text, why then have it in
    Unicode?
    If a hyphenation opportunity is not plain text, why then have it in Unicode?

    Both exist in Unicode, and I don't think that they are considered not
    plain-text. So why would you want to restrict their usage so that they will
    be used only separately?

    The ZWJ and SHY format controls for these two purposes are added
    deliberately when preparing documents for later rendering. They shouldn't
    affect the collation of the text and do not change its semantics, and this
    preparation cannot be fully automated without complex lexical and
    linguistic knowledge. That's why they should be allowed in texts kept for
    archiving.

    If you later want to use those prepared texts with simpler renderers and
    parsers, you can still ignore and filter out the ZWJ and SHY very easily, so
    this preparation work, performed most often by typists, is normally
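
To illustrate the filtering step described above, here is a minimal sketch in
Python. The helper name strip_format_hints and the variable names are
hypothetical; only the code points U+200D (ZWJ) and U+00AD (SHY) and the
example string (1) come from the discussion. It assumes Latin text like the
example; in some scripts and in emoji sequences ZWJ affects the rendered
form, so blind removal would not be appropriate there.

```python
# Hypothetical helper names; only the two code points come from Unicode.
ZWJ = "\u200d"  # ZERO WIDTH JOINER: suggests a ligation opportunity
SHY = "\u00ad"  # SOFT HYPHEN: suggests a hyphenation opportunity

# Example (1) from the quoted message: "efficace" prepared with both hints.
prepared = "ef" + ZWJ + SHY + "f" + ZWJ + "i" + SHY + "ca" + SHY + "ce"

# str.translate deletes every code point that is mapped to None.
_STRIP_TABLE = {ord(ZWJ): None, ord(SHY): None}


def strip_format_hints(text: str) -> str:
    """Return text with every ZWJ and SHY removed."""
    return text.translate(_STRIP_TABLE)


plain = strip_format_hints(prepared)
print(plain)  # efficace
```

The prepared string can stay in the archive unchanged; a simpler renderer or
parser would apply such a filter only on its own copy of the text.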

    Nobody is required to use them, but if one wants to do it for better
    rendering of prepared documents, why would Unicode forbid it? Was my
    question really so stupid?

    This archive was generated by hypermail 2.1.5 : Fri Nov 26 2004 - 16:13:08 CST