From: Peter Kirk (email@example.com)
Date: Fri Jan 16 2004 - 17:53:58 EST
On 16/01/2004 11:17, Rick McGowan wrote:
>The Unicode Technical Committee has posted a new issue for public review
>and comment. Details are on the following web page:
>The review period for the new item closes on January 27, 2004.
>Please see the page for links to discussion and relevant documents.
>Briefly, the new issue is:
>Issue #27 Joiner/Nonjoiner in Combining Character Sequences
>Unicode 4.0 describes the structure of Khmer syllables, saying that they
>may contain an interior ZWJ. There is a problem with this that needs to be
>resolved in 4.0.1, because some of the characters later in the syllable can
>be combining characters. This paper describes a proposal to fix this
>problem. As a part of the proposal, a choice has to be made among two
Although this issue has been brought up for review in the light of the
problem with Khmer, it also has a significant impact on Hebrew, and for
that reason I am bringing it to the attention of the Hebrew list as well.
I support the main proposal, which is to allow the ZWJ and ZWNJ
characters to occur within combining character sequences. When they
occur between two combining marks, they will indicate joining and
non-joining forms respectively of those two combining marks. In Hebrew,
this will provide a convenient mechanism for requesting or inhibiting
ligatures between meteg and hataf vowels (see
3.5). Previously there was no such mechanism which was strictly
compatible with Unicode definitions. With this change, the following
distinctions can be made:
<vowel, ZWJ, meteg> - medial meteg preferred, but only possible if the
vowel is a hataf vowel (ZWJ must be ignored for other vowels)
<vowel, ZWNJ, meteg> - left meteg preferred
<vowel, meteg> - no preference, font default should be used (probably
left meteg with all vowels)
<meteg, CGJ, vowel> - right meteg preferred - or should this last one be
<meteg, ZWNJ, vowel>, considering that ZWNJ will have the same effect as
CGJ of blocking canonical reordering?
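
The reordering behaviour behind that last question can be checked with a short sketch. This is only an illustration using Python's standard unicodedata module; the choice of alef as the base letter and hataf patah as the vowel is my assumption for the example, and the helper name nfd_codepoints is hypothetical. It shows what normalization does to each sequence and says nothing about how a font would render meteg.

```python
import unicodedata

ALEF = "\u05D0"         # base letter (illustrative choice)
HATAF_PATAH = "\u05B2"  # hataf vowel, canonical combining class 12
METEG = "\u05BD"        # canonical combining class 22
CGJ = "\u034F"          # COMBINING GRAPHEME JOINER, combining class 0
ZWNJ = "\u200C"         # ZERO WIDTH NON-JOINER, combining class 0


def nfd_codepoints(text):
    """Return the NFD form of text as a list of U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", text)]


# <meteg, vowel>: nothing separates the marks, so they are reordered.
plain = nfd_codepoints(ALEF + METEG + HATAF_PATAH)

# <meteg, CGJ, vowel> and <meteg, ZWNJ, vowel>: a class-0 character sits
# between the marks, so canonical reordering cannot move them past it.
with_cgj = nfd_codepoints(ALEF + METEG + CGJ + HATAF_PATAH)
with_zwnj = nfd_codepoints(ALEF + METEG + ZWNJ + HATAF_PATAH)

print(plain)
print(with_cgj)
print(with_zwnj)
```

In this sketch the plain sequence comes back with the vowel before meteg, while both the CGJ and the ZWNJ variants keep meteg first, which is the sense in which the two characters have the same blocking effect.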
I have a small concern that at least potentially there might be a need
to promote or inhibit a ligature between combining marks which do not
come together in canonical order. For example, in principle a single
Hebrew base character might be combined with a hataf vowel (ccc 11-13),
dagesh (ccc 21) and meteg (ccc 22). In canonical order the dagesh would
be reordered between the hataf vowel and the meteg, either before or
after ZWJ/ZWNJ, and would interfere with the mechanism. It might be
necessary to code <dagesh, CGJ, hataf vowel, ZW(N)J, meteg> or <hataf
vowel, ZW(N)J, meteg, CGJ, dagesh>. No such combination actually occurs
in the standard text of the Hebrew Bible, but in principle one might be
found in other texts.
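
The interference described above can be sketched in the same way. Again this is only an illustration: bet as the base and hataf segol as the vowel are assumed for the example, and the helper joiner_survives is a hypothetical name. It tests only whether the joiner is still directly between the hataf vowel and meteg after NFD.

```python
import unicodedata

BET = "\u05D1"          # base letter (illustrative choice)
HATAF_SEGOL = "\u05B1"  # hataf vowel, canonical combining class 11
DAGESH = "\u05BC"       # canonical combining class 21
METEG = "\u05BD"        # canonical combining class 22
ZWJ = "\u200D"          # ZERO WIDTH JOINER, combining class 0
CGJ = "\u034F"          # COMBINING GRAPHEME JOINER, combining class 0


def joiner_survives(text, joiner=ZWJ):
    """True if <hataf vowel, joiner, meteg> is still contiguous after NFD."""
    return (HATAF_SEGOL + joiner + METEG) in unicodedata.normalize("NFD", text)


# Dagesh typed after meteg: it is reordered in front of meteg (21 < 22),
# landing between ZWJ and meteg.
dagesh_last = joiner_survives(BET + HATAF_SEGOL + ZWJ + METEG + DAGESH)

# Dagesh typed before the vowel: it is reordered after the vowel (11 < 21),
# landing between the vowel and ZWJ.
dagesh_first = joiner_survives(BET + DAGESH + HATAF_SEGOL + ZWJ + METEG)

# The two CGJ workarounds keep dagesh out of the way.
cgj_before = joiner_survives(BET + DAGESH + CGJ + HATAF_SEGOL + ZWJ + METEG)
cgj_after = joiner_survives(BET + HATAF_SEGOL + ZWJ + METEG + CGJ + DAGESH)

print(dagesh_last, dagesh_first, cgj_before, cgj_after)
```

Without CGJ the dagesh ends up on one side or the other of the joiner, so the vowel, joiner and meteg are no longer contiguous; with CGJ in either of the two positions the triple is preserved.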
At first sight I see no reason, for Hebrew or otherwise, to express a
preference between option A and option B in the review issue.
Please note the following if you wish to make official feedback to the
UTC on this matter.
>If you have comments for official UTC consideration, please post them by
>submitting your comments through our feedback & reporting page:
>If you wish to discuss issues on the Unicode mail list, then please use
>the following link to subscribe (if necessary). Please be aware that
>discussion comments on the Unicode mail list are not automatically recorded
>as input to the UTC. You must use the reporting link above to generate
>comments for UTC consideration.
>Let me take this opportunity also to remind everyone that the closing date
>for comment on several other public review issues is approaching, so if
>you have comments, please try to send them in soon.
>Note: If you are a liaison representative, please forward this message as
>appropriate within your organization.
> Rick McGowan
> Unicode, Inc.
--
Peter Kirk
firstname.lastname@example.org (personal)
email@example.com (work)
http://www.qaya.org/