From: Mark Davis (firstname.lastname@example.org)
Date: Sun Nov 09 2003 - 14:11:58 EST
Let's try to be clear on the terms.
Look at the definition of combining sequences:
D17 Combining character sequence: A character sequence consisting of either a
base character followed by a sequence of one or more combining characters, or a
sequence of one or more combining characters.
Thus a combining character sequence *cannot* contain a ZWJ or any other Cf.
Any use of a ZWJ before a combining mark produces a *defective* combining
character sequence (D17a), which isolates the combining mark from any preceding
base character.
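To make the definition concrete, here is a rough Python sketch of D17 and D17a exactly as worded above (not any later revision of the definition). The function names are made up for illustration, and is_base() is only an approximation of "base character" built from General_Category values; that approximation is an assumption of the sketch, not the standard's wording.

```python
import unicodedata

def is_combining(ch):
    """True for combining marks (General_Category Mn, Mc or Me)."""
    return unicodedata.category(ch).startswith("M")

def is_base(ch):
    """Rough stand-in for 'base character' (an assumption of this sketch):
    not a combining mark, not in any C* category (control, format,
    surrogate, private use, unassigned), and not a line or paragraph
    separator."""
    cat = unicodedata.category(ch)
    return not cat.startswith(("M", "C")) and cat not in ("Zl", "Zp")

def is_combining_character_sequence(seq):
    """D17 as quoted: a base followed by one or more combining characters,
    or one or more combining characters on their own."""
    if not seq:
        return False
    if all(is_combining(ch) for ch in seq):
        return True
    return (len(seq) > 1 and is_base(seq[0])
            and all(is_combining(ch) for ch in seq[1:]))

def is_defective(seq):
    """D17a: a combining character sequence that does not start with a base."""
    return is_combining_character_sequence(seq) and not is_base(seq[0])

ZWJ = "\u200D"  # ZERO WIDTH JOINER, General_Category Cf

print(is_combining_character_sequence("e\u0301"))        # True
print(is_combining_character_sequence("e" + ZWJ + "\u0301"))  # False
print(is_defective("\u0301"))                            # True
```

The second call is the case under discussion: with a ZWJ between the base and the mark, the three characters no longer form one combining character sequence, and the mark on its own is a defective one.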
And as I said earlier:
> - *Default* grapheme clusters do not include ZWJ; as a matter of fact, default
> grapheme clusters, except for Hangul Jamo Syllables and a few exceptional cases,
> are identical with combining sequences.
> - *Tailored* grapheme clusters may include longer sequences, but it is not at
> all obvious whether they would ever contain ZWJ or ZWNJ.
I'll expand on the latter. What constitutes a tailored grapheme cluster is up to
a particular process, and so one could contain a ZWJ. However, any combining
mark after a ZWJ does *not* apply to a previous base character within that
tailored grapheme cluster, so the use of a ZWJ would isolate that combining
mark. Such a sequence would not correspond to anything used in a natural language.
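The isolating effect can be seen by cutting a string into runs in which each combining mark attaches to what precedes it only when that is a base or another mark. This is a minimal sketch under that assumption: split_sequences() is a hypothetical helper, the category test for "can carry a mark" is an approximation, and the Hebrew characters are picked arbitrarily for illustration.

```python
import unicodedata

def split_sequences(text):
    """Split text into runs: a base plus its following marks, a run of
    marks with nothing to attach to (defective), or a lone other character."""
    segments = []
    can_attach = False  # may the next combining mark join the last segment?
    for ch in text:
        cat = unicodedata.category(ch)
        if cat.startswith("M"):
            if can_attach:
                segments[-1] += ch   # mark applies to the preceding run
            else:
                segments.append(ch)  # isolated mark starts a defective run
            can_attach = True
        else:
            segments.append(ch)
            # Assumption: C* characters (including Cf such as ZWJ) and
            # line/paragraph separators cannot carry a following mark.
            can_attach = not cat.startswith("C") and cat not in ("Zl", "Zp")
    return segments

ZWJ = "\u200D"
LAMED, HIRIQ, QAMATS = "\u05DC", "\u05B4", "\u05B8"

# Without ZWJ both marks belong to the lamed: one segment.
print(len(split_sequences(LAMED + HIRIQ + QAMATS)))        # 1
# With ZWJ the second mark is cut off from the base: three segments.
print(len(split_sequences(LAMED + HIRIQ + ZWJ + QAMATS)))  # 3
```

Whatever a tailored grapheme cluster is taken to be, the last segment in the second case is a mark with no base of its own.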
► शिष्यादिच्छेत्पराजयम् ◄
----- Original Message -----
From: "Peter Kirk" <email@example.com>
To: "Mark Davis" <firstname.lastname@example.org>
Cc: "Unicode List" <email@example.com>
Sent: Sun, 2003 Nov 09 09:19
Subject: Re: ZWJ, ZWNJ, CGJ and combination
> On 08/11/2003 17:09, Mark Davis wrote:
> >I agree with the first part of your analysis. By the phrase "requesting
> >of combining characters" it is unclear to me what you mean, and whether that is
> >the right solution to whatever problem you are referring to.
> >► शिष्यादिच्छेत्पराजयम् ◄
> A further reply to this one:
> On the bidi list Paul Nelson pointed out that in Khmer ZWJ and ZWNJ do
> not break combining sequences; or at least they do not break grapheme
> clusters, which is not quite the same thing. And the same may be true of
> Indic scripts, although in the examples I found ZWJ/ZWNJ is always at
> the end of a combining sequence. Are ZWJ and ZWNJ actually used within
> combining character sequences (or what would be such sequences if not
> technically broken)? Is there some tension here with the general
> definition of combining character sequences?
> If Khmer really does do this, and unless there are any real objections
> to this practice, perhaps the best way ahead, rather than defining a new
> COMBINING CHARACTER JOINER and changing the Khmer encoding, is to adjust
> the definition of combining character sequences to allow ZWJ, ZWNJ and
> perhaps some other suitable layout control characters to be included
> within such sequences. This would allow the Hebrew issue to be solved in
> a way analogous to the Khmer issue.
> Peter Kirk
> firstname.lastname@example.org (personal)
> email@example.com (work)