William Overington wrote:
> Regarding Ken's response to the Byzantine legal codes matter, it would
> appear possible that the way that the ts ligature with a dot above for
> romanization of Cyrillic could be represented in Unicode
> would be by the following sequence.
> t U+FE20 s U+FE21 U+0307
I think that <U+0307> would only apply to <s U+FE21>, not to the whole
sequence <t U+FE20 s U+FE21>.
Using the COMBINING DOUBLE INVERTED BREVE doesn't make things much better:
t U+0361 s U+0307
Still, <U+0361> only applies to <t>, and <U+0307> only applies to <s>.
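To illustrate the point at the character level, here is a minimal Python sketch. It assumes a deliberately simplified model in which every combining mark attaches to the nearest preceding non-mark character. The helper names combining_sequences and describe are made up for this example, and nothing here says how a given font would actually draw the marks.

```python
import unicodedata

def combining_sequences(text):
    """Split text into runs of one base character plus the marks that follow it."""
    runs = []
    for ch in text:
        # General categories Mn, Mc and Me are the combining marks.
        if runs and unicodedata.category(ch).startswith("M"):
            runs[-1] += ch
        else:
            runs.append(ch)
    return runs

def describe(text):
    """Print one line per run, listing the code points it contains."""
    for run in combining_sequences(text):
        print(" + ".join(f"U+{ord(c):04X} {unicodedata.name(c)}" for c in run))

ligature_halves = "t\uFE20s\uFE21\u0307"   # t U+FE20 s U+FE21 U+0307
double_breve = "t\u0361s\u0307"            # t U+0361 s U+0307

describe(ligature_halves)
describe(double_breve)
```

In both cases the dot above ends up in the run that starts with s, never in a run that covers both letters.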
Perhaps a viable approach would be to use the COMBINING GRAPHEME JOINER (to
turn <ts> into a single "grapheme"), and then to use regular combining marks
(as opposed to their "double" counterparts):
t U+034F s U+0311 U+0307
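For reference, the code points in that sequence can be listed with Python's unicodedata module. This is only a sketch for inspecting the characters: code_point_table is a hypothetical helper, and whether a renderer would place the marks over the joined pair as a whole is a question the character data cannot answer.

```python
import unicodedata

cgj_sequence = "t\u034Fs\u0311\u0307"   # t U+034F s U+0311 U+0307

def code_point_table(text):
    """Return (code point, name, canonical combining class) for each character."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c), unicodedata.combining(c))
        for c in text
    ]

for row in code_point_table(cgj_sequence):
    print(row)
```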
> In the recent thread about Byzantine legal codes, the
> following sequences were suggested.
> U+0069 U+0313 U+0301
> U+0055 U+0313
> The second of the above requiring a rendering different from
> what direct reading of the Unicode specification might suggest.
I don't think Unicode really 'specifies' this: it is a glyph issue, and the
details of it are left to the typographers.
The only thing I found about it is on page 27 of the Unicode standard, which
simply states that this behavior may exist in some scripts, and gives an
example with polytonic Greek:
"Prominent characters that show such override behavior are
associated with specific scripts or alphabets. For example, when used with
the Greek script, the "breathing marks" U+0313 COMBINING COMMA ABOVE (psili)
and U+0314 COMBINING REVERSED COMMA ABOVE (dasia) require that, when used
together with a following acute or grave accent, they be rendered
side-by-side above their base letter rather than the accent marks being
stacked above the breathing marks."
But that passage is not really a "specification", so I don't think it needs to be changed.
Of course, it could be enhanced by changing "specific scripts or alphabets"
into "specific scripts, alphabets, or languages". And Nick Nicholas's
cross-alphabet example ("Ulpianus") could be inserted as well.
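A small Python sketch of why the cross-alphabet case is left to fonts: with a Greek base letter, the breathing-plus-accent sequence has a precomposed form that NFC normalization produces, while the Latin sequences quoted above have none and stay as base plus marks. The Greek iota sequence and the helper nfc_code_points are assumptions added for comparison; they do not come from the thread.

```python
import unicodedata

greek_iota = "\u03B9\u0313\u0301"   # Greek small iota + psili + acute (assumed example)
latin_i = "\u0069\u0313\u0301"      # U+0069 U+0313 U+0301, as quoted above
latin_u = "\u0055\u0313"            # U+0055 U+0313, as quoted above

def nfc_code_points(text):
    """Return the code points of the NFC-normalized form of text."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", text)]

print(nfc_code_points(greek_iota))  # single precomposed character
print(nfc_code_points(latin_i))     # unchanged: base letter plus two marks
print(nfc_code_points(latin_u))     # unchanged: base letter plus one mark
```

For the Latin sequences, then, placing the breathing mark beside the accent or before the capital is entirely the renderer's job.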
> I wonder if consideration could please be given as to whether
> this matter should be left unregulated or whether some level
> of regulation should be used.
It seems to me that the well-known motto "Unicode encodes characters, not
glyphs" implicitly includes an answer to this: as far as possible, the
matter should be left unregulated.
This archive was generated by hypermail 2.1.2 : Wed Sep 18 2002 - 05:27:41 EDT