Re: Standardized variation sequences for the Deseret alphabet?

From: Martin J. Dürst
Date: Mon, 27 Mar 2017 16:05:12 +0900

On 2017/03/27 01:20, Michael Everson wrote:
> On 26 Mar 2017, at 16:45, Asmus Freytag wrote:

> Consider 2EBC ⺼ CJK RADICAL MEAT and 2E9D ⺝ CJK RADICAL MOON which are apparently really supposed to have identical glyphs, though we use an old-fashioned style in the charts for the former. (Yes, I am of course aware that there are other reasons for distinguishing these, but as far as glyphs go, even our standard distinguishes them artificially.)

"apparently", maybe. Let's for a moment leave aside the radicals
themselves, which are to a large extent artificial constructs. Let's
look at the actual characters with these radicals (e.g. U+6709,... for
MOON and U+808A,... for MEAT), in the multi-column code charts of ISO
10646. There are some exceptions, but in most cases, the G/J/K columns
show no difference (i.e. always the ⺝ shape, with two horizontal bars),
whereas the H/T/V columns show the ⺼ shape (two downwards slanted bars)
for the "MEAT" radical and the ⺝ shape for the moon radical. So whether
these radicals have identical glyphs depends on typographic
tradition/font/... In Japan, many people may be rather unaware of the
difference, whereas in Taiwan, it may be that school children get
drilled on the difference.
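
As a small illustration that the two radicals are separate code points no matter how a font draws them, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `describe` is made up for this illustration, and the shapes actually displayed depend entirely on the font in use.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

MEAT = "\u2ebc"  # CJK RADICAL MEAT
MOON = "\u2e9d"  # CJK RADICAL MOON

for ch in (MEAT, MOON):
    print(describe(ch))

# The code points differ even where a font gives both the same glyph.
print(MEAT != MOON)
```

Nothing in the encoded text records which typographic tradition applies; that choice is left to the font.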

> One practical consequence of changing the chart glyphs now, for instance, would be that it would invalidate every existing Deseret font. Adding new characters would not.

Independent of whether the chart glyphs get changed, couldn't we just
add a note "also # in some fonts" (where # is the other variant)? That
would make sure that nobody could claim "this font is wrong" based on
the charts. (Even if a general claim that the chart glyphs aren't
normative applies to all charts anyway.)

>> In fact, it would seem that if a Deseret text was encoded in one of the two systems, changing to a different font would have the attractive property of preserving the content of the text (while not preserving the appearance).
> Changing to a different font in order to change one or two glyphs is a mechanism that we have actually rejected many times in the past. We have encoded variant and alternate characters for many scripts.

Well, yes, rejected many times in cases where that was appropriate. But
also accepted many times, in cases that we may not even remember,
because the decisions may never have been made explicitly. In such
cases, the focus may not be on a change to one or a few letter shapes,
but on a change of the overall style, which induces a
change of shape in some letters. The roman/italic a/ɑ and g/ɡ
distinctions (the latter code points only used to show the distinction in
plain text, which could as well be done descriptively), as well as a
large number of distinctions in Han fonts, come to my mind. I'm quite
sure other scripts have similar phenomena.
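
To make the a/ɑ and g/ɡ example concrete, the sketch below lists the character names of each pair. It assumes the second member of each pair is U+0251 and U+0261 respectively; the helper `names` is hypothetical and exists only for this illustration.

```python
import unicodedata

# (ordinary letter, separately encoded shape variant)
PAIRS = [("a", "\u0251"), ("g", "\u0261")]

def names(pair):
    """Return the Unicode character names of both members of a pair."""
    return tuple(unicodedata.name(c) for c in pair)

for pair in PAIRS:
    print(names(pair))
```

In running text set in italics, the ordinary letters a and g usually take on the second shape through the font alone, with no change to the encoded characters.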

>> This, in a nutshell, is the criterion for making something a font difference vs. an encoding distinction.
> Character identity is not defined by any single criterion. Moreover, in Deseret, it is not the case that all texts which contain the diphthong /juː/ or /ɔɪ/ write it using EW 𐐧 or OI 𐐦. Many write them as Y + U 𐐏𐐋 and O + I 𐐄𐐆. So the choice is one of *spelling*, and spelling has always been a primary criterion for such decisions.

This is interesting information. You are saying that in actual practice,
there is a choice between writing 𐐏𐐋 (two letters for a diphthong) and
writing 𐐧. In the same location, is 𐐆𐐋 (the base for the historically
later shape variant of 𐐧; please note that this may actually be written
𐐋𐐆; there's some inconsistency in order between the above cited
sentence and the text below copied from an earlier mail) also used as a
spelling variant? Overall, we may have up to four variants, of which
three are currently explicitly supported in Unicode. Are all of these
used as spelling variants? Is the choice of variant up to the author
(for which variants), or is it the editor or printer who makes the
choice (for which variants)? And what informs this choice? If we have
any historic metal types, are there examples where a font contains both
ligature variants?

(Please note that because 𐐄, 𐐆, and 𐐋 are available as individual
letters, it's very difficult to think about the two-letter sequences as
anything other than spellings, but that doesn't necessarily carry over to
the ligatures.)
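
One way to gather evidence on how common each spelling is would be to count occurrences in transcribed texts. The following is only a rough sketch: the function name `count_spellings` is made up, and it counts every adjacent Y + U or O + I pair, whether or not that pair actually represents the diphthong in a given word.

```python
EW = "\U00010427"                        # DESERET CAPITAL LETTER EW
OI = "\U00010426"                        # DESERET CAPITAL LETTER OI
YEE_SHORT_OO = "\U0001040F\U0001040B"    # Y + U, two letters
LONG_O_SHORT_I = "\U00010404\U00010406"  # O + I, two letters

def count_spellings(text: str) -> dict:
    """Count single-letter and two-letter spellings, ignoring case."""
    folded = text.upper()  # map Deseret small letters to capitals
    return {
        "EW": folded.count(EW),
        "Y+U": folded.count(YEE_SHORT_OO),
        "OI": folded.count(OI),
        "O+I": folded.count(LONG_O_SHORT_I),
    }

# Capital EW, Y + U, capital OI, then small EW (U+1044F).
sample = EW + YEE_SHORT_OO + OI + "\U0001044F"
print(count_spellings(sample))
```

Such counts cannot tell the 1855 and 1859 ligature shapes apart, because both shapes share the same code point today; that would have to come from inspecting the printed sources.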

And then the same questions, with parallel (or not parallel) answers,
for ɒɪ/ɔɪ/𐐦.

Regards, Martin.

Text copied from earlier mail by Michael:

1. The 1855 glyph for 𐐧 EW is evidently a ligature of the diagonal
stroke of the glyph for 𐐆 SHORT I [ɪ] and the glyph for 𐐅 LONG OO [uː],
that is, [ɪ] + [uː] = [ɪuː], that is, [ju].

2. The 1855 glyph for 𐐦 OI is evidently a ligature of the glyph for 𐐉
SHORT AH [ɒ] and the diagonal stroke of the glyph for 𐐆 SHORT I [ɪ],
that is, [ɒ] + [ɪ] = [ɒɪ], that is, [ɔɪ].

That’s encoded. Now evidently, the glyphs for the 1859 substitutions are
as follows:

1. The 1859 glyph for EW is evidently a ligature of the diagonal
stroke of the glyph for 𐐆 SHORT I [ɪ] and the glyph for 𐐋 SHORT OO [ʊ],
that is, [ɪ] + [ʊ] = [ɪʊ], that is, [ju].

2. The 1859 glyph for OI is evidently a ligature of the glyph for 𐐃
LONG AH [ɔː] and the diagonal stroke of the glyph for SHORT I [ɪ], that
is, [ɔː] + [ɪ] = [ɔːɪ], that is, [ɔɪ].
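
The four analyses above can be summarised as a small lookup table. This is only an illustrative sketch: the names `COMPONENTS` and `describe` are made up here, and the table records the component letters as described in the list, not any encoded decomposition.

```python
import unicodedata

# (ligature, year of the glyph) -> component letters, per the list above
COMPONENTS = {
    ("EW", 1855): ("\U00010406", "\U00010405"),  # SHORT I + LONG OO
    ("OI", 1855): ("\U00010409", "\U00010406"),  # SHORT AH + SHORT I
    ("EW", 1859): ("\U00010406", "\U0001040B"),  # SHORT I + SHORT OO
    ("OI", 1859): ("\U00010403", "\U00010406"),  # LONG AH + SHORT I
}

def describe(ligature: str, year: int) -> str:
    """Return the component letter names joined by ' + '."""
    prefix = "DESERET CAPITAL LETTER "
    parts = COMPONENTS[(ligature, year)]
    return " + ".join(unicodedata.name(c).replace(prefix, "") for c in parts)

for key in COMPONENTS:
    print(key, describe(*key))
```

Only the 1855 shapes correspond to the chart glyphs of the encoded letters EW and OI; the 1859 rows describe shapes with no separate representation so far.
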
Received on Mon Mar 27 2017 - 02:06:10 CDT
