Re: Taiwanese: unicode of o with dot right above

From: Abdul Malik (AbdulMalik@btinternet.com)
Date: Sat Aug 12 2000 - 04:45:52 EDT


Here are my thoughts on the matter:

Kiatgak wrote:

> I can not find the unicodes of the 2 base characters:
> "O/o with a dot right above",

> what unicodes should be used to represent them?

> 1. U+0186/U+0254 (LATIN CAPITAL/SMALL LETTER OPEN O)
> with alternative form in font design.

No. This solution could possibly be used if "O/o with dot to the right and
above" were considered to be the Taiwanese glyph variant of O/o, but that
seems not to be the case.

> 2. U+004F/U+006F(O/o) + U+00B7(MIDDLE DOT)
> with the GSUB to fix the outlooks in font design.

[GSUB / GPOS to create the desired appearance]

No. Middle dot is a *middle* dot, not an 'upper' dot.

> Is it a valid sequence if a combining character follows them, eg.
> U+004F/U+006F(O/o) + U+00B7(MIDDLE DOT) + U+0301(COMBINING ACUTE
> ACCENT)

A combining character should really be used for the dot. Using a combining
character will ensure that this sequence is counted as a single character
unit.
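
As a rough illustration of the difference, the sketch below looks up the
character properties with Python's standard unicodedata module. The helper
name describe() is made up for this example, and the values come from
whatever Unicode version the Python build ships, which is an assumption
relative to the version of the standard discussed here.

```python
import unicodedata

def describe(ch):
    """Return (name, general category, canonical combining class) for one character."""
    return (unicodedata.name(ch), unicodedata.category(ch), unicodedata.combining(ch))

# MIDDLE DOT is punctuation with combining class 0 (it does not combine);
# COMBINING DOT ABOVE and COMBINING ACUTE ACCENT are non-spacing marks.
for ch in ("\u00b7", "\u0307", "\u0301"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

A combining class of 0 together with a non-mark category means a renderer
has no reason to attach the dot to the preceding O/o.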

> 3. U+004F/U+006F(O/o) + U+05C1(HEBREW POINT SHIN DOT).
> To use it is only based on the outlook.

[Its appearance would be correct]

> One more serious problem: is a glyph with 2 scripts (Latin and Hebrew)
> allowed in unicode?
> Is it allow in Truetype?

There are problems in mixing characters from different scripts to create one
new character. Applications will probably be looking at the Unicode range of
characters to apply appropriate handling. Hebrew is written from right to
left so who knows what may happen in this case. Also, the use of the shin
dot is restricted, as mentioned in the standard (p. 186).

> 4. U+004F/U+006F(O/o) + U+031B (COMBINING HORN) or precomposed ones
> U+01A0/U+01A1(LATIN CAPITAL/SMALL LETTER O WITH HORN).

Hmm... see 1.

> 5. To apply a new combining character.

I'm surprised you didn't suggest the two solutions that came to my mind:

1. U+004F(O) / U+006F(o) + U+02D9 (˙) -> O˙ , o˙ (DOT ABOVE)

2. U+004F(O) / U+006F(o) + U+0307 (̇) -> Ȯ , ȯ (COMBINING DOT ABOVE)

Solution 1 gives the desired appearance due to the non-combining effect of
the dot, but has the disadvantage of not being a combining character.

Solution 2 doesn't give the desired appearance by default, but does use a
combining character.

Now, using solution 2, an application that is told this is Taiwanese text
could, with an appropriate font, produce the correct image (using GSUB
etc.).
So, failing any new proposal, this seems to be an answer.
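
The practical difference between the two solutions can be sketched with
Python's unicodedata module. The constant names and nfc_length() are
invented for this example, and the results depend on the Unicode data
bundled with the Python build, so treat it as an illustration rather than
a statement about any particular version of the standard.

```python
import unicodedata

SPACING = "o\u02d9"    # solution 1: o + DOT ABOVE (a spacing character)
COMBINING = "o\u0307"  # solution 2: o + COMBINING DOT ABOVE

def nfc_length(text):
    """Number of code points left after canonical composition (NFC)."""
    return len(unicodedata.normalize("NFC", text))

print(nfc_length(SPACING))    # stays as two separate characters
print(nfc_length(COMBINING))  # composes with the base letter
# A further combining mark, such as an acute accent, still follows the
# composed letter as part of the same sequence.
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", "o\u0307\u0301")])
```

With solution 2 the dot travels with its base letter through normalization,
which is what lets a font substitute a Taiwanese-specific glyph for the
whole sequence.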

In the proposal submitted by Michael in 1997, it is mentioned that the
'combining dot above' has a similar appearance and probably the same
function as the proposed 'combining dot right above', so I assume that the
fact that my solution 2 could be used was the basis of the rejection.

> It is a long long way to go (and maybe there is no end).
> In fact, Te Khai-su and Michael Everson had applied on 1997-06-22, but
> their proposal
> was rejected(or withdrawn). But that proposal inquires many precomposed
> characters.
> If apply only a new combining character, will it be accepted?

If the semantic difference between 'combining dot right above' and
'combining dot above' can be proved (I think it can), then I don't see why a
new combining character should not be proposed.

Abdul


