Re: Taiwanese: unicode of o with dot right above

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Aug 15 2000 - 13:04:10 EDT


Kiatgak asked:

> 3 more problems:
>
> 1. Can anyone kindly explain why COMBINING DOT ABOVE RIGHT was turned down?

I just did. Please read back through my explanation. COMBINING DOT ABOVE RIGHT
is a glyph variant of COMBINING DOT ABOVE that has been used in Latin
transcription of Minnanhua to solve the problem of tonal accent placement
on a Latin letter marked with a diacritic above. There is no need for
a separate encoded character -- the existing encoded characters are
sufficient, together with the use of appropriate fonts which show the
ligations correctly for Minnanhua.

>
> Ken said:
> > A new character, COMBINING DOT ABOVE RIGHT, should *NOT* be encoded.
> > It was already proposed two years ago, and has been considered and
> > turned down in the context of the issues raised by Kiatgak.
>
> I am still out of touch with Te Khaisu.
> Why was COMBINING DOT ABOVE RIGHT turned down? Does anyone understand
> the real reason?
> Was it rejected with other precomposed characters together?
> Or as Abdul said: "...'combining dot above' has a similar appearance and
> probably the same function as the proposed 'combining dot right above' ... was
> basis of the rejection."?

Yes, the latter.

>
> Intuitively, the appearances of DOT ABOVE and DOT ABOVE RIGHT are quite
> different (at least for me), aren't they?

This is no different than the need to adjust the position of combining
points for voweled Arabic, for example, depending on what letters they
are placed in proximity to.

Minnanhua fonts simply need to contain the appropriate preformed glyphs
that place the dot correctly with respect to the base form o.

>
> 2. If using O/o + U+0307(COMBINING DOT ABOVE) to represent "O/o with dot
> above right", how to make them have the desired appearance?

Just pick an appropriate font that has the expected glyphs built in
for Minnanhua. That is exactly what Te Khai-su's HOTSYS(r) fonts have
already.
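
As a minimal sketch of the encoding side of this answer, the snippet below
builds the recommended sequence (base letter followed by U+0307) and lists
the character names. Python and its standard unicodedata module are
assumptions made for illustration, and the acute accent is only an example
tone mark. Nothing here positions the dot; that remains the font's job.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, used only as an example tone mark


def describe(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


# Base letter followed by the combining dot, as recommended in the reply.
o_dot = "o" + DOT_ABOVE

# The same letter with a tone accent encoded after the dot.
o_dot_acute = o_dot + ACUTE

print(describe(o_dot))
print(describe(o_dot_acute))

# Canonical combining class of the dot (an "above" mark).
print(unicodedata.combining(DOT_ABOVE))
```

Whether the dot appears directly above or above right is decided by the
glyphs in the selected font, not by anything in this sequence.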

> It is very nice to have normal Unicode encoding. But we want the dot to be
> positioned above and to the right. How?
> In normal environments, it will be displayed directly above, won't it?

Default rendering with default fonts would do so, yes.

>
> So, the best way seems to be to ask font developers to support a
> language-specific behavior. But
> a) getting help from font developers seems very unlikely (because this
> is not a general solution like adding a COMBINING DOT ABOVE RIGHT, just as
> Peter said).
> b) for the lesser-used languages, including Taiwanese etc., there are not
> even language codes.

There is no need for hacking language-specific behavior into the fonts.
This is no different than using a Chinese-designed font to get best
display of unified Han characters for Chinese and a Japanese-designed font
to get best display of unified Han characters for Japanese.

>
> Maybe the final result is that either we have to tolerate the dot in the
> above position or we have to create our own fonts?

Yes. As will users of any other language that expect multiple accents on
Latin letters to be positioned in special ways. This is no different from
the situation for Vietnamese, for example. Best display of Vietnamese
requires a font with ligatures defined for multiple accent combinations.

>
> If we have to create our own fonts, what are the criteria in order to
> conform to Unicode standard? (Available fonts currently are all defined by
> their own special codes.)

Simply use existing font standards and supply Unicode cmaps (the
character-to-glyph mappings) and ligature tables.
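
The idea can be sketched as two lookups: a character map from code points
to glyphs, then a ligature table that replaces a glyph sequence with one
preformed glyph. The Python below is only a toy model under stated
assumptions: it is not a real font format, and every glyph name in it
(for example o_dotaboveright) is hypothetical. A real font would express
the same thing in its own table formats.

```python
# Toy character map: code point -> glyph name (all glyph names hypothetical).
CMAP = {
    0x006F: "o",
    0x004F: "O",
    0x0307: "dotabovecomb",
    0x0301: "acutecomb",
}

# Toy ligature table: glyph sequence -> single preformed glyph (hypothetical).
LIGATURES = {
    ("o", "dotabovecomb", "acutecomb"): "o_dotaboveright_acute",
    ("o", "dotabovecomb"): "o_dotaboveright",
    ("O", "dotabovecomb"): "O_dotaboveright",
}


def shape(text):
    """Map text to glyph names, preferring the longest matching ligature."""
    glyphs = [CMAP.get(ord(ch), ".notdef") for ch in text]
    longest = max(len(seq) for seq in LIGATURES)
    out = []
    i = 0
    while i < len(glyphs):
        # Try the longest candidate sequence first, down to length 2.
        for n in range(longest, 1, -1):
            seq = tuple(glyphs[i:i + n])
            if seq in LIGATURES:
                out.append(LIGATURES[seq])
                i += n
                break
        else:
            # No ligature starts here; keep the single glyph.
            out.append(glyphs[i])
            i += 1
    return out


print(shape("o\u0307"))
print(shape("o\u0307\u0301"))
```

With tables like these, the encoded text stays o + U+0307 and only the font
decides that the displayed dot sits above right.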

>
> 3. Does "Bopomofo Extended" (U+31A0~U+31B7) violate the principle that
> encoded characters should be in use?
>
> These symbols have almost never been used since they were invented by some
> scholars. Most people use Latin-based Taiwanese, and most data are written
> in Latin-based Taiwanese.

There are publications containing them, examples of which were presented
in the supporting documentation supplied by TCA to WG2. WG2 accepted
that documentation and encoded the characters in 10646.

> Why can "Bopomofo Extended" be accepted by Unicode Standard? It seems not
> very reasonable. It seems also unfair to those lesser-used languages whose
> symbols are not defined by government organizations, doesn't it?
>

WG2 and the UTC are not in the business of mediating between competing
groups of people who are pressing for alternate orthographies (and
who may or may not have associated political agendas). The Unicode
Standard has encoded all the characters required for accurate representation
of Minnanhua (or Hakka, for that matter) using Latin-based orthographies,
Han characters, or extended Bopomofo. At this point people in Taiwan
can get on with their implementations and argue among themselves
regarding which orthography or orthographies they wish to support,
teach, and/or standardize for those languages.

--Ken


