Re: Representation of neutral tone in pinyin and bopomofo

From: Stephan Stiller <stephan.stiller_at_gmail.com>
Date: Fri, 22 Nov 2013 08:41:04 -0800

Hi Eric,

[We met at the UTC meeting.]

I.
> Is it correct that: in bopomofo, the neutral (or light) tone is
> represented by U+02D9 ˙ DOT ABOVE, and in the text representation,
> that character follows the bopomofo characters of the syllable (just
> like all the other characters for tones)

1. Given the document 國語注音符號手冊
     https://www.dropbox.com/s/8g73e3z4b0mc8vt/mandarin_zhuyinfuhou_handbook.pdf ,
which Bobby Tung (via Koji Ishii) told us about, a dot above (U+02D9)
wouldn't work. Basically, the placement of the neutral-tone tone mark in
bopomofo is centered on top of the syllable for vertical bopomofo and
/centered/ to the left of the entire syllable for horizontal bopomofo.

2. Note that bopomofo can occur without characters in schoolchildren's
texts (as vertical bopomofo stacks in ordinary LTR writing) or
interspersed with characters (as vertical stacks if possible, but that's
not absolute) to represent Taiwanese character-less morphemes (or to
indicate that a reading in the Taiwanese language – as opposed to
Taiwanese Mandarin – is preferred).

3. Which character "should" be used is very much unclear, but MS's
bopomofo input produces
     ˊ (U+02CA)
     ˇ (U+02C7)
     ˋ (U+02CB)
     ˙ (U+02D9)
as I'm sure you've checked. To me it would seem intuitive to use
     · (U+00B7),
but the truth seems to be that it's underspecified. Basically, it's for
you/us to decide.
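
For reference, the candidates listed above can be looked up in the Unicode
Character Database. The following is only a minimal sketch using Python's
standard unicodedata module; the list CANDIDATES and the helper describe()
are illustrative names, not part of any input method or specification.

```python
import unicodedata

# The four tone marks produced by MS's bopomofo input, plus the
# middle dot suggested as an alternative for the neutral tone.
CANDIDATES = ["\u02ca", "\u02c7", "\u02cb", "\u02d9", "\u00b7"]


def describe(ch: str) -> str:
    """Return the code point and formal Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in CANDIDATES:
    print(describe(ch))
```

The formal names show that none of these characters is defined specifically
as a bopomofo tone mark, which fits the observation that the choice is
underspecified.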

II.
> [Is it correct that:] in pinyin, the neutral tone is typically not
> marked, but it may be marked. When that's the case, U+02D9 ˙ DOT ABOVE
> is used.
1. No. In pinyin the neutral tone has traditionally never been marked
(read: absence of a tone mark means "neutral tone" unless it's clear
you're dealing with toneless pinyin), except that, as far as I know, at
least 现代汉语词典 (Xiàndài Hànyǔ Cídiǎn) has been using an obligatory
dot before the syllable for all neutral tones. The reason seems to be to
draw more visual attention to them and to make possible a notation for an
optional neutral tone, as Charlie Ruland pointed out (but more on that
below).

2. The way it is rendered in Xiàndài Hànyǔ Cídiǎn reveals a lack of
typographic skill. At least the 5th and the 6th (= the latest) editions
set the dot so that it sits visually a tad closer to the letter on its
left, with the space on its right a tad too wide. Think
     zhuō · zi (zhuō␣·␣␣zi)
but less pronounced. In Charlie Ruland's link to GB/T 16159-2012
(汉语拼音正词法基本规则)
     http://www.lshk.org/sites/default/files/zhengcifa_0.pdf ,
the spaces seem almost equal (I think they still aren't quite) but are
unseemly wide. Given that the dot belongs semantically to the syllable on
its right, this can't be the way it should be.

3. Now about the choice of scalar value. As with bopomofo, it seems
underspecified. As the context is not katakana (・ (U+30FB)) and the
character itself is semantically nothing like a bullet (• (U+2022))
(also, I'd expect it to be smaller than a typical bullet), I'd pick
· (U+00B7). An earlier version of MS's pinyin input method produced
U+00B7 when the user typed "@". I haven't checked many input methods
lately, but what they produce should be very easy to find out. The reason
they have it is that a centered dot is used in Chinese to separate given
name and surname in certain non-Chinese name transcriptions. For
example, "Bill Gates" is
     比尔·盖茨 (bǐ'ěr gàicí),
strangely without a visual separator on the far left or far right.

4. Finally, some comments about what Charlie wrote:

> Rule 7.3 of GB/T 16159-2012
> <http://www.lshk.org/sites/default/files/zhengcifa_0.pdf> stipulates
> that a preceding dot (probably U+00B7 or U+2022) be used to indicate
> neutral tone in dictionaries, as had been common practice among many
> dictionary makers anyway.
4.1. This is not correct. The text states that dictionary-like materials
/may/ (可) mark a neutral tone as such, but the implication is that they
don't have to. On all preceding pages, the same document contains only
unmarked neutral tones. This has always been the default, and I would
assume that it will remain so.

> When there is alternation between neutral and another tone two tone
> marks may be used simultaneously, as in /zhī·dào/ (知道).
4.2. Yes. But there's a problem with their notation. Xiàndài Hànyǔ
Cídiǎn claims that optional neutral tones are by default neutral tones
and sometimes full tones. It's indeed true that for most syllables with
such variation the neutral tone is the majority pronunciation. Their
example is wrong though: the majority pronunciation of 知道 is zhīdào
(with a fourth tone). The way they explain their notation, there is no
way to indicate an "optional neutral tone" whose default pronunciation
has a full tone. Also, as by-speaker free phoneme variation of this type
is rare (in languages in general), such notation would serve mainly to
save space in a dictionary; so as a lexicographer I'd just list zhīdào
and zhīdao side by side. Btw it's telling that in item 6.1.9.2 they have
zhīdào and not zhīdao; the latter pronunciation has got to be
exceedingly rare, as I don't recall hearing it.

III.
> When U+02D9 is used in pinyin, where it is in the character sequence?
> before the syllable to which it applies (where it is displayed) or
> after (like in bopomofo)?
1. I think that listing a neutral-tone dot after the syllable it applies
to would just introduce huge reordering headaches. Would you really want
to think about how this interacts with the apostrophe (actually: right
single quotation mark, if their (bad) typesetting practice is to be
imitated) and hyphenation? Think about contrasting sequences like
dōng·xi'ān /vs/ dōng·xiān. If you put the dot logically after the
syllable, it has to be after "xi" in the former example and after "xiān"
in the latter example. So you have to do backtracking, check for an
apostrophe, and make sure nothing goes wrong in the interaction with
line breaking (cf. item 6.6.2 in that document); better not to have to
think about this sort of thing. But, yes, there's some trickery with
sorting to figure out (Wenlin, which I'm contributing to, has something
to say about this <http://wenlin.com/pysort>; perhaps ask Tom Bishop
directly for the latest info). Then again, we're lucky in that outside of
those dictionaries that explicitly choose to use a centered dot for a
neutral tone (Wenlin doesn't use such notation, for example), the dot is
essentially not used in such a way.
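
To make the reordering problem concrete, here is a rough sketch of what
converting from "dot before the syllable" to "dot after the syllable" would
involve. It assumes U+00B7 as the dot and treats only explicit marks
(apostrophe, right single quotation mark, another dot, hyphen, whitespace)
as syllable boundaries; the function name prefix_to_suffix is hypothetical.
Real text would also need a proper pinyin syllable segmenter, which is part
of the headache.

```python
import re

MIDDLE_DOT = "\u00b7"

# Assumption: only explicit marks end a syllable here (apostrophe,
# right single quotation mark, another dot, hyphen, whitespace).
BOUNDARY = re.compile(r"['\u2019\u00b7\-\s]")


def prefix_to_suffix(word: str) -> str:
    """Move each dot from before its syllable to after that syllable."""
    out = []
    i = 0
    while i < len(word):
        if word[i] == MIDDLE_DOT:
            # Look ahead for the end of the dotted syllable.
            match = BOUNDARY.search(word, i + 1)
            end = match.start() if match else len(word)
            out.append(word[i + 1:end] + MIDDLE_DOT)
            i = end
        else:
            out.append(word[i])
            i += 1
    return "".join(out)


print(prefix_to_suffix("dōng·xi'ān"))
print(prefix_to_suffix("dōng·xiān"))
```

In the first input the dot ends up after "xi", in the second after "xiān".
The target position depends on whether an apostrophe follows, whereas with
the dot stored before the syllable no such look-ahead is needed.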

> When U+02D9 is used in bopomofo, it needs to be displayed before the
> syllable. Is the display position simply "before the nearest preceding
> character from the set {U+3105 ㄅ BOPOMOFO LETTER B ... U+3119 ㄙ
> BOPOMOFO LETTER S, U+31A0 ㆠ BOPOMOFO LETTER BU ... U+31A3 ㆣ BOPOMOFO
> LETTER GU}"?
2. That's answered by the 國語注音符號手冊 document. I would be
disinclined to put the dot last in logical order, but I assume you know
what you're doing. In any case, it's not "before the nearest preceding
[bopomofo] character" but before the first element of the syllable. Also
remember that
     ㄗㄚ (zā)
is different from
     ㄗ ㄚ (zī ā).
I forget whether there are other such ambiguities dependent on the
presence of a syllable break (which I assume ought to be indicated by a
space – and if not, there's a problem ...).
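
The difference between "before the nearest preceding character" and "before
the first element of the syllable" can be sketched as follows. This assumes,
purely for illustration, that U+02D9 is stored last in logical order and
that syllables are separated by a single space; display_order is a
hypothetical helper, and it ignores the vertical and centered placement
described in the handbook.

```python
NEUTRAL = "\u02d9"  # DOT ABOVE, assumed to be stored after the syllable


def display_order(text: str) -> str:
    """Move a trailing neutral-tone dot to the front of its syllable.

    Assumes syllables are separated by single spaces.
    """
    shown = []
    for syllable in text.split(" "):
        if syllable.endswith(NEUTRAL) and len(syllable) > 1:
            syllable = NEUTRAL + syllable[:-1]
        shown.append(syllable)
    return " ".join(shown)


# 東西 (dōngxi): the dot goes before ㄒ, the first element of the
# syllable, not before ㄧ, the nearest preceding bopomofo letter.
print(display_order("ㄉㄨㄥ ㄒㄧ\u02d9"))
# A syllable without an initial consonant still takes the dot in front.
print(display_order("ㄚ\u02d9"))
```

Without an explicit syllable break the helper could not tell where a
syllable starts, which is the same ambiguity as ㄗㄚ versus ㄗ ㄚ above.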

IV.
Btw, the Taiwanese-invented 通用拼音 (Tongyong Pinyin) ["pinyin"
normally means "Hanyu Pinyin"; now they use Hanyu Pinyin in TW too] used
a small circle on top of the syllable's core vowel for the neutral tone.

Stephan
Received on Fri Nov 22 2013 - 10:44:30 CST
