From: Asmus Freytag (firstname.lastname@example.org)
Date: Wed Jun 14 2006 - 18:22:43 CDT
On 6/14/2006 1:56 PM, Richard Wordingham wrote:
> Philippe Verdy wrote on Wednesday, June 14, 2006 at 12:56 PM
>> Regarding the i-shaped "Haken" phonetic diacritics included in the
>> PDF (for the "hline Offen" and "überoffen" vowel qualifiers), I see
>> them like simple or double dotless i subscripts (their forms are very
>> similar to the form of the small i letter under which they are drawn,
>> except that they just lack the top arm, but the resolution of the
>> bitmap is insufficient to really decide) which may merit encoding...
> Are we attempting an exact reproduction of the glyphs, or are we
> looking for the correct encoding of texts?
What we are looking for is to correctly reflect the text. If there are
different *conventions* in writing down a concept, it's not correct to
say "oh they mean the same thing, give them the same code point".
However, if there are different visual styles for the same symbol, then
we do unify.
An example of the latter is the use of inclined vs. upright integral
signs. The two are the same symbol (integral sign), so the style is
relegated to the font.
In regular expression syntax, you can find both ^ and ~ used to negate a
character class, as in
[~a] or [^a] (anything but 'a'). These are two different conventions
for the same concept, but they are using two different symbols. It's not
correct in that case to unify ~ and ^ into a single character.
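To make the point concrete, here is a minimal sketch in Python. It assumes Python's own `re` dialect, which follows the ^ convention only; the variable names are just for illustration. Because ^ and ~ are separate characters, a regex engine can give one of them the negation role and treat the other as an ordinary literal.

```python
import re
import unicodedata

# The two symbols are separate code points with separate character names.
for ch in "^~":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# In Python's re syntax, a leading ^ negates the character class ...
caret_matches = re.findall(r"[^a]", "a~b")

# ... while ~ has no special meaning there and is matched literally,
# so [~a] means "a tilde or an 'a'", not "anything but 'a'".
tilde_matches = re.findall(r"[~a]", "a~b")

print(caret_matches)
print(tilde_matches)
```

Had the two been unified into a single character, a dialect could not tell the negation operator apart from the literal.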
I suspect that for phonetics, sometimes there's a common symbol with
different typographical style, and sometimes there is the use of a
different symbol. I'm not knowledgeable enough in that discipline to
help decide the particular question, but in listening to arguments pro
or con it helps me when the proponents are aware of the distinction I've
drawn above and can directly address it.
When there's a doubt whether it's two styles of the same symbol or two
symbols used for the same concept, Unicode has often preferred to err on
the side of allowing separate code points for dissimilar looking
symbols. This allows for the possibility that something that looks
different can be assigned different meanings in some other notation or
I'm not sure whether, in this context, I find the following
argumentation ultimately compelling:
> The hooks have the semantic of U+031C COMBINING LEFT HALF RING BELOW,
> i.e. more open pronunciation. One needs a very good reason to encode
> them as anything else. In particular, you need to be sure that they
> are not simple 'squiggle below'. The diacritic for openness is very
> variable - in Yoruba it can be a vertical line, or even, through the
> absence of a background in phonetics, a mere dot.
It does not address the question of whether these differences are merely
font styles or instead reflect different notational conventions.
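For reference, the alternatives named in the quoted paragraph (half ring, vertical line, dot) already exist as separate combining marks in Unicode. The sketch below, using Python's standard `unicodedata` module, simply lists them and checks that normalization keeps them apart. The base letter "e" and the variable names are assumptions chosen for illustration, and the result says nothing about how the hooks themselves should be encoded.

```python
import unicodedata

base = "e"  # illustrative base letter, not taken from the PDF under discussion
marks = ["\u031C", "\u0329", "\u0323"]

# Character names as recorded in the Unicode Character Database.
names = [unicodedata.name(m) for m in marks]
for m, name in zip(marks, names):
    print(f"U+{ord(m):04X} {name}")

# Normalization does not fold the three sequences together: each mark
# stays a distinct encoding even when all are used to signal openness.
sequences = [unicodedata.normalize("NFC", base + m) for m in marks]
all_distinct = len(set(sequences)) == len(marks)
print(all_distinct)
```

So text that uses one of these marks cannot be silently converted into text that uses another; the choice between them is made in the encoding, not in the font.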
There are a couple of cases where in mathematics, continental European
notations actually use different symbols from American style. (And
usages also shift over time.)