>F-hook should also be representable with U+0066 LATIN SMALL
>LETTER F followed by (I'm guessing here) U+0321 COMBINING
>PALATALIZED HOOK BELOW. [If I'm wrong, then that'd explain why
>f-hook is separately encoded and doesn't have a canonical
>decomposition.] I wonder if you'd get a different glyph from
>the florin sign with some fonts by using this combination
>instead of U+0192.
U+0192 doesn't have a decomposition, as you've noted, and so
U+0066 U+0321 *cannot* be used to represent this.
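
The point above can be checked against the character database. What follows is only a minimal sketch using Python's standard unicodedata module; the helper name canonically_equivalent is made up for illustration, and the results depend on the version of the Unicode data shipped with the interpreter.

```python
import unicodedata

F_HOOK = "\u0192"      # LATIN SMALL LETTER F WITH HOOK
SEQUENCE = "f\u0321"   # U+0066 followed by U+0321 COMBINING PALATALIZED HOOK BELOW


def canonically_equivalent(a, b):
    """Hypothetical helper: two strings are canonically equivalent
    when their NFD normalizations are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# An empty string from decomposition() means no decomposition mapping is defined.
has_decomposition = unicodedata.decomposition(F_HOOK) != ""

print("U+0192 has a decomposition:", has_decomposition)
print("U+0192 equivalent to U+0066 U+0321:",
      canonically_equivalent(F_HOOK, SEQUENCE))

# For contrast, a precomposed letter that does decompose canonically.
print("U+00E9 equivalent to U+0065 U+0301:",
      canonically_equivalent("\u00e9", "e\u0301"))
```

With no decomposition defined, normalization leaves the two spellings distinct, so a search or comparison for one will not match the other.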
>At any rate, to me the big argument in favor of adding a
>"florin sign" character to the standard would be if the glyph
>shapes for these two characters aren't always identical. Is
>that true, and if so, how do they differ?
There is another possible argument: differing semantics. U+0192
has general category Ll (lowercase letter) with a(n
informative) case mapping to U+0191, and a bidi category of L
(strong LTR). In contrast, currency symbols generally have
general category Sc, no case mapping, and a bidi category of Et
(European number terminator, a "weak" category). If they remain
unified, then implementers would face problems with knowing
what to do about case mapping or bidi behaviour. (I'm guessing
that the current situation is that implementers generally
assume this is primarily used as florin and give it behaviour
accordingly, in spite of the defined properties.)
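
The property mismatch described above can be seen side by side. This is a rough sketch with Python's unicodedata module, using U+0024 DOLLAR SIGN as a stand-in for "a typical currency symbol"; the properties function is a name invented here, and the values reported come from whatever Unicode data the interpreter ships with.

```python
import unicodedata


def properties(ch):
    """Collect the properties discussed above for a single character."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),   # general category, e.g. Ll or Sc
        "bidi": unicodedata.bidirectional(ch),  # bidi category, e.g. L or ET
        "upper": ch.upper(),                    # result of uppercasing
    }


f_hook = properties("\u0192")   # the character doing double duty as florin
dollar = properties("\u0024")   # an ordinary currency symbol for comparison

for label, props in (("U+0192", f_hook), ("U+0024", dollar)):
    print(label, props["name"], props["category"], props["bidi"],
          "U+%04X" % ord(props["upper"]))
```

A process that uppercases text, or that lays out a right-to-left paragraph containing an amount, will treat the two characters differently on the basis of these values.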
As has been noted, though, U+0192 has already been widely used
as the character for florin, in spite of its semantics.
Assigning a separate character for the IAI bilabial f would have
the least impact on existing users and on data containing U+0192
cum florin, but that would leave U+0192 (now used only as
florin) with the wrong semantics for what it's intended to be,
and a less than ideal name. This presents us with a quandary.
Options:
1) leave it as it is
pros: no cost to existing implementations and data
cons: inadequately addresses (future?) needs of speakers whose
language is written with IAI bilabial f, and presents a
problem to implementers that want to support both uses
2) disunify; U+0192 used for IAI bilabial f, and new character
assigned for florin
pros: achieves disunification, which better serves users and
implementers; permits each character having those semantics
that make most sense for the intended use of the character
cons: breaks existing implementations and data
3) disunify; U+0192 used for florin, and leave semantics of
U+0192 as is
pros: minimal cost to existing implementations (and data?);
achieves disunification, which better serves users and
implementers
cons: semantics of U+0192 would be misleading at best and more
likely ignored in many implementations (implementations that do
what most users would really want would not be Conformant)
4) disunify; U+0192 used for florin, but change semantics of
U+0192 to name = FLORIN SIGN, cat = Sc, bidi = Et
pros: achieves disunification, which better serves
implementers; permits each character having those semantics
that make most sense for the intended use of the character;
minimal cost to existing implementations and data (assuming
that U+0192 has usually been treated as florin)
cons: violates a fundamental rule of the standard that
normative semantics do not change in order to protect existing
implementations
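
To make the trade-off between options 3 and 4 concrete, here is a sketch of what an implementation that treats U+0192 as florin might do today: layer a private override on top of the published properties. Everything named here (FLORIN_OVERRIDES, effective_property) is hypothetical and not part of any standard or library; as noted under option 3, such an override departs from the defined properties.

```python
import unicodedata

# Hypothetical private override table: treat U+0192 as a currency symbol.
# The values mirror the properties proposed under option 4.
FLORIN_OVERRIDES = {
    "\u0192": {"category": "Sc", "bidi": "ET"},
}


def effective_property(ch, prop):
    """Return the overridden value if one exists, else the published one."""
    override = FLORIN_OVERRIDES.get(ch, {})
    if prop in override:
        return override[prop]
    if prop == "category":
        return unicodedata.category(ch)
    if prop == "bidi":
        return unicodedata.bidirectional(ch)
    raise ValueError("unsupported property: %r" % prop)


print(effective_property("\u0192", "category"), unicodedata.category("\u0192"))
print(effective_property("\u0192", "bidi"), unicodedata.bidirectional("\u0192"))
```

Under option 4 the override table would become unnecessary, since the published properties would already say what the table says; under option 3 every implementation wanting florin behaviour would need something like it.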
Well, the shoe has to fall somewhere, and I can't imagine
option 1 remaining the case without increasing objection
(though the longer option 1 is operational, the harder it will
be to abandon it). It seems to me that option 4 serves the
interest of most, even if it requires bending a few rules.
(It's not as though those rules don't ever get bent.)
Peter
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:58 EDT