Re: The Cent & Florin Signs VS. C-Slash & Left-Tailed F

From: Peter Constable (peter_constable@sil.org)
Date: Wed Jan 19 2000 - 15:46:03 EST


>F-hook should also be representable with U+0066 LATIN SMALL
       LETTER F followed by (I'm guessing here) U+0321 COMBINING
       PALATALIZED HOOK BELOW. [If I'm wrong, then that'd explain why
       f-hook is separately encoded and doesn't have a canonical
       decomposition.] I wonder if you'd get a different glyph from
       the florin sign with some fonts by using this combination
       instead of U+0192.

       U+0192 doesn't have a decomposition, as you've noted, and so
       U+0066 U+0321 *cannot* be used to represent it.
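
       The point can be checked against a Unicode character database.
       The following is a minimal sketch using Python's standard
       unicodedata module; the helper name canonically_equivalent is
       hypothetical, and the result reflects whatever Unicode version
       the Python build ships with.

```python
import unicodedata

F_HOOK = "\u0192"        # LATIN SMALL LETTER F WITH HOOK
F_PLUS_HOOK = "f\u0321"  # U+0066 + COMBINING PALATALIZED HOOK BELOW


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# An empty string means the character has no decomposition mapping.
decomposition = unicodedata.decomposition(F_HOOK)

# Without a canonical decomposition, the sequence is a different string
# under every normalization form.
equivalent = canonically_equivalent(F_HOOK, F_PLUS_HOOK)

print(repr(decomposition), equivalent)
```

       For comparison, a character that does have a canonical
       decomposition, such as U+00E9, is reported as equivalent to
       its base letter plus combining mark by the same helper.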

>At any rate, to me the big argument in favor of adding a
       "florin sign" character to the standard would be if the glyph
       shapes for these two characters aren't always identical. Is
       that true, and if so, how do they differ?

       There is another possible argument: differing semantics. U+0192
       has general category Ll (lowercase letter) with a(n
       informative) case mapping to U+0191, and a bidi category of L
       (strong LTR). In contrast, currency symbols generally have
       general category Sc, no case mapping, and a bidi category of Et
       (European number terminator, a "weak" category). If they remain
       unified, then implementers would face problems with knowing
       what to do about case mapping or bidi behaviour. (I'm guessing
       that the current situation is that implementers generally
       assume this is primarily used as florin and give it behaviour
       accordingly, in spite of the defined properties.)
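
       The property differences described above can be inspected
       directly. This is a small sketch using Python's standard
       unicodedata module; the properties helper is hypothetical, and
       DOLLAR SIGN is used only as a representative currency symbol.

```python
import unicodedata


def properties(ch: str) -> dict:
    """Collect the properties discussed above for one character."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),   # general category
        "bidi": unicodedata.bidirectional(ch),  # bidi category
        "upper": ch.upper(),                    # uppercase mapping, if any
    }


f_hook = properties("\u0192")  # the unified f-hook / florin character
dollar = properties("$")       # a typical currency symbol

for label, props in (("U+0192", f_hook), ("U+0024", dollar)):
    print(label, props["name"], props["category"], props["bidi"],
          "U+%04X" % ord(props["upper"]))
```

       An implementation that uppercases text or runs the bidi
       algorithm from these properties will treat U+0192 as a letter,
       which is the mismatch at issue when the character is being
       used as a currency sign.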

       As has been noted, though, U+0192 has already been widely used
       as the character for florin, and that in spite of its
       semantics. Assigning a separate character for the IAI bilabial
       f would have the least impact on existing users and on data
       containing U+0192 cum florin, but it would leave U+0192 (now
       used only as florin) with the wrong semantics for its intended
       use, and a less than ideal name. This presents us with a
       quandary.

       Options:

       1) leave it as it is

       pros: no cost to existing implementations and data
       cons: inadequately addresses (future?) needs of speakers whose
       language is written with IAI bilabial f, and presents a
       problem to implementers who want to support both uses

       2) disunify; U+0192 used for IAI bilabial f, and new character
       assigned for florin

       pros: achieves disunification, which better serves users and
       implementers; permits each character having those semantics
       that make most sense for the intended use of the character
       cons: breaks existing implementations and data

       3) disunify; U+0192 used for florin, and leave semantics of
       U+0192 as is

       pros: minimal cost to existing implementations (and data?);
       achieves disunification, which better serves users and
       implementers
       cons: semantics of U+0192 would be misleading at best and more
       likely ignored in many implementations (implementations that do
       what most users would really want would not be conformant)

       4) disunify; U+0192 used for florin, but change semantics of
       U+0192 to name = FLORIN SIGN, cat = Sc, bidi = Et

       pros: achieves disunification, which better serves
       implementers; permits each character having those semantics
       that make most sense for the intended use of the character;
       minimal cost to existing implementations and data (assuming
       that U+0192 has usually been treated as florin)
       cons: violates a fundamental rule of the standard that
       normative semantics do not change in order to protect existing
       implementations

       Well, the shoe has to fall somewhere, and I can't imagine
       option 1 remaining the case without increasing objection
       (though the longer option 1 is operational, the harder it will
       be to abandon it). It seems to me that option 4 serves the
       interest of most, even if it requires bending a few rules.
       (It's not as though those rules don't ever get bent.)

       Peter


