Re: S with comma/cedilla

From: Sairus P. Patel (sppatel@adobe.com)
Date: Fri Sep 25 1998 - 21:01:48 EDT


[ I've been away on sabbatical, hence my delayed participation in this
  thread. ]

1. Kenneth Whistler asks:

> U+1E9E is an unassigned (and reserved) code point--nobody should be
> using it for anything. Perhaps Adobe should clarify what it is doing on
> this.

   Adobe Glyph List (AGL) 1.1 uses U+1E9E since that was the proposed
   Unicode value (UV) for LATIN CAPITAL LETTER S WITH COMMAACCENT at the
   time that glyph list was created. The proposed UV for this character
   has since changed to U+0218. We have been aware of this for a while, but
   we are in the process of updating our glyph naming conventions, which,
   along with this and other UV changes, will be published on adobe.com in
   the next few weeks. (Hopefully the proposed assignment won't change
   again, though that, of course, is the risk of using UVs that are still
   in the pipeline.)

2. Relevant AGL updates:

   Due to the proposed addition of the S/s and T/t comma accent characters
   to Unicode, AGL will be updated as follows (note that a glyph name can
   be mapped to more than one UV):

     -------- ---------------------- ---------------- ------------------
     UV       Char name              AGL 1.1          AGL 1.2
              (shortened)            glyph name       glyph name
     -------- ---------------------- ---------------- ------------------
     015E/F   S/s with cedilla       S/scommaaccent   S/scedilla
     0162/3   T/t with cedilla       T/tcommaaccent   T/tcommaaccent (a)
     0218/9   S/s with comma below   -                S/scommaaccent
     021A/B   T/t with comma below   -                T/tcommaaccent (a)
     1E9E/F   S/s with comma below   S/scedilla       - (b)
     F6C1/2   S/s with cedilla       S/scedilla       S/scedilla (c)
     -------- ---------------------- ---------------- ------------------

     (a) In AGL 1.2, glyph names T/tcommaaccent are each mapped to two
         UVs. (When the Adobe Central European character set was
         specified, no need was seen for T/tcedilla glyphs.)

     (b) In AGL 1.2, nothing will be mapped to U+1E9E/F.

     (c) Glyph names S/scedilla will continue to be mapped to these
         Corporate Use Subarea (CUS) UVs, even though the proposed
         characters have eliminated the need to have Corporate Use
         Subarea assignments for them.

   I hope the above new assignments more closely align with the Unicode
   Standard's intentions.
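
   To illustrate how one glyph name can be mapped to more than one UV, here
   is a minimal Python sketch of the proposed AGL 1.2 rows from the table
   above. The dictionary layout and the helper name glyph_name_for are
   illustrative assumptions only; they are not an Adobe file format or API.

```python
# Proposed AGL 1.2 mappings, transcribed from the table above.
# The dict layout and helper name are hypothetical, chosen for illustration.
AGL_1_2 = {
    "Scedilla":     [0x015E, 0xF6C1],
    "scedilla":     [0x015F, 0xF6C2],
    "Tcommaaccent": [0x0162, 0x021A],
    "tcommaaccent": [0x0163, 0x021B],
    "Scommaaccent": [0x0218],
    "scommaaccent": [0x0219],
}


def glyph_name_for(uv):
    """Return the glyph name mapped to a Unicode value, or None if unmapped."""
    for name, uvs in AGL_1_2.items():
        if uv in uvs:
            return name
    return None


print(glyph_name_for(0x021A))  # Tcommaaccent
print(glyph_name_for(0x1E9E))  # None: nothing is mapped to U+1E9E in AGL 1.2
```

   Note that both U+0162 and U+021A resolve to the same glyph name, which is
   the situation described in note (a).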

3. Problems with decomposition

   AGL will continue to map "commaaccent" glyph names to U+0122, LATIN
   CAPITAL LETTER G WITH CEDILLA, and to the corresponding K, L, N, and R
   characters, including their lowercase forms.

   The main problem I see is that Unicode's official decompositions for
   these characters specify a cedilla, not a comma accent. For example,
   the decomposition for
     U+0122 LATIN CAPITAL LETTER G WITH CEDILLA
   is
     U+0047 LATIN CAPITAL LETTER G
   followed by
     U+0327 COMBINING CEDILLA

   This could yield unexpected results in glyph composition and
   decomposition in Unicode-aware software.
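
   The decomposition quoted above can be checked against the Unicode
   Character Database. The following is a small sketch using the unicodedata
   module from a present-day Python standard library; the helper name
   decomposition_names is my own and purely illustrative.

```python
import unicodedata


def decomposition_names(ch):
    """Return the character names in ch's decomposition mapping."""
    fields = unicodedata.decomposition(ch).split()
    # Skip compatibility tags such as "<compat>"; keep only hex code points.
    code_points = [int(f, 16) for f in fields if not f.startswith("<")]
    return [unicodedata.name(chr(cp)) for cp in code_points]


print(decomposition_names("\u0122"))
# ['LATIN CAPITAL LETTER G', 'COMBINING CEDILLA']
```

   A font that draws a comma accent for U+0122 would thus show a different
   mark from the one software obtains by composing G with U+0327.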

Sairus Patel
Type Core Technology


