Re: Combining Triple Diacritics (N3915) not accepted by UTC #125

From: Benjamin M Scarborough (benjamin.scarborough@student.utdallas.edu)
Date: Sat Nov 13 2010 - 03:19:55 CST

Maybe in reply to: Karl Pentzlin: "Combining Triple Diacritics (N3915) not accepted by UTC #125"

I believe that the key to getting these characters encoded is
establishing that each one carries semantic content that is lost if
the character is stripped away. That was the justification for the
Mathematical Alphanumeric Symbols block.
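
To illustrate what "stripped away" can mean in practice, here is a
small Python sketch using the standard unicodedata module. The choice
of U+1D400 as the sample character is mine, and compatibility
normalization (NFKC) is only one way such a distinction can get
folded away.

```python
import unicodedata

# Sample character chosen for illustration: a bold "A" from the
# Mathematical Alphanumeric Symbols block.
bold_a = "\U0001D400"
print(unicodedata.name(bold_a))

# NFKC applies compatibility mappings, which fold the styled letter
# to the plain Latin capital and discard the bold distinction.
folded = unicodedata.normalize("NFKC", bold_a)
print(folded == "A")
```

After folding, nothing in the text records that the letter was bold,
which is the kind of loss the argument above is about.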

Unfortunately, figures 1 and 2 from JTC1/SC2/WG2 N3915 actually provide
a reason -against- encoding. The meaning of the diacritic in these two
examples is that the transliterated letters were ligated in the
original text. In this usage, the mark can span an arbitrary number of
letters; indeed, figure 2 shows the mark in question spanning four
letters. That makes it a much better fit for higher-level markup than
for a set of combining marks.

Figures 3 and 4 present a better case and show a stronger need for some
combining triple diacritic. I notice that all seven examples across
the two figures represent what would normally be two letters with a
double diacritic, but some modifier symbol intervenes and stretches the
tie to span three. However, proposing the triple diacritics for this
use would require proof that the sequence of letters with the diacritic
differs in some important way from the same sequence of letters
without it, which N3915 fails to establish.

In any event, I happen to know that some phonetic transcription system
uses an "sch" with breve below. It represents [ʒ], in contrast to the
unmarked sch, which represents [ʃ]. This is a clear semantic
distinction, and so the sch with breve below should be encoded in some
fashion, either as a sequence of characters or as some fully composed
one.
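
As a rough sketch of why the distinction matters, the Python below
marks "sch" and then strips the mark. No triple diacritic exists, so
I am assuming the existing U+035C COMBINING DOUBLE BREVE BELOW as a
stand-in; it only ties the s and c, and strip_marks is a helper name
I made up for the example.

```python
import unicodedata

# Stand-in only: an existing double diacritic, which spans two
# letters rather than the three that "sch" would need.
DOUBLE_BREVE_BELOW = "\u035C"

def strip_marks(text):
    """Drop every character with a nonzero combining class."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if not unicodedata.combining(ch))

plain = "sch"                               # would stand for [ʃ]
marked = "s" + DOUBLE_BREVE_BELOW + "ch"    # would stand for [ʒ]

print(marked == plain)
print(strip_marks(marked) == plain)
```

The two strings differ as encoded, but once the mark is removed they
are identical, so the [ʒ]/[ʃ] contrast cannot be recovered from the
letters alone.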

--Ben Scarborough

