Re: Codepoint Differentiation

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Tue Feb 22 2005 - 02:07:26 CST

Next message: Gregg Reynolds: "Re: [idn] IDN spoofing"

Previous message: Erik van der Poel: "nameprep, IDN spoofing and the registries"
In reply to: UList@dfa-mail.com: "Re: Codepoint Differentiation"
Next in thread: UList@dfa-mail.com: "Re: Codepoint Differentiation"
Reply: UList@dfa-mail.com: "Re: Codepoint Differentiation"

At 04:44 PM 2/21/2005, UList@dfa-mail.com wrote:
>Hello,
>
>
>1. Then it sounds like:
>
> - Serbian Cyrillic Small "t"

This should be handled by language dependent glyph selection.
That's a standard feature in OpenType and there's no need to
duplicate that facility in the encoding.

(Unless I misunderstand this example).

> - Coptic letterforms for Greek letter codepoints

Already being encoded in Unicode 4.1 in a new "Coptic" block. The unification of
these has been considered a mistake - it took a while to rectify as we
needed to research what precisely the Coptic repertoire should be.

>- complete Archaic Greek and Asia Minor scripts aligned to Greek letter
>codepoints

Rather than messing with variation selectors, this is best handled by using
fonts that are specific to archaic use.

Where it's a question of a different script - be patient, it's probably
slated to be encoded.

It's a common problem that archaic scripts use different shapes at
different times for the same characters. Sometimes, the answer may be that
it's really two different scripts, in which case the precursor can be coded
separately. Sometimes, it's reasonable to ask users to use a different font
for a given period. Sometimes, a specific higher level protocol should be
developed to handle specific problems of scholarly representation of text.

As a last resort, variation selectors might be used in some instances - but
not as a blanket approach.

>are exactly what the Variation Selectors were designed for. There are no
>issues other than a smart font substituting an alternate glyph. They can
>default in a "low-fidelity" rendering to the primary codepoint glyphs. I would
>welcome individual codepoints for them, but Unicode has already decided
>otherwise. There is a clear need to be able to access the glyphs somehow, on a
>device which has only one general Unicode font installed.

As stated above (and as others have pointed out) your premise is incorrect
for many of your examples. Not everything that requires glyph substitution
should be encoded via variation selectors.
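
To make the mechanics concrete, here is a minimal Python sketch of what a variation sequence looks like at the code point level and how a process that does not support it can fall back to the base character. The helper names are hypothetical, and the sketch only covers the two variation selector ranges (U+FE00..U+FE0F and U+E0100..U+E01EF). The example pair, U+2268 followed by U+FE00, is one of the standardized variation sequences.

```python
import unicodedata

def is_variation_selector(ch: str) -> bool:
    """True for VS1..VS16 (U+FE00..U+FE0F) and VS17..VS256 (U+E0100..U+E01EF)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF

def strip_variation_selectors(text: str) -> str:
    """Low-fidelity fallback: drop the selectors and keep the base characters."""
    return "".join(ch for ch in text if not is_variation_selector(ch))

base = "\u2268"             # LESS-THAN BUT NOT EQUAL TO
variant = base + "\ufe00"   # same character, variant glyph requested via VS1

print(unicodedata.name(variant[1]))               # VARIATION SELECTOR-1
print(len(base), len(variant))                    # 1 2
print(strip_variation_selectors(variant) == base) # True
```

The selector changes nothing about the identity of the base character; it only asks a capable renderer for a different glyph.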

A./

>2. I don't know much about:
>
> - alternate CJK ideographs and syllabographs
>
>But I imagine some or all of them would fit the criteria as well.
>
>
>3. German Sharp S which I mention, is probably too complicated under the
>present Variation Selector definition, and will have to go down to my second
>category with combining marks.
>
>
>4. So, that leaves me a little mystified why a variation selector isn't
>already in use for the notorious Serbian "t". Seems a lot more practical than
>switching language identifiers every word in an HTML Russian-Serbian
>dictionary.
>
>
>5. As to functions with combining marks, much of my original post discusses
>the likely need for a new class of differentiating codepoints, other than
>Variation Selectors, to handle that. In some cases the CGJ (or ZWJ) might be
>usable, though I am already finding an essential problem with that for umlaut
>vs. diaeresis (which you are all just dying to hear about -- and which
>urgently needs to be solved).
>
>
>Thanks,
>Doug/ulist@dfa-mail.com
>
>
>
>
>Asmus Freytag wrote:
> >
> > > > Is there actually any problem with using Variation Selectors as-is to
> > > > differentiate ...
> >
> > Doug already answered about the fact that only standardized sequences
> > are valid and the only standardizer for sequences is the Unicode
> > Consortium.
> >
> > Beyond that, variation selectors have another limitation: their only
> > function is to identify variants - and that means variants with a
> > different GLYPH, not variants with different *behavior*.
> >
> > Variation selectors are designed to be ignorable for all processes that
> > don't deal in rendering, and they are also ignorable for low-fidelity
> > rendering, i.e. rendering that does not support them (yes, I know, that's a
> > bit circular).
> >
> > For distinctions in *sorting behavior*, a Combining Grapheme Joiner can
> > often be used - but it is not intended to result in differences in display.
> >
> > The use of all of these special encoding crutches needs to be kept to a
> > minimum. We all know cases where using a variation selector is preferable
> > over adding a new character, since the differentiation is minute, not
> > universally applicable or both. However, most text processes have to be
> > designed to actively ignore them - and you have to be able to know, in
> > advance, for which process they can (and must) be ignored.
> >
> > That means, you cannot arbitrarily use existing mechanisms to make
> > distinctions that matter to algorithms that were designed to ignore these
> > mechanisms. Therefore, for variation selectors, any non-glyphic
> > distinctions are completely out of the picture.
> >
> > A./
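
The point in the quoted message about using a Combining Grapheme Joiner for sorting distinctions can be illustrated with a toy collation routine. This is only a sketch under stated assumptions: the weight table and the `sort_key` function are hypothetical, not any real collation library, and the "ch" contraction (sorting after "h", as in Slovak) is chosen purely for illustration. The CGJ (U+034F) is skipped when weights are assigned, but because it sits between "c" and "h" it keeps them from matching as the contraction.

```python
from typing import List

CGJ = "\u034f"  # COMBINING GRAPHEME JOINER

# Hypothetical weights; "ch" is treated as a single unit that sorts after "h".
WEIGHTS = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "h": 8, "ch": 9, "i": 10}

def sort_key(word: str) -> List[int]:
    """Toy sort key: match two-letter contractions first, ignore CGJ itself."""
    key = []
    i = 0
    while i < len(word):
        if word[i] == CGJ:
            i += 1                      # CGJ carries no weight of its own
            continue
        pair = word[i:i + 2]
        if pair in WEIGHTS:             # adjacent "c" + "h" forms the contraction
            key.append(WEIGHTS[pair])
            i += 2
        else:
            key.append(WEIGHTS[word[i]])
            i += 1
    return key

words = ["cha", "c" + CGJ + "ha", "da", "ha"]
ordered = sorted(words, key=sort_key)
print([sort_key(w) for w in ordered])   # [[3, 8, 1], [4, 1], [8, 1], [9, 1]]
```

The word containing the CGJ sorts as "c" followed by "h", ahead of "da", while plain "cha" sorts after "ha". Both are meant to display identically, which matches the statement that the CGJ is not intended to result in differences in display.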

