Re: U+2018 is not RIGHT HIGH 6

From: Asmus Freytag
Date: Wed, 02 May 2012 11:52:38 -0700

On 5/2/2012 8:33 AM, Michael Probst wrote:
> On Sunday, 29.04.2012 at 23:43 -0700, Asmus Freytag wrote:
>> Even if some minutiae of glyph selection are left to a font, the problem
>> is often that there's no specification as to what certain languages
>> need, so that fonts cannot be expected to provide the correct
>> implementation.
>> When Unicode was first created, the fact that one and the same quotation
>> mark character could be both opening and closing was not widely realized
>> in the character encoding community.
> Did (does?) that matter? About an hour ago, inserting a Hebrew Aleph
> turned a closing (or right) round bracket ")" into "(".

Yes. Such "magic" can only happen if the context is rigidly defined,
here by the presence of a specific character code. When it comes to
quotation marks, knowing when to use opening or closing marks does
matter (such as when a system takes straight quotes " and turns them
into "smart quotes"). Having the same character act in a number of
different ways makes such algorithms dependent on language.
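
To make the language dependence concrete, here is a rough sketch in
Python. It is only an illustration under simple assumptions, not how any
particular editor does it: the QUOTES table and the smarten() function
are hypothetical names, the table covers just two languages, and the
opening/closing decision is a naive "preceded by whitespace or an
opening bracket" test. Note that U+201C is the opening mark for English
but the closing mark for German.

```python
# Hypothetical table: (opening, closing) double quotation marks per language.
# Only two languages are listed; a real table would be much larger.
QUOTES = {
    "en": ("\u201C", "\u201D"),  # U+201C opens, U+201D closes
    "de": ("\u201E", "\u201C"),  # U+201E opens, U+201C closes
}

def smarten(text: str, lang: str) -> str:
    """Replace straight double quotes using a naive, language-keyed rule."""
    opening, closing = QUOTES[lang]
    out = []
    for i, ch in enumerate(text):
        if ch == '"':
            # Treat a quote at the start, or after whitespace or an opening
            # bracket, as opening; everything else is treated as closing.
            prev = text[i - 1] if i > 0 else " "
            out.append(opening if prev.isspace() or prev in "([{" else closing)
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    for lang in ("en", "de"):
        print(lang, smarten('She said "yes" twice.', lang))
```

The same input yields different characters depending on the language
tag, which is exactly the information that is often missing or wrong.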

Language information is much less reliably associated with a string than
its character codes. That's why "smart quote" algorithms are best
applied during editing, where humans are around to catch misapplication
and can disambiguate tricky situations. The mirroring of punctuation
marks, however, can be left to a "blind" algorithm at rendering time.
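
The reason a "blind" algorithm suffices for mirroring is that it keys
off character properties alone. A small sketch using Python's standard
unicodedata module shows the idea; the function name
display_glyph_is_mirrored() is made up for this example, and the
resolved direction is simply passed in as a parameter, whereas a real
implementation would obtain it from the full bidirectional algorithm.

```python
import unicodedata

def display_glyph_is_mirrored(ch: str, resolved_rtl: bool) -> bool:
    """Return True if `ch` would be shown with a mirrored glyph.

    Simplifying assumption: the caller already knows whether the
    character resolved to a right-to-left direction. Only the
    Bidi_Mirrored property of the character is consulted; no language
    information is needed.
    """
    return resolved_rtl and unicodedata.mirrored(ch) == 1

if __name__ == "__main__":
    # Hebrew Alef has a strong right-to-left bidirectional class.
    print("U+05D0 bidi class:", unicodedata.bidirectional("\u05D0"))
    for ch in (")", "\u2018", "\u201C"):
        print(
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            display_glyph_is_mirrored(ch, resolved_rtl=True),
        )
```

The parenthesis carries the mirrored property, while U+2018 and U+201C
do not, so no comparable property-driven switch exists for them.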

>> This was rectified over time, and
>> now there is detailed information (even though it may not be exhaustive)
>> on common practices in chapter 6 of the standard.
> On my way to it.
>> The document that was passed around here, is difficult to follow because
>> it mixes issues of glyph design with character selection and font
>> selection.
> Sorry.

Only natural. The problem doesn't automatically break itself down into
these layers; it takes an understanding of the architecture, and there's
a learning process involved.
>> The discussion would have to be recast in terms of what
>> design features successful language-dependent glyphs would need to
>> exhibit for a combination of existing characters with certain languages.
> Something like this?
> Used on the left U+2018 (LEFT SINGLE Q…) and U+201C (LEFT DOUBLE Q…)
> should keep to the right and exhibit a visual character (appearance) of
> "leading in to" the text to their right in: Albanian, Arabic, Chinese,
> Danish, English, French (though NNBSP will follow on the right and
> guillemets be used most of the time instead), Hebrew, Irish, Italian,
> Catalonian, Korean, Dutch, Portuguese, Russian, Spanish, Thai and
> Turkish.
> In Arabic, Hebrew and other cases of RTL use, however, that should be
> "lead out of", though I do not know whether such a difference could be
> designed into them.
> Used on the right they should keep to the left and exhibit a visual
> character of "leading out of" the text to their left in: Bulgarian,
> German, Icelandic, Latvian, Lithuanian, Serbian, Slovakian, Slovenian,
> Sorbian, Czech, Ukrainian and Belarusian.
> (No claim to correctness or completeness.)

A set of pictures would make this clearer.

> Michael
Received on Wed May 02 2012 - 13:55:55 CDT
