Re: U+2018 is not RIGHT HIGH 6

From: Michael Probst
Date: Fri, 04 May 2012 18:34:09 +0200

On Thursday, 03.05.2012, at 01:03 -0700, Asmus Freytag wrote:
> Sometimes you are not free to choose what you would like.

Certainly not :-)

> One thing that's off the table is a new character code.
> The reason for that categorical statement is that there is too much
> data and software out that uses the existing character codes.

Like "Hey, we've been using ASCII for years now, we cannot just go and
create a whole new encoding." -- only for one or two characters?

> Throwing a new character into the mix will just create confusion.

The "mix" seems confusing already; throwing them in might subtract more
confusion than it adds.

> Text that should be identical would acquire two alternative
> representations depending on whether the new or the old character is
> used. That's not good.

Does that mean different renderings depending on which font is chosen? Or
different "bytes" for the same meaning, like using U+002D, U+2013 and
U+2212 when meaning 'minus' in one text?

I do not understand this. It does not sound good, but neither does it
sound like something that has been prevented so far, nor like something
that would be made noticeably worse by adding some missing characters.
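
To make the "different bytes for the same meaning" case concrete, here is
a minimal Python sketch using only the standard unicodedata module. The
helper names describe and same_after_nfkc are made up for this
illustration. It lists the three characters mentioned above and checks
whether NFKC normalization would treat two spellings of the same
subtraction as identical text.

```python
import unicodedata

# Three characters that all get used to mean 'minus' in practice.
CANDIDATES = ["\u002D", "\u2013", "\u2212"]


def describe(ch):
    """Return (code point label, Unicode name, UTF-8 bytes as hex) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8").hex())


def same_after_nfkc(a, b):
    """True if two strings become identical under NFKC normalization."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


for ch in CANDIDATES:
    print(*describe(ch))

# The same expression, written once with U+002D and once with U+2212.
print(same_after_nfkc("5 \u002D 3", "5 \u2212 3"))
```

The three characters have distinct names and distinct byte sequences, and
normalization does not merge them, so alternative representations of the
"same" text exist today.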

> Especially not for a situation that, while not ideal, has been
> tolerated by tens of millions of users for decades - which means it's
> not one of life's most urgent crises.

It can also be not good not to improve situations that are not ideal
just because they have already been tolerated.

> Sometimes, even when you are creating a "new" character encoding, […]

I suspected something like that, but it (only) explains the situation
and does not argue against improvement.

(The "only" is not to mean that I do not appreciate your effort of

> In any case, trying to approach this from the semantic position has
> issues. In Swedish, for example, you use the same quotation mark
> symbol for both opening and closing. It would be more than bizarre to
> use two different characters for that purpose.

If a grapheme conveys two different meanings ('quote on', 'quote off')
and one intends to encode meaning, ending up with two characters which
might look the same does not appear bizarre to me; especially not if one
considers the application of some COMBINING OPEN and CLOSE ...
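
As a small illustration of the point under discussion, the following
Python sketch (standard unicodedata module only) looks up the opening and
closing marks in three sample strings. The SAMPLES strings and the
quote_marks helper are assumptions made up for this example; they reflect
common conventions, not a complete description of any language's
typography.

```python
import unicodedata

# Sample quoted words following common conventions for each language.
SAMPLES = {
    "Swedish": "\u201Dcitat\u201D",
    "German": "\u201EZitat\u201C",
    "English": "\u201Cquote\u201D",
}


def quote_marks(text):
    """Return (code point label, Unicode name) for the first and last character."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in (text[0], text[-1])]


for language, sample in SAMPLES.items():
    opening, closing = quote_marks(sample)
    print(language, "opens with", *opening, "and closes with", *closing)
```

In this sketch Swedish uses one character in both roles, while U+201C
closes a German quotation but opens an English one, so the character
names alone do not say whether a mark means 'quote on' or 'quote off'.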

> So that defines the characters a bit more by appearance than
> semantics.

In contrast to the detailed disunification of the dashes – and mirroring
( into ) just to maintain the meaning …

> On the other hand, you are pointing out that some uses allow a wider
> range of glyphic variation of the existing characters than other
> usages.

Am I? Seems I don't know what I'm doing.

> This is something that should be documented, but in terms of helping
> font designers provide the correct glyphs for each context.

Would that be something answering this question?

> The time to create a special character code for German quotation marks
> is passed.

It's *not solely* about German.

> The moment for that would have been the late 70s.

It wasn't too late for the various dashes until much later. It wasn't
too late to abandon EBCDIC. It wasn't too late to extend ASCII into ANSI
and ISO and various other encodings, and not too late to then abandon
all but ISO-8859-1 for Unicode.

Received on Fri May 04 2012 - 11:38:10 CDT
