Re: Encoding of old compatibility characters

From: Frédéric Grosshans <frederic.grosshans_at_gmail.com>
Date: Mon, 27 Mar 2017 23:46:34 +0200

An example of a legacy character successfully encoded recently is ⏨
U+23E8 DECIMAL EXPONENT SYMBOL, encoded in Unicode 5.2.
It came from the Soviet standard GOST 10859-64 and the German standard
ALCOR, and was proposed by Leo Broukhis in
http://www.unicode.org/L2/L2008/08030r-subscript10.pdf . The proposal
followed a discussion on this mailing list,
http://www.unicode.org/mail-arch/unicode-ml/y2008-m01/0123.html , where
Ken Whistler was already sceptical about the usefulness of this encoding.
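
For anyone who wants to confirm the character on their own machine, here
is a minimal sketch using Python's standard unicodedata module. It assumes
a Python build whose Unicode database is at version 5.2 or later; the
variable names are only illustrative.

```python
import unicodedata

# U+23E8, the character discussed above
ch = "\u23e8"

# Look up the formal character name in the bundled Unicode database
name = unicodedata.name(ch)

# Reverse lookup: the name should resolve back to the same code point
same = unicodedata.lookup(name) == ch

print(f"U+{ord(ch):04X} {name} (round-trips by name: {same})")
print("Unicode database version:", unicodedata.unidata_version)
```

On an older database the call to unicodedata.name() would raise
ValueError instead, which is a quick way to tell that the character is
not known to that build.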

On 27/03/2017 at 16:44, Charlotte Buff wrote:
> I’ve recently developed an interest in old legacy text encodings and
> noticed that there are various characters in several sets that don’t
> have a Unicode equivalent. I had already started research into these
> encodings to eventually prepare a proposal until I realised I should
> probably ask on the mailing list first whether it is likely the UTC
> will be interested in those characters before I waste my time on a
> project that won’t achieve anything in the end.
>
> The character sets in question are ATASCII, PETSCII, the ZX80 set, the
> Atari ST set, and the TI calculator sets. So far I’ve only analyzed
> the ZX80 set in great detail, revealing 32 characters not in the UCS.
> Most characters are pseudo-graphics, simple pictographs or inverted
> variants of other characters.
>
> Now, one of Unicode’s declared goals is to enable round-trip
> compatibility with legacy encodings. We’ve accumulated a lot of weird
> stuff over the years in the pursuit of this goal. So it would be
> natural to assume that the unencoded characters from the mentioned
> sets would also be eligible for inclusion in the UCS. On the other
> hand, those encodings are for the most part older than Unicode and so
> far there seems to have been little interest in them from the UTC or
> WG2, or any of their contributors. Something tells me that if these
> character sets were important enough to consider for inclusion, they
> would have been encoded a long time ago along with all the other stuff
> in Block Elements, Box Drawing, Miscellaneous Symbols, etc.
>
> Obviously the character sets in question don’t receive much use
> nowadays (and some weren’t even that relevant in their time, either),
> which leads me to wonder whether putting further work into this
> proposal would be worth it.
Received on Mon Mar 27 2017 - 16:46:56 CDT
