Re: Shift-JIS encoded text (was: RE: Tags and future new technologies [...])

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Fri, 1 Jun 2012 23:16:14 +0200

2012/6/1 Doug Ewell <doug_at_ewellic.org>:
> Peter Constable <petercon at microsoft dot com> wrote:
>
>> The only requirement of Unicode was to provide a way to map Shift-JIS
>> encoded text involving emoji to Unicode / 10646 in a way that could be
>> round-tripped,
>
> This is the part that has always confused me. At what point does text
> encoded in a vendor's private-use extension to Shift-JIS become
> "Shift-JIS encoded text"? Because I know for sure that I'm not supposed
> to refer to characters assigned to the Unicode PUA, my own or anyone
> else's, as being "encoded in Unicode."

Maybe because, without anyone admitting it publicly, those symbols
really have a much wider use than in these private Shift-JIS extensions.

In that case, round-trip compatibility is definitely not the main
reason for encoding them, and these symbols should be considered more
globally: they are certainly needed in other countries and in other
private implementations, which currently lack the interoperability one
would expect when their symbols obviously mean the same thing and play
the same role in the texts that include them.

The private extension is just one sign that they were needed. The
pressure to include them in standard Shift-JIS is another, and so is
the need to map them into the UCS as well, via their standardization
in Shift-JIS (whether or not it succeeds in that standard).

Of course, encoding flags visually in an international standard is
much more difficult, if one wants to encode some flags and not some
others, also because of political issues. That's why I propose another
way to represent them. This won't affect the private-use Shift-JIS
encoding, which can now have round-trip compatibility for its existing
symbols, even if standard Shift-JIS will now prefer the more generic
symbols over integrating the private-use extension.

Received on Fri Jun 01 2012 - 16:18:47 CDT
