Re: Is there a UTF that allows ISO 8859-1 (latin-1)?

From: Dan Oscarsson (Dan.Oscarsson@trab.se)
Date: Mon Aug 24 1998 - 06:10:26 EDT


From: Jungshik Shin

>> To be able to allow other code values from UCS than the first 256, I need a
>> way to add those without making all the software I have today obsolete, and
>> the new software must be able to read all existing texts.
>> UTF-8 will not work unless it can read and write files compatible with what
>> I have today.
>> You who use non-Latin characters will also need something to mix old and new,
>> but your character sets are not true subsets of UCS and cannot be handled as
>> easily as ISO 8859-1.
>
> What do you mean by 'true' subset? KS C 5601/KS X 1001, JIS X 0208,
>JIS X 0212, GB 2312, Plane 1 of CNS 11036? (used in various CJK
>encodings such as EUC-JP, Shift_JIS, EUC-KR, EUC-CN, EUC-TW) are all
>subsets of UCS-2.

See my response to Chris Wendt's mail.
By true subset I mean that both characters and code values must match.
Then only ascii and iso 8859-1 (and Unicode) are true subsets of UCS.
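
The "true subset" test can be sketched in a few lines of Python. This is
only an illustration, not part of the original exchange: the helper name
is_direct_subset is hypothetical, and the codec names are the ones shipped
with Python's standard library. It checks whether each single byte value
decodes to the UCS character with the same code value, and then shows that
a Hangul syllable in EUC-KR is in UCS but stored under a different value.

```python
def is_direct_subset(codec: str, limit: int) -> bool:
    """Return True if every byte value below `limit` decodes, on its own,
    to the character whose UCS code value equals that byte value."""
    for value in range(limit):
        try:
            ch = bytes([value]).decode(codec)
        except UnicodeDecodeError:
            # The byte is not a complete character in this encoding.
            return False
        if ord(ch) != value:
            return False
    return True


latin1_ok = is_direct_subset("latin-1", 256)  # ISO 8859-1: all 256 values
ascii_ok = is_direct_subset("ascii", 128)     # ASCII: the first 128 values
utf8_ok = is_direct_subset("utf-8", 256)      # fails once values reach 0x80

# A character from KS X 1001: present in UCS, but the stored code value
# in EUC-KR is not the UCS code value.
hangul = "\uac00"
euc_kr_value = int.from_bytes(hangul.encode("euc_kr"), "big")
same_value = euc_kr_value == ord(hangul)
```

Under these assumptions latin1_ok and ascii_ok come out true, while utf8_ok
and same_value come out false, which is the distinction I mean.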

>
> If the only important character set to you is UCS, the best thing to
>do is get, write, and encourage/ask others to write programs that work
>with UCS and its encodings, IMHO.
>

I am doing that, but often the result is a program that can read/write
ascii, UTF-8, UCS-2 or UCS-4 but not iso 8859-1, which is, like ascii, UCS-2
and UCS-4, a directly coded subset of UCS.

As long as we cannot get programmers to understand that their new
programs must still be able to read and write the text currently in use,
it will be very difficult to use the programs.
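
One possible way for a new program to keep reading existing text is to try
UTF-8 first and fall back to iso 8859-1 when the bytes are not valid UTF-8.
The sketch below is a heuristic under that assumption, not a standard
mechanism; decode_text is a hypothetical helper name. Some iso 8859-1 byte
sequences also happen to be valid UTF-8, so the guess can be wrong.

```python
def decode_text(data: bytes) -> str:
    """Decode bytes as UTF-8 if they are valid UTF-8, otherwise treat
    them as ISO 8859-1, where every byte maps to the same UCS value."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 never fails: each byte becomes U+0000..U+00FF.
        return data.decode("latin-1")


old_file = "caf\u00e9".encode("latin-1")  # existing text: b'caf\xe9'
new_file = "caf\u00e9".encode("utf-8")    # new text: b'caf\xc3\xa9'

old_text = decode_text(old_file)
new_text = decode_text(new_file)
```

Both calls should give back the same four-character string, so old and new
files can be mixed as long as the old ones do not look like UTF-8 by accident.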

   Dan


