Re: UTF-8, C1 controls, and UNIX

From: Keld Jørn Simonsen (keld@dkuug.dk)
Date: Wed Feb 28 2001 - 17:51:30 EST

Next message: jarkko.hietaniemi@nokia.com: "RE: Latin digraph characters"
Previous message: Frank da Cruz: "Re: UTF-8, C1 controls, and UNIX"
Maybe in reply to: Frank da Cruz: "UTF-8, C1 controls, and UNIX"
Next in thread: Frank da Cruz: "Re: UTF-8, C1 controls, and UNIX"
Reply: Frank da Cruz: "Re: UTF-8, C1 controls, and UNIX"

On Wed, Feb 28, 2001 at 01:11:20PM -0800, Frank da Cruz wrote:
> The idea behind UTF-8 is to be able to use it in non-Unicode-aware UNIX
> versions: It lets you have Unicode filenames, Unicode directory names,
> Unicode file contents, Unicode email, etc. But what it does not do is let
> you *type* Unicode into regular UNIX applications or shells, if the UTF-8
> happens to contain C1 control characters as do, for example, many of the
> Cyrillic letters (e.g. capital A through PE). Most UNIX terminal drivers
> treat incoming C1 controls like their C0 counterparts, so 0x83 == 0x03 ==
> Ctrl-C, which interrupts whatever process you are talking to. Similarly
> 0x84 == Ctrl-D, which is EOF; 0x88 is backspace, and so on.
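
[Illustration of the quoted point, not part of the original message.] A
minimal Python sketch that lists the UTF-8 bytes falling in the C1 range
0x80..0x9F and the C0 value each would become if the receiving side dropped
the high bit. The helper name c1_bytes and the 0x7F mask are assumptions made
for illustration; real terminal drivers differ in how they handle these bytes.

```python
# C1 control bytes occupy 0x80..0x9F.
C1_RANGE = range(0x80, 0xA0)

def c1_bytes(text):
    """Return (char, byte, byte & 0x7F) for every UTF-8 byte in the C1 range.

    The third element models, as an assumption, a driver that maps a C1
    byte onto its C0 counterpart by clearing the high bit.
    """
    hits = []
    for ch in text:
        for b in ch.encode("utf-8"):
            if b in C1_RANGE:
                hits.append((ch, b, b & 0x7F))
    return hits

if __name__ == "__main__":
    # U+0403, U+0404, U+0408 encode as D0 83, D0 84, D0 88.
    for ch, b, low in c1_bytes("\u0403\u0404\u0408"):
        print(f"U+{ord(ch):04X}: byte 0x{b:02X} -> 0x{low:02X} with the high bit cleared")
```

For these three Cyrillic letters the trailing bytes are 0x83, 0x84 and 0x88,
which line up with the Ctrl-C, Ctrl-D and backspace cases mentioned above.
Capital A through PE (U+0410..U+041F) yield trailing bytes 0x90..0x9F.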

Maybe one should define a transmission-safe UTF that leaves C1 alone?

keld

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:19 EDT