RE: ASCII as a subset of Unicode

From: Jonathan Rosenne (jr@qsm.co.il)
Date: Sun Apr 12 2009 - 12:47:32 CDT


    Transmission errors apply to transmission, not to storage. ASCII was originally designed for 7-bit transmission. Some implementations added a parity bit (odd or even) and others did not. You can still configure a comm port (if you have one) on a PC for 7-bit, no-parity transmission, and if you use ASCII codes (i.e. just the lower 128 codes) it works fine.
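
    As a concrete illustration (a minimal sketch of my own, not taken from the standard or from this post), an even-parity eighth bit works like this: the sender counts the 1-bits in the low seven bits and sets bit 7 so the total is even; the receiver repeats the count and rejects any byte whose count comes out odd. The function names below are invented for the example.

        #include <stdio.h>

        /* Set bit 7 so that the whole byte has an even number of 1-bits. */
        unsigned char add_even_parity(unsigned char ascii7)
        {
            unsigned char ones = 0;
            for (int i = 0; i < 7; i++)
                ones += (ascii7 >> i) & 1;
            return (unsigned char)((ascii7 & 0x7F) | ((ones & 1) << 7));
        }

        /* A received byte passes the check if its 1-bit count is even. */
        int parity_ok(unsigned char received)
        {
            unsigned char ones = 0;
            for (int i = 0; i < 8; i++)
                ones += (received >> i) & 1;
            return (ones & 1) == 0;
        }

        int main(void)
        {
            unsigned char tx = add_even_parity('A');   /* 'A' = 0x41, two 1-bits, parity bit stays 0 */
            printf("sent 0x%02X, parity %s\n", tx, parity_ok(tx) ? "ok" : "bad");
            tx ^= 0x04;                                /* simulate one bit flipped in transit */
            printf("received 0x%02X, parity %s\n", tx, parity_ok(tx) ? "ok" : "bad");
            return 0;
        }

    A single parity bit catches any odd number of flipped bits in a byte but misses an even number, which is the limit of what this scheme can detect.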


    Before ASCII, the norm was to use 6-bit codes for textual data, i.e. with no lower case.


    Best regards,

    Jony Rosenne


    From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Jim Allan
    Sent: Saturday, April 11, 2009 9:49 PM
    To: 'Unicode Mailing List'
    Subject: Re: ASCII as a subset of Unicode


    Jukka K. Korpela wrote:

    I'm not sure what you mean by "such" here, but in fact, even in the 1980s and early 1990s, the DECsystem-10 and DECSYSTEM-20 (both built on the 36-bit PDP-10 architecture) used a word length of 36 bits, packing five 7-bit ASCII characters into one word (and using the spare bit for special purposes).

    ASCII was surely designed to allow implementations where 7 bits are used for one character. Don't confuse this with the current situation where such implementations are obsolete and "everyone" uses at least 8 bits for a character, even when working with ASCII only.
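
    To make the 36-bit packing concrete (a rough sketch, not from either message; the bit positions used here, first character in the high-order bits with the low-order bit left spare, are one common convention), a modern program can model such a word as the low 36 bits of a 64-bit integer:

        #include <stdint.h>
        #include <stdio.h>

        /* Pack five 7-bit ASCII characters into the low 36 bits of a uint64_t,
           first character in the high-order bits, low-order bit left spare. */
        uint64_t pack5(const char *c)
        {
            uint64_t word = 0;
            for (int i = 0; i < 5; i++)
                word |= (uint64_t)(c[i] & 0x7F) << (29 - 7 * i);
            return word & 0xFFFFFFFFFULL;          /* keep only 36 bits */
        }

        void unpack5(uint64_t word, char *out)
        {
            for (int i = 0; i < 5; i++)
                out[i] = (char)((word >> (29 - 7 * i)) & 0x7F);
        }

        int main(void)
        {
            char buf[6] = {0};
            uint64_t w = pack5("HELLO");
            unpack5(w, buf);
            printf("word = %012llo (octal), text = %s\n", (unsigned long long)w, buf);
            return 0;
        }

    With 8-bit characters only four would fit in a 36-bit word, so the 7-bit design bought a fifth character per word plus one spare bit.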

    From http://www.neurophys.wisc.edu/comp/docs/ascii/ :

    "It was therefore decided to use 7 bits to store the new ASCII code, with the eighth bit being used as a parity bit to detect transmission errors."

    From http://czyborra.com/charsets/iso646.html :

    "ASCII uses only 7 bits and allows the most significant eighth bit to be used as parity bit, highlight bit, end-of-string bit (all of which are considered bad practice nowadays) or to include additional characters for internationalization <http://czyborra.com/charsets/iso8859.html> (i18n for which we need 8bit-clean programs that do none of afore-mentioned silly tricks) but ASCII defined no standard <http://czyborra.com/charsets/iso8859.html> for this and many manufacturers invented their own proprietary codepages <http://czyborra.com/charsets/codepages.html> ."

    For an original ASCII definition see <http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-6,%201st%20Edition,%20March%201963.pdf> :

    "This character set is the first of a family of sets. Higher-order sets will enlarge the repertoire for both 'graphics' and 'controls'."

    Though not defined in the original standard, it was understood from the beginning that ASCII would normally run in an 8-bit environment, that the 8th bit could be used to define additional characters, and that the standards committee expected to define higher-order sets.

    Jim Allan



    This archive was generated by hypermail 2.1.5 : Sun Apr 12 2009 - 12:50:01 CDT