Stateful encoding mechanisms

From: Dean Snyder (dean.snyder@jhu.edu)
Date: Wed May 18 2005 - 23:15:33 CDT

Next message: Dean Snyder: "Re: ASCII and Unicode lifespan"

Previous message: JFC (Jefsey) Morfin: "Re: ASCII and Unicode lifespan"
In reply to: Alexander Kh.: "Re: ASCII and Unicode lifespan"
Maybe reply: Peter Constable: "RE: Stateful encoding mechanisms"
Maybe reply: Dean Snyder: "Re: Stateful encoding mechanisms"
Maybe reply: Peter Constable: "RE: Stateful encoding mechanisms"
Maybe reply: Kenneth Whistler: "Re: Stateful encoding mechanisms"
Maybe reply: Kenneth Whistler: "Re: Stateful encoding mechanisms"
Maybe reply: Alexander Kh.: "Re: Stateful encoding mechanisms"
Maybe reply: Peter Constable: "RE: Stateful encoding mechanisms"
Maybe reply: Philippe VERDY: "Re: Stateful encoding mechanisms"
Maybe reply: Dean Snyder: "Re: Stateful encoding mechanisms"

Alexander Kh. wrote at 7:24 PM on Wednesday, May 18, 2005:

>That's Microsoft scale gigantism. I can think of many ways to restrict
>use of Unicode to only non-critical cases where the accuracy of data is
>of no importance. For example: by using a modified UTF-8 format where
>an ASCII letter can be used as a switch selector between any local
>encodings - that method will save A LOT of space for commonly
>used characters.
>
>I think that by building extensions to UTF-8, such as a state-machine
>system, and using small but well-thought encoding tables and fonts one
>can totally avoid using Unicode, which is sloppy, inaccurate, incomplete
>and for some strange reason uses character '\0' within a string. This is
>not to mention its endianness problem. ...

Stateful mechanisms for plain text encoding are bad, if for no other
reason than fragment fragility: a fragment cut from the middle of a text
cannot be reliably interpreted without state that was set earlier in the
stream. Unfortunately, Unicode does contain some state-machine characters,
which I think are mistakes, since they enable fragment ambiguity or
non-interpretability.

Here are some:

Stateful mechanisms that contribute to fragility at the character level -
Surrogates
BOM
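
A minimal Python sketch of the character-level cases above. The helper
name decode_fragment and the sample string are illustrative assumptions,
not anything from the thread. It cuts a UTF-16 byte stream in the middle
of a surrogate pair, and then decodes the same two bytes under both byte
orders, as happens when a fragment has lost the BOM from the start of
the stream.

```python
from typing import Optional


def decode_fragment(data: bytes, encoding: str = "utf-16-be") -> Optional[str]:
    """Return the decoded text, or None if the bytes cannot be decoded alone."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# "A", then U+1D11E MUSICAL SYMBOL G CLEF (a surrogate pair in UTF-16), then "B"
text = "A\U0001D11EB"
encoded = text.encode("utf-16-be")  # bytes: 00 41 | D8 34 DD 1E | 00 42

whole = decode_fragment(encoded)                       # the full stream decodes
tail_from_low_surrogate = decode_fragment(encoded[4:])  # starts on DD 1E
head_ending_on_high_surrogate = decode_fragment(encoded[:4])  # ends on D8 34
after_the_pair = decode_fragment(encoded[6:])          # cut on a safe boundary

# BOM: without it, the same two bytes are valid in either byte order
ambiguous = b"\x4e\x00"
as_little_endian = ambiguous.decode("utf-16-le")  # U+004E
as_big_endian = ambiguous.decode("utf-16-be")     # U+4E00
```

In this sketch the two fragments that split the pair are rejected by
Python's strict decoder, while the two-byte fragment decodes without
complaint under both byte orders, to two different characters.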

Stateful mechanisms that contribute to fragility above the character level -
Bidirectional Ordering Controls
Annotation characters
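
The same kind of sketch for the bidirectional controls, again in Python
and again under assumptions of mine: the function name bidi_imbalance is
hypothetical, and the check only counts the explicit embedding and
override controls (LRE, RLE, LRO, RLO) against PDF. It is not an
implementation of the Unicode Bidirectional Algorithm; it only shows
that a fragment can begin or end in a state set outside of it.

```python
from typing import Tuple

# Explicit directional embedding/override controls and their terminator
OPENERS = {"\u202a", "\u202b", "\u202d", "\u202e"}  # LRE, RLE, LRO, RLO
CLOSER = "\u202c"                                   # PDF


def bidi_imbalance(fragment: str) -> Tuple[int, int]:
    """Return (orphan_closers, unclosed_openers) for a text fragment.

    orphan_closers:   PDFs whose opening control lies before the fragment.
    unclosed_openers: openers whose PDF lies after the fragment.
    """
    depth = 0
    orphan_closers = 0
    for ch in fragment:
        if ch in OPENERS:
            depth += 1
        elif ch == CLOSER:
            if depth > 0:
                depth -= 1
            else:
                orphan_closers += 1
    return orphan_closers, depth


# "XYZ" is wrapped in a right-to-left override (RLO ... PDF)
full = "abc \u202eXYZ\u202c def"

balanced = bidi_imbalance(full)       # both controls present
head = bidi_imbalance(full[:6])       # "abc ", RLO, "X": override never closed
tail = bidi_imbalance(full[6:])       # "YZ", PDF, " def": override never opened
```

The tail fragment is the troublesome one: "YZ" was under an override in
the full text, but nothing inside the fragment says so.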

Are there other stateful mechanisms in Unicode?

Respectfully,

Dean A. Snyder

Assistant Research Scholar
Manager, Digital Hammurabi Project
Computer Science Department
Whiting School of Engineering
218C New Engineering Building
3400 North Charles Street
Johns Hopkins University
Baltimore, Maryland, USA 21218

office: 410 516-6850
cell: 717 817-4897
www.jhu.edu/digitalhammurabi/
http://users.adelphia.net/~deansnyder/

This archive was generated by hypermail 2.1.5 : Thu May 19 2005 - 10:12:52 CDT