Re: Internal Representation of Unicode

From: Rick McGowan (rick@unicode.org)
Date: Fri Sep 26 2003 - 12:05:10 EDT

Next message: Peter Kirk: "Re: Fun with proof by analogy, was Re: Mojibake on my Web pages"

Previous message: Elaine Keown: "re: History of Unicoding Hebrew"
Maybe in reply to: myrkraverk@users.sourceforge.net: "Internal Representation of Unicode"
Next in thread: Jill Ramonsky: "RE: Internal Representation of Unicode"

myrkraverk@users.sourceforge.net wrote:

> In a plain text environment, there is often a need to encode more than
> just the plain character.
...
> Since I'm using 64 bits, I call it Excessive Memory Usage Encoding, or
> EMUE.
...
> I thought of dividing the 64 bit code space into 32 variably wide
> plains, one for control characters, one for latin characters, one for
> han characters, and so on;

This all seems to me like something of a pointless exercise. Or maybe
you haven't made clear who your intended audience is and what problems
you're trying to solve.

Decent libraries exist that already do nice things with strings having
attributes. And that, in my opinion, is a better model than bit-hacking in
a 64-bit space with vague implementation-defined attributes that change
depending on the "script" of a character. Such "attributed strings" are
easy to work with and provide a much higher-level model than this.
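
To make the contrast concrete, here is a minimal sketch of the attributed
string idea in Python. It is not the Cocoa API or any existing library; the
class name, method names, and attribute keys ("weight", "language") are all
hypothetical, chosen only for illustration. The point is that the text stays
an ordinary string while the attributes live in separate runs over ranges of
it, so no bits of the character code are spent on them.

```python
from dataclasses import dataclass, field


@dataclass
class AttributedString:
    """Plain text plus attribute runs stored beside it (hypothetical class)."""
    text: str
    # Each run is (start, end, attributes), with end exclusive.
    runs: list = field(default_factory=list)

    def add_attribute(self, name, value, start, end):
        """Attach one attribute to the character range [start, end)."""
        if not 0 <= start <= end <= len(self.text):
            raise ValueError("range out of bounds")
        self.runs.append((start, end, {name: value}))

    def attributes_at(self, index):
        """Merge every run covering index; later runs override earlier ones."""
        if not 0 <= index < len(self.text):
            raise IndexError(index)
        merged = {}
        for start, end, attrs in self.runs:
            if start <= index < end:
                merged.update(attrs)
        return merged


s = AttributedString("Hello, world")
s.add_attribute("weight", "bold", 0, 5)    # only "Hello" is bold
s.add_attribute("language", "en", 0, 12)   # the whole string is English
print(s.attributes_at(0))
print(s.attributes_at(7))
```

Because the attributes are keyed by name rather than packed into bit
fields, the same attribute means the same thing regardless of the script of
the character it is applied to.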

You might want to check out Apple's Cocoa environment, particularly the
definitions of the attributed string classes. For example...
http://developer.apple.com/documentation/Cocoa/Reference/Foundation/Java/Classes/NSAttributedString.html
or even the intro:
http://developer.apple.com/documentation/Cocoa/Conceptual/AttributedStrings/index.html

I'm sure there are libraries with similar capabilities for storing
characters + attributes in Java and other languages; I'm just not familiar
with them. Maybe some of the developers can chime in with their favorite
attributed string libraries. Even if you don't use one, you might find the
attributed string model educational.

(All of the above of course reflects only my personal opinion.)

Rick

This archive was generated by hypermail 2.1.5 : Fri Sep 26 2003 - 12:46:55 EDT