Re: Unicode Public Review Issues update

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Mon Oct 06 2003 - 12:58:20 CST


At 10:29 AM 10/6/03 +0530, jon@spin.ie wrote:
> > The Unicode Technical Committee has posted some new issues for public
> > review and comment. Details are on the following web page:
> >
> > http://www.unicode.org/review/
>
>A question about the issues already open: What is the justification for
>proposing to make Braille Lo?

Among other things it would make it part of identifiers. However, there's
been some suggestion that this is a bad idea. Whether or not a braille
symbol actually stands for a letter or a digit or a punctuation mark is
entirely dependent on a higher level protocol.
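
To make the identifier point concrete, here is a minimal sketch in Python. It assumes the interpreter's bundled `unicodedata` tables, which list the Braille patterns as So, and it uses Python's own identifier rules as a stand-in for "identifiers" in general. The helper name `usable_in_identifier` is made up for illustration.

```python
import unicodedata

def usable_in_identifier(ch):
    """Return True if ch may follow an ASCII letter in a Python identifier."""
    return ("x" + ch).isidentifier()

braille_dots_1 = "\u2801"   # BRAILLE PATTERN DOTS-1
han_example = "\u4e2d"      # a Han ideograph, gc=Lo

# With Braille classified as So, it is excluded from identifiers,
# while an Lo character such as a Han ideograph is accepted.
print(unicodedata.category(braille_dots_1), usable_in_identifier(braille_dots_1))
print(unicodedata.category(han_example), usable_in_identifier(han_example))
```

If the Braille patterns were Lo, they would presumably land on the same side as the Han example in any identifier syntax derived from gc.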

Also, by making them Lo, any parser that tries to collect words would run
them together with any surrounding regular letters and digits. That seems
odd, but perhaps it's not any more odd than mixing Devanagari and Han.
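
As a rough illustration of the word-collection effect, the sketch below groups runs of characters whose gc starts with L or N. The `braille_as_lo` switch is a hypothetical override that pretends the Braille Patterns block (U+2800..U+28FF) had been given gc=Lo; nothing like it exists in `unicodedata`, and the function names are invented for this example.

```python
import unicodedata

BRAILLE = range(0x2800, 0x2900)  # Braille Patterns block

def category(ch, braille_as_lo=False):
    # Hypothetical override: treat Braille patterns as if they were gc=Lo.
    if braille_as_lo and ord(ch) in BRAILLE:
        return "Lo"
    return unicodedata.category(ch)

def collect_words(text, braille_as_lo=False):
    """Split text into runs of letters and numbers, judged by gc alone."""
    words, current = [], []
    for ch in text:
        if category(ch, braille_as_lo)[0] in "LN":
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words

sample = "abc\u2801\u2803def"  # Latin letters around two Braille cells
print(collect_words(sample))                      # Braille acts as a separator
print(collect_words(sample, braille_as_lo=True))  # everything runs together
```

With the default tables the sample yields two words, `abc` and `def`; with the override it yields a single run that includes the Braille cells.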

We've given Braille a script ID, since it's used for running text, unlike a
string of symbols.

There was a lot of discussion in the meeting, which is why the UTC is
asking for public input before deciding.

The original model for these was that your text processing is done in
non-Braille, and on the last leg to a device, you would transcode the
regular text to a Braille sequence using a domain- and language-specific
mapping. Having the codes in Unicode allows you to preserve 'final form'
and transmit that as needed w/o having to also transmit the text-to-braille
mapping(s) that were used to generate the Braille version of the text.
(This assumes that the eventual human reader can do 'autodetection'.)
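
A minimal sketch of that last-leg transcoding step follows. It relies on the way the Braille Patterns block is laid out (code point = U+2800 plus a bit mask in which dot n sets bit n-1). The three-letter table is a toy stand-in for a real domain- and language-specific mapping, and the names `dots_to_char`, `TOY_LETTERS` and `to_braille` are hypothetical.

```python
import unicodedata

def dots_to_char(*dots):
    """Build a Braille pattern character from dot numbers 1-8."""
    mask = 0
    for d in dots:
        mask |= 1 << (d - 1)   # dot n corresponds to bit n-1
    return chr(0x2800 + mask)

# Toy mapping for illustration only: three letters as in English braille.
TOY_LETTERS = {"a": (1,), "b": (1, 2), "c": (1, 4)}

def to_braille(text, table=TOY_LETTERS):
    """Transcode text to 'final form' Braille using the given table."""
    return "".join(dots_to_char(*table[ch]) for ch in text)

final_form = to_braille("cab")
for cell in final_form:
    print(f"U+{ord(cell):04X}", unicodedata.name(cell))
```

Only `final_form` needs to be transmitted; the table does not travel with it. The reverse direction is where the ambiguity sits: in English braille the same dots-1 cell reads as the digit 1 when it follows the number sign, so the cell alone does not say whether it is a letter or a digit.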

Needless to say, conceived this way, Braille does not fit neatly into
Unicode's text handling model. The General Category, being very simplistic,
can only express a single aspect of a character's use. Usually we can agree
on what that primary aspect is, so gc is reasonably useful as a quick cut.
However, Braille is a bit resistant if put to the question: Are you symbol
or letter?

In reality, the Braille codes are glyph codes. We decided at some point not
to allow any new types of gc values. If we didn't have that restriction, we
could assign them an *Sb or *Lb (for *Symbol-Braille or *Letter-Braille).
But that's an option we don't have.

One thing that we are hoping to learn is whether people are actually using
these Braille codes and are using them in ways that are or are not
compatible with the model we describe in
http://www.unicode.org/versions/Unicode4.0.0/ch14.pdf (see section 14.9).
In terms of the organization of the book we've clearly sorted Braille among
the symbols, by the way.

Any comments?

A./


