Re: Unicode and Security

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Feb 04 2002 - 20:47:26 EST


Gaspar wrote:

> > The BIDI algorithm is not reversible, and could not be made reversible
> > without eliminating features that are important to the bidi community.
> > This was considered at the time the bidi algorithm was developed.
>
> Hold on there! You admit that the Unicode algorithm is *really*
> not reversible? I was just bluffing because I just saw that there
> is no reverse algorithm published in the standard!

Of course it isn't reversible. (echoing John Cowan)

The bidi algorithm is a set of steps for going from a logical
representation of text to a specification of the *actual* directionality
for rendering in lines.

But there are inherent ambiguities in trying to reverse the
process, to go from line-rendered text display to a logical
representation of text. In addition to John Cowan's example
of ambiguity caused by assumption of the default rendering
order, you could always introduce extraneous embedding levels
that would resolve the same, or you could have otherwise
undetectable differences that would result in the same measurement
and display of text, such as one em space versus a sequence of
two en spaces.
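
To make the em space case concrete, here is a minimal sketch in Python.
It is not the bidi algorithm and not any real layout engine; it only
assumes the usual nominal widths (EM SPACE = 1 em, EN SPACE = 1/2 em) and
a made-up default advance for every other character. The names
ADVANCE, DEFAULT_ADVANCE and visible_layout are hypothetical.

```python
from fractions import Fraction

# Assumed nominal advance widths, in em units. The two space widths follow
# their usual definitions; the default for other characters is invented.
ADVANCE = {
    "\u2003": Fraction(1),     # EM SPACE
    "\u2002": Fraction(1, 2),  # EN SPACE
}
DEFAULT_ADVANCE = Fraction(1, 2)  # hypothetical width for everything else


def visible_layout(text):
    """Return (character, x offset) for each non-space character on one line."""
    x = Fraction(0)
    placed = []
    for ch in text:
        if not ch.isspace():
            placed.append((ch, x))
        x += ADVANCE.get(ch, DEFAULT_ADVANCE)
    return placed


one_em = "a\u2003b"        # a, EM SPACE, b
two_en = "a\u2002\u2002b"  # a, EN SPACE, EN SPACE, b

print(one_em == two_en)                                  # False
print(visible_layout(one_em) == visible_layout(two_en))  # True
```

Two different character sequences map to the same measured display, so
under these assumptions no function from the display back to the
characters can recover which one was stored.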

Gaspar continued:

> Can you imagine the implications of this? Imagine someone signing
> a digital Unicode document. He is looking at his viewer but
> what he signs is the ___bitstream___. So you claim that this guy
> who might have no connection to the software industry at all will be
> able to run an algorithm - that does not exist - in his head?

Reading and understanding the content of text is no guarantee
of being able to reverse a rendering process to intuit the
exact order of characters which was used to produce that
text -- ever. This is not merely a Unicode (and ISO 10646) issue,
but even crops up in the severely limited context of ASCII text
rendered with monowidth fonts. A trivial example of this can
be found in otherwise undetectable spaces at ends of lines, or
in ambiguities with regard to whether a particular spacing was
produced by tabulation or insertion of multiple spaces.
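
The ASCII case can be sketched the same way. This is only a toy model of
a monowidth display, assuming an 80-column row and tab stops every 8
columns; SCREEN_WIDTH, TAB_STOP and screen_row are hypothetical names.

```python
SCREEN_WIDTH = 80  # assumed display width, in character cells
TAB_STOP = 8       # assumed tab stop interval


def screen_row(line):
    """Return the row of cells a monowidth display would show for one line."""
    # Expand tabs to spaces, then pad with blank cells to the screen width.
    return line.expandtabs(TAB_STOP).ljust(SCREEN_WIDTH)[:SCREEN_WIDTH]


plain = "abc"
trailing = "abc   "           # same text plus three trailing spaces
tabbed = "a\tb"               # a, TAB, b
spaced = "a" + " " * 7 + "b"  # a, seven spaces, b

print(screen_row(plain) == screen_row(trailing))  # True
print(screen_row(tabbed) == screen_row(spaced))   # True
```

In both pairs the stored characters differ while the displayed row is
identical, which is all the argument needs.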

>
> > This thread is a waste of time.

I agree with Mark about that.

>
> If the Unicode bidi algorithm were reversible none of this
> would happen.

Nonsense.

> Software developers, who are flesh and blood
> people, would be able to do a clean room implementation of
> the algorithm and the reverse of it. The correctness of
> the software could be *automatically* checked by just
> reversing the view and checking it against the bitstream.

Think again.

>
> Instead of the automatic check, now there are test cases
> and if there is a nasty bug the reply is, oh well, sorry
> for that, and plug in another fix and test case.
>
> I feel I saw this attitude before... Is it only me?

'fraid so.

By the way, I just checked www.yudit.org and noted that among
the future plans for Yudit are:

 "* Waiting for a standard that makes more sense than Unicode
    and jump ship."

with "that makes more sense" pointing to http://www.bytext.org/
Oh ho! I think the readers of this list who considered the "virtues"
of ByText would find that an interesting indication of judgement.

--Ken
 


