Re: Furigana

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Aug 12 2002 - 17:37:12 EDT


Michael asked:

> At 12:11 -0700 2002-08-08, Kenneth Whistler wrote:
>
> >Ah, but read the caveats carefully. The Unicode interlinear
> >annotation characters are *not* intended for interchange, unlike
> >the HTML4 <ruby> tag. See TUS 3.0, p. 326. They are, essentially,
> >internal-use anchor points.
>
> What does this mean? That if I have a text all nice and marked up
> with furigana in Quark I can't export it to Word and reimport it in
> InDesign and expect my nice marked up text to still be marked up?

Yes, among other things.

>
> Surely all Unicode/10646 characters are expected to be preserved in
> interchange. What have I got wrong, Ken?

Your expectation that this stuff will actually work that way.

Yes, the characters will be preserved in interchange. But the
most likely result you will get is:

<anchor1>text<anchor2>annotation<anchor3>

where the anchors will just be blorts. You should not expect that
the whole annotation *framework* will be implemented, and certainly
not that these three characters will suffice for "nice[ly] marked up...
furigana".
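
As a rough illustration of the point above, here is a minimal Python sketch of
what such a sequence looks like to a receiver that only sees plain text. The
three code points (U+FFF9, U+FFFA, U+FFFB) are the interlinear annotation
characters; the helper names `annotate` and `describe` are hypothetical and
exist only for this example.

```python
import unicodedata

# The three interlinear annotation characters.
ANCHOR, SEPARATOR, TERMINATOR = "\uFFF9", "\uFFFA", "\uFFFB"
ANNOTATION_CHARS = (ANCHOR, SEPARATOR, TERMINATOR)


def annotate(base, annotation):
    """Hypothetical helper: wrap base text and its annotation in the anchors."""
    return f"{ANCHOR}{base}{SEPARATOR}{annotation}{TERMINATOR}"


def describe(text):
    """List the annotation characters found in text, with their Unicode names."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if ch in ANNOTATION_CHARS
    ]


sample = annotate("漢字", "かんじ")

# The code points survive in the string, but nothing here says how to lay
# the annotation out; that is left entirely to the receiving application.
for code_point, name in describe(sample):
    print(code_point, name)
```

The characters round-trip as data, but a receiver with no annotation framework
has only three format characters and two runs of text to work with.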

These animals are more like U+FFFC, the object replacement
character -- they are internal anchors
that should not be exported, as there is no general expectation
that once exported to plain text, a receiver will have sufficient
context for making sense of them in the way the originator was
dealing with them internally.

By rights, this whole problem of synchronizing the internal anchor
points for various ruby schemes should have been handled by
noncharacters -- but that mechanism was not really understood
and expanded sufficiently until after the interlinear annotation
characters were standardized.

--Ken


