Re: Furigana

From: Tex Texin (tex@i18nguy.com)
Date: Tue Aug 13 2002 - 22:21:03 EDT


Thanks, Ken. I don't know how I missed the text on p. 326 when I scanned it
before I mailed.
tex

Kenneth Whistler wrote:
>
> Tex asked:
>
> > But does the standard address their removal by receivers (or
> > intermediaries), and does removing them include removing the contained
> > annotation?
>
> Yes and yes. p. 326:
>
> "On input, a plain text receiver should either preserve all characters
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> or remove the interlinear annotation characters as well as the annotating
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> text..."
> ^^^^
>
> >
> > I can imagine an application that doesn't support I.A. deciding the
> > annotation is out of band and can't be preserved in its plain text
> > output, and so justifiably strips it as well.
> > Does the standard say what to do with "for internal use" only
> > characters?
>
> Yes. Unicode 3.1:
>
> D7b: Noncharacter: a code point that is permanently reserved for
> internal use, and that should never be interchanged.
>
> C10: A process shall make no change in a valid coded character
> representation other than the possible replacement of
> character sequences by their canonical-equivalent sequences
> or the deletion of noncharacter code points, if that process
> purports not to modify the interpretation of that coded
> character sequence.
>
> The interlinear annotation characters fall in a gray zone, since
> they are not noncharacters, but by rights ought to have been.
> Since they are standard characters though, the standard has to
> provide some guidelines -- and it is simply safer, if you encounter
> and delete them, to also delete the annotation. You would be changing
> the interpretation of the text, but in a knowing, intended manner.
>
> >
> > I would have thought the rule was to ignore and pass along.
>
> In general, yes, as for everything else, including unassigned
> code points. If your role in life is as a database, for example,
> or some other kind of data source or data pipe, then minimal
> meddling with the bytes is safest. But other kinds of processes
> will do graduated manipulations, depending on what they are
> aiming for.
>
> --Ken

-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com
                         
XenCraft		            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------


