 Post subject: Normative status of the UTR#50 material
Posted: Fri Oct 14, 2011 2:19 pm
It is well understood in the Unicode community that the goal of The Unicode Standard and of the related standards and technical reports is to facilitate the reliable interchange of text and to help implementers, while giving maximum freedom to the users of the standard. To that end, we have a fairly clear separation between what needs to be mandatory and what is provided as a baseline that can be adapted to the circumstances.

A good example is line breaking. If you read UAX#14, you will see that the only things Unicode strongly mandates are that a line break occur at a newline character (because that's the essence of a newline character), and that combining sequences not be broken (because that's the essence of a combining sequence). In all other cases, the UAX#14 answer is only a suggestion, and is to be understood as "if you don't know better and don't have another, stronger indication, it's probably a good/bad idea to break here." Of course, we make it the best suggestion we can, hence the long list of line break classes and rules.
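
To make the mandatory/suggested split concrete, here is a rough Python sketch. It is not the UAX#14 algorithm: the function name, the return labels, and the "break after a space" default are all made up for illustration, only "\n" is treated as a newline, and "combining mark" is approximated by general category M*.

```python
import unicodedata


def classify_boundary(text, i, override=None):
    """Classify the boundary just before text[i], for 0 < i < len(text)."""
    prev, cur = text[i - 1], text[i]
    # Mandatory: a line break occurs after a newline character.
    # Simplification: only "\n" is recognized here.
    if prev == "\n":
        return "mandatory"
    # Prohibited: never separate a combining mark from its base.
    # Assumption: general category M* approximates "combining mark".
    if unicodedata.category(cur).startswith("M"):
        return "prohibited"
    # Everything else is only a suggestion, so a caller that knows
    # better (markup, user preference) takes precedence.
    if override is not None:
        return override
    # Toy stand-in for the UAX#14 rules: suggest a break after a space.
    return "allowed" if prev == " " else "not suggested"
```

The point is the ordering: the two hard cases are decided first and ignore `override`, while the default answer is consulted only when nothing stronger is available.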

It has come to my attention that I did not describe the normative status of the UTR#50 material and that this has led to some confusion. Sorry about that; I should have known better. Here is a proposed addition to the text, either at the end of section 1 or in a new section following it.

The properties and algorithms presented in this report are informative. The intent is to provide a reasonable determination of the spacing and orientation of characters in Japanese texts, which can be used in the absence of other information, but can be overridden by the context, such as markup in a document or preferences in a layout application. This determination is based on the most common use of a character, but in no way implies that that character is used only in that way.

In XML markup, assuming the existence of a <span> element, one could define two new attributes to explicitly specify the class and orientation of character occurrences, e.g.:

...25<span eac="cl-19.3" eao="U">Ω</span>...

to give an ideographic treatment to a Greek letter.
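
For implementers, the override could be resolved along these lines. This is only a sketch in Python: the eac and eao attribute names come from the example above, but the function name, the wrapping <root> element, and the placeholder default values are hypothetical and are not taken from the UTR#50 data. It handles a single level of nesting only.

```python
import xml.etree.ElementTree as ET

# Placeholder defaults, NOT real UTR#50 property values. A real
# implementation would look up each character in the UTR#50 data.
PLACEHOLDER_DEFAULT = {"eac": "default-class", "eao": "default-orientation"}


def resolve_runs(fragment, defaults=PLACEHOLDER_DEFAULT):
    """Return (text, class, orientation) runs for an XML fragment.

    Explicit eac/eao attributes on an element override the defaults;
    text outside any element falls back to the defaults.
    """
    # Wrap the fragment so that it parses as a single XML document.
    root = ET.fromstring(f"<root>{fragment}</root>")
    runs = []
    if root.text:
        runs.append((root.text, defaults["eac"], defaults["eao"]))
    for child in root:
        if child.text:
            # Markup wins over the informative default when present.
            eac = child.get("eac", defaults["eac"])
            eao = child.get("eao", defaults["eao"])
            runs.append((child.text, eac, eao))
        if child.tail:
            runs.append((child.tail, defaults["eac"], defaults["eao"]))
    return runs
```

Called on the fragment above, this yields three runs: the text before and after the span carries the placeholder defaults, and the Ω carries the explicit cl-19.3 / U values from the markup.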


