From: Asmus Freytag (email@example.com)
Date: Tue Mar 30 2004 - 01:10:35 EST
At 12:19 PM 3/29/2004, Ernest Cline wrote:
> > [Original Message]
> > From: Peter Kirk <firstname.lastname@example.org>
> > On 29/03/2004 06:56, John Cowan wrote:
> > >Peter Kirk scripsit:
> > >
> > >>Using NBSP rather than SPACE has several advantages, and has long
> > >>been specified in Unicode, although not widely implemented. It is less
> > >>likely to occur accidentally. But it has disadvantages, especially that
> > >>it will always be a spacing character, whereas for display of isolated
> > >>Indic vowels no extra spacing is required.
> > >
> > >You don't actually say so, but you give me the impression that you think
> > >NBSP is a fixed-width space. It isn't; it can assume any width greater
> > >than zero, just as SPACE can; in particular, when used before a NSM, I
> > >would expect it to have the same width as the NSM.
> > Well, as I understand it NBSP is often expected to be a fixed-width
> > space, and it is in many implementations. In fact I think it ought to
> > be, whether or not this is actually specified. But there ought to be a
> > character which is explicitly NOT fixed width to carry NSMs. Also
> > you do say that NBSP must have a width greater than zero, but for
> > some combining marks (those which are not non-spacing, and
> > arguably even some which are) this base character should have
> > zero width.
>UAX #14 makes a rather definitive statement on this issue, albeit
>in an obscure place, in Section 3: Introduction.
4.0.1 will amend that section to correct the wrong impression that NBSP is
fixed width and to clarify that this statement is not intended to cover any
specialized cases, but just ordinary typographical conventions:
When expanding or compressing inter-word space according to common
typographical practice, only the spaces marked by U+0020 SPACE,
U+00A0 NO-BREAK SPACE, and U+3000 IDEOGRAPHIC SPACE are subject
to compression, and only spaces marked by U+0020 SPACE,
U+00A0 NO-BREAK SPACE, and occasionally spaces marked by
U+2009 THIN SPACE are subject to expansion. All other space
characters normally have fixed width. When expanding or
compressing inter-character space the presence of
U+200B ZERO WIDTH SPACE or U+2060 WORD JOINER is always ignored.
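
To make the note above concrete, here is a minimal sketch of how a justification routine might encode it. The set names and the two helper functions are hypothetical, invented for illustration only; they come from no real layout engine and are not part of UAX #14. The "occasionally" attached to THIN SPACE is flattened to plain membership here, which a real implementation would presumably make configurable.

```python
# Sketch only: all names below are assumptions made for illustration.
SPACE = "\u0020"
NBSP = "\u00A0"
THIN_SPACE = "\u2009"
IDEOGRAPHIC_SPACE = "\u3000"
ZWSP = "\u200B"
WORD_JOINER = "\u2060"

# Spaces that may shrink when inter-word space is compressed.
COMPRESSIBLE = frozenset({SPACE, NBSP, IDEOGRAPHIC_SPACE})

# Spaces that may grow when inter-word space is expanded.
# THIN SPACE is only "occasionally" expanded; treated as expandable here.
EXPANDABLE = frozenset({SPACE, NBSP, THIN_SPACE})

# Characters whose presence is ignored when adjusting inter-character space.
IGNORED_FOR_LETTERSPACING = frozenset({ZWSP, WORD_JOINER})


def justification_role(ch):
    """Return (can_compress, can_expand) for a single character."""
    return (ch in COMPRESSIBLE, ch in EXPANDABLE)


def letterspacing_units(text):
    """Return the characters that count when adjusting letter spacing."""
    return [ch for ch in text if ch not in IGNORED_FOR_LETTERSPACING]
```

Under this reading NBSP falls in both sets, exactly like SPACE, while a character such as U+2003 EM SPACE falls in neither and so keeps its fixed width.
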
I'm sorry if the placement and context of the text were not enough
to guide the reader. Note that the 'obscure place' was in the
introduction (!) of the UAX, where it was a mere note on a subject not
actually covered by the UAX (i.e. line layout) that nevertheless forms
the context in which linebreaking happens.
Next, people will extract normative statements from the book cover. ;-0
Now that this is settled, all can go on discussing the main point:
>While one can argue as to whether this has anything to do with the
>effect on the width of NBSP with a combining character following
>it or not, it is clear that one should not assume that NBSP
>is treated exactly the same as SPACE except for not breaking a line.
>Indeed, I would prefer to see NBSP treated as a fixed-width character
>that would only be affected by letter spacing in all contexts, including
>when it has an attached combining character.
>The idea of an explicit character to be used as a combining
>character base has merit in my opinion, but only if an acceptable
>standardization of the behavior of combining characters with some
>other character such as SPACE cannot be achieved so that it would
>always be expected to produce an isolated combining character.
>(except when in an intentional "show the codes" mode)