L2/07-392
Date: Wed, 17 Oct 2007 21:29:13 +0100
From: Michael Everson
Subject: Irish comments on L2/07-340 "OGHAM SPACE MARK shouldn't be whitespace"

Mark Davis' suggestion that U+1680 OGHAM SPACE MARK is not white space seems to be based on a misunderstanding of the use of the character. His assertion "Users of the UCD expect that whitespace characters are, well, white space -- that is, that they do not have visible glyph in normal usage" is, in our view, incorrect. Users of the **Ogham script** do, in fact, expect this character to act as a white space. Other users of the UCD are unlikely to have expectations about the Ogham script at all.

An Ogham font may have a stemline. The stemline represents the edge of a stone in Ogham inscriptions, or the conventional stemline written in ink in manuscripts. However, an Ogham font may also have **no** stemline. The annex describing font design in Irish Standard 434:1999 states this clearly:

=====
1. Cuid roghnach den chló is ea an líne láir. Is gnách líne láir a úsáid in Ogham na lámhscríbhinní agus sa chlódóireacht, ach ní gá sin. Nuair nach bhfuil líne láir in úsáid, ba cheart charachtar COMHARTHA OGHAIM SPÁS a dhearadh ar an ngnáthleithead, ach é a fhágaint bán ar aon dul le SPÁS.

1. The centre line is optional. In printing and in manuscript Ogham it is conventional to design with a centre line, but this is not necessary. In implementations without the centre line, the character OGHAM SPACE MARK should be given its conventional width, and simply left blank like SPACE.
=====

See http://www.evertype.com/standards/iso10646/pdf/is434.pdf

The OGHAM SPACE MARK **may** have a visible glyph. But it also **may not** have a visible glyph. And when it does have a visible glyph, that glyph is not actually part of the letter, any more than the stemline in OGHAM LETTER FEARN is a part of the letter. OGHAM LETTER FEARN is three strokes to the right of the edge of the stone.
That edge may be drawn in a font (in which case OGHAM SPACE MARK should have a visible glyph) or it may not be (in which case OGHAM SPACE MARK should not have a visible glyph). The stemline is more an indication of layout; it is not an integral part of the letter.

It is incorrect to claim that the conventional representation of OGHAM SPACE MARK is similar to ETHIOPIC WORDSPACE. ETHIOPIC WORDSPACE is a conventional set of dots used to separate words. This is not what OGHAM SPACE MARK is.

The definition of White Space given here:

http://www.unicode.org/Public/UNIDATA/UCD.html#White_Space

is: 'Those separator characters and control characters which should be treated by programming languages as "white space" for the purpose of parsing elements.' There is no statement that it actually has to be "white space" on the page. If you imagine writing a program to search for Ogham strings, then U+1680 would behave exactly as U+0020 does in searching Latin-script text, and indeed in UnicodeData.txt they have identical properties. Mark Davis' suggestion seems to be using a narrower definition, implied by the name "White Space".

According to UAX14 on line breaking: "The Ogham space mark is rendered visibly between words but should be elided at the end of a line." This is also the correct behaviour.

The Irish National Body requests that the Unicode Technical Committee make **NO CHANGE** to the properties of the OGHAM SPACE MARK. It is correctly specified at present. The present specification meets the needs of the user community, and we oppose the proposed change.

--
Michael Everson
Convener NSAI/ICTSCC/SC4 "Codes, Character Sets, and Internationalization"
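
The point made above about searching and parsing can be sketched in Python, whose standard `unicodedata` module exposes the UnicodeData.txt fields. This is an illustration only, under stated assumptions: the two Ogham sample words and the helper name `core_properties` are invented for the example, and the results reflect whichever Unicode version the Python build ships with.

```python
import unicodedata

OGHAM_SPACE_MARK = "\u1680"
SPACE = "\u0020"


def core_properties(ch):
    """Hypothetical helper: (general category, combining class, bidi class)."""
    return (
        unicodedata.category(ch),
        unicodedata.combining(ch),
        unicodedata.bidirectional(ch),
    )


# Two sample Ogham letter sequences, chosen only for the demonstration.
word_a = "\u168b\u1690\u168a\u1694"
word_b = "\u1683\u1693\u1689"

# The words are separated by OGHAM SPACE MARK rather than U+0020.
text = word_a + OGHAM_SPACE_MARK + word_b

# Compare the UnicodeData.txt-derived fields of the two space characters.
same_properties = core_properties(OGHAM_SPACE_MARK) == core_properties(SPACE)

# Whitespace-based tokenising treats U+1680 as a word separator.
words = text.split()

# A trailing OGHAM SPACE MARK is stripped like any other trailing white space.
trimmed = (text + OGHAM_SPACE_MARK).rstrip()

print(same_properties, len(words), trimmed == text)
```

If U+1680 were to lose its white space status, tokenising of this kind would stop separating Ogham words, which is the behaviour the user community relies on.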