L2/05-065
Date: Tue, 8 Feb 2005 13:39:30 -0500
From: Mark Davis
Subject: UTR #36 issues

My notes from the last meeting for the document include the following:

- Add a description of in-script cases like U+0110 (Đ) Latin Capital Letter D With Stroke versus U+00D0 (Ð) Latin Capital Letter Eth, and also cases of visually confusable punctuation.
- Fix 2a/b to be parallel to other cases.
- Describe problems with number parsing, including cases like U+0B68 (୨) Oriya Digit Two, which looks like a 9. Software that looks at just the numeric value of a sequence of characters will interpret the number differently than the user expects.
- Give more background as to why normalization fixes certain problems, and which problems it does not fix.
- Describe how implementations of normalization can use a small data set limited to the supported characters.
- Describe the recommended use of normalization in the non-domain part of a URL.
- Describe how reverse bidi (visual order -> storage order) can be used to detect bidi spoofs. That is, one can apply bidi and then reverse bidi; if the result does not match the original, reject the string.
- Explain that private use characters can cause security problems, and recommend against their use.
- Fonts: they should follow the Unicode recommendations for missing glyphs, making visible distinctions among them.
- Describe best practices for invisible glyphs.
- Describe cases in complex scripts (e.g. Indic) where, in the right context, the same visual appearance may result from two different underlying character sequences.
- Add more description of the recommended use of tool-tips and other mechanisms for alerting users.

If people have other items, I'd appreciate feedback (and text for inclusion!).
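
To make the number-parsing item concrete, here is a minimal Python sketch, not proposed text for the document. It shows that a standard parser accepts a digit string mixing ASCII with U+0B68 Oriya Digit Two, and that one possible check is to require all digits to come from the same decimal block. The helper names `digit_zero` and `is_single_script_number` are hypothetical, and the check assumes that decimal digits (general category Nd) are encoded in contiguous runs from zero to nine.

```python
import unicodedata


def digit_zero(ch):
    """Return the code point of the zero digit in ch's decimal block, or None.

    Assumption: decimal digits are encoded contiguously (0..9), so
    subtracting the digit value from the code point yields the block's zero.
    """
    value = unicodedata.decimal(ch, None)
    if value is None:
        return None  # not a decimal digit at all
    return ord(ch) - value


def is_single_script_number(text):
    """True if text is non-empty and all characters are digits from one block."""
    zeros = {digit_zero(ch) for ch in text}
    return None not in zeros and len(zeros) == 1


ascii_number = "12"
mixed_number = "1\u0B68"  # ASCII 1 followed by U+0B68 ORIYA DIGIT TWO

# The parser looks only at numeric values, so the mixed string is read as twelve,
# although a reader may see something closer to "19".
print(int(mixed_number))
print(is_single_script_number(ascii_number))
print(is_single_script_number(mixed_number))
```

A check like this rejects the mixed string but still accepts a number written entirely in Oriya digits, so it addresses only the mixing case, not the visual similarity between ୨ and 9 itself.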