L2/03-303

Date/Time: Mon Aug 25 06:11:13 EDT 2003
Contact: jhi@iki.fi
Report Type: Public Review Issue

* In "4. Pattern syntax" there are passages such as

    ... removing some characters that appeared inappropriate for
    patterns ...

    ... then some script-specific characters were removed, along with
    some other characters that appeared inappropriate for patterns.

  These explanations are quite vague and, without more rationale,
  appear arbitrary. I believe there were good reasons for the
  removals, but without the details the text sounds a bit haphazard.

* I propose adding an "Implementation Note", or something along those
  lines, that mentions one real complication of introducing Unicode
  into identifier names: in many programming languages some
  identifiers "leak" into the filesystem namespace as file and
  directory names. Java class names are one example. This means that
  the filesystems used must be capable of storing Unicode somehow,
  either through built-in support for a Unicode encoding such as UTF-8
  or UTF-16 (the latter being more problematic because of embedded
  zero bytes), or by convention (for example, filenames under certain
  directories could be known to be in UTF-8).
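
To illustrate the remark about embedded zero bytes, here is a minimal sketch in Python. The identifier "Größe" is a made-up example and not taken from any specification, and the helper has_nul is hypothetical. The sketch encodes the same name in UTF-8 and in UTF-16 and checks each byte sequence for zero bytes, which matter because many filesystem interfaces treat a name as a zero-terminated byte string.

```python
# Hypothetical identifier that "leaks" into the filesystem as a file name.
class_name = "Größe"

# Encode the same name in both candidate encodings.
utf8_name = class_name.encode("utf-8")
utf16_name = class_name.encode("utf-16-le")  # little-endian, no byte order mark


def has_nul(data: bytes) -> bool:
    """Return True if the byte sequence contains a zero byte."""
    return b"\x00" in data


print("UTF-8 :", utf8_name.hex(" "), "contains zero byte:", has_nul(utf8_name))
print("UTF-16:", utf16_name.hex(" "), "contains zero byte:", has_nul(utf16_name))
```

In UTF-16 every ASCII letter of the name carries a zero byte, so the name cannot be passed through an interface that stops at the first zero byte, whereas the UTF-8 form of a name without U+0000 never contains one.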