L2/07-194
Date/Time: Tue May 15 00:01:07 CDT 2007
Contact: yacob@geez.org
Name: Daniel Yacob
Report Type: Public Review Issue
Opt Subject: PR #96 Comment

This comment is available as an MS Word document, and I can submit it via email if requested.

Ethiopic Wordspace (U+1361) in Identifiers

This is a multipart comment concerning the use of the Ethiopic wordspace character in identifiers. The Ethiopic wordspace symbol is presently in the “Punctuation Other” class. It is proposed here that the symbol be given a joiner property within identifiers, to the same degree that hyphen and underscore act as joining characters in identifiers. The symbol serves the same fundamental purpose of providing a non-intrusive visual word boundary.

As a joining symbol in the identifier context, it is further proposed that the Ethiopic wordspace be optional. When present it will be ignored, and string interpreters may elide the symbol from an identifier. The intention here is to avoid potential phishing problems that may occur in IDN usage. Rather than add the elision rule as a special rule for IDNs, it is recommended for all identifiers.

For example:

የኢትዮጵያ፡ንግድ፡ባንክ.com
የኢትዮጵያንግድ፡ባንክ.com
የኢትዮጵያ፡ንግድባንክ.com
የኢትዮጵያንግድባንክ.com
የ፡ኢ፡ት፡ዮ፡ጵ፡ያ፡ን፡ግ፡ድ፡ባ፡ን፡ክ.com

would all be equivalent, referring to the same domain and requiring only a single registration.

Given the syllabic nature of the writing system, occasions where the elision rule causes two distinct, meaningful word sequences to collapse into the same concatenation should be very rare. That is, የኢትዮጵያ፡ንግድ፡ባንክ should be the only meaningful segmentation of የኢትዮጵያንግድባንክ, and it should be very rare that another segmentation, like የኢትዮጵያን፡ግድባንክ, would also be meaningful.

Validity Pattern:

The wordspace must be both preceded and followed by a non-punctuation (i.e. letter or numeric) Ethiopic symbol:

/[\p{Ethiopic} -\p{Punctuation Symbol}]፡?[\p{Ethiopic} -\p{Punctuation Symbol}]/

IDN Example:

hxxp://www.የኢትዮጵያንግድባንክ.com/        hard to read
hxxp://www.የኢትዮጵያ-ንግድ-ባንክ.com/      minor strain to read
hxxp://www.የኢትዮጵያ፡ንግድ፡ባንክ.com/      easiest, most natural to read

Programming Language Example:

$አዲስስም = …      # hard to read
$አዲስ_ስም = …     # looks bizarre
$አዲስ፡ስም = …     # easiest, most natural to read
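
The elision rule and the validity pattern above can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the proposal itself: the function names are hypothetical, the test for "Ethiopic" approximates the script property by checking the Unicode character name (the Python standard library does not expose script properties), and "non-punctuation" is read as general category Letter or Number, following the "letter or numeric" wording.

```python
import unicodedata

WORDSPACE = "\u1361"  # ETHIOPIC WORDSPACE


def is_ethiopic_letter_or_number(ch):
    """Approximate [\\p{Ethiopic} minus punctuation] for a single character.

    Assumption: a character counts as Ethiopic when its Unicode name starts
    with "ETHIOPIC", and as non-punctuation when its general category is a
    Letter (L*) or Number (N*).
    """
    name = unicodedata.name(ch, "")
    category = unicodedata.category(ch)
    return name.startswith("ETHIOPIC") and category[0] in ("L", "N")


def is_valid_label(label):
    """Check that every wordspace sits between two Ethiopic letters/numbers."""
    for i, ch in enumerate(label):
        if ch != WORDSPACE:
            continue
        # A wordspace may not begin or end the identifier.
        if i == 0 or i == len(label) - 1:
            return False
        if not (is_ethiopic_letter_or_number(label[i - 1])
                and is_ethiopic_letter_or_number(label[i + 1])):
            return False
    return True


def canonical_label(label):
    """Return the label with every wordspace elided.

    Raises ValueError when a wordspace appears in an invalid position.
    """
    if not is_valid_label(label):
        raise ValueError("wordspace must be surrounded by Ethiopic letters or numbers")
    return label.replace(WORDSPACE, "")
```

Under these assumptions, applying canonical_label to the label part of any of the five spellings in the example above yields the same unbroken string, so a registry comparing canonical forms would treat them as a single registration. A label that begins or ends with the wordspace, or that contains two wordspaces in a row, is rejected.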