L2/13-115

Author: Roozbeh Pournader (Google)
Title: Suggested text for migration section in UAX #9
Date: 2013-05-06

I had a UTC action item to supply some text for UAX #9, to be put in the new Migration/Implementation Notes section. The text documents the still-existing problem with the solidus (slash) and similar characters, described in L2/12-288:

http://www.unicode.org/L2/L2012/12288-solidus-and-bidi.pdf

Here is my suggested text:

"In Unicode 4.0.1 (released March 2004), the bidirectional type of some very common characters changed. Among others, these included U+002B PLUS SIGN, U+002D HYPHEN-MINUS, and U+002F SOLIDUS. The bidirectional type of PLUS SIGN and HYPHEN-MINUS changed from European Number Terminator (ET) to European Number Separator (ES), while that of SOLIDUS changed from European Number Separator (ES) to Common Number Separator (CS).

As a result, applications built with data tables from Unicode 4.0.0 and earlier display and edit bidirectional documents differently from those built with later data tables, a problem that persists to this day: As of 2013, the change has not been thoroughly implemented in some very popular pieces of software, and there are inconsistencies in how these characters are treated in bidirectional text, even among the latest versions of popular products from the same vendor, or even within the same product. Such discrepancies between implementations are expected to persist for several more years. Later changes in the Unicode Standard, including those made to the handling of paired brackets in Unicode 6.3, are expected to cause similar discrepancies in the interchange of bidirectional documents.

For example, the Unicode strings <062A, 0020, 0031, 002F, 0032> (which contains a pattern very common in displaying dates and fractions in Arabic and Persian) and <062A, 0020, 0061, 0028, 0062, 0029> are each displayed differently by different common browsers.
Web pages, plain text files, and other documents that use such strings without directional formatting characters are thus not interoperable.

In the meantime, document authors, and applications used to author bidirectional documents, are advised to use the relevant directional formatting characters or higher-level markup to make sure their documents are interoperable. Testing bidirectional documents in different viewers (especially browsers) is highly recommended.

Applications that display bidirectional documents, when handling text known to have been authored on a different platform or in an application known to use a certain version of the Unicode tables, may prefer to display the document as the author originally intended and saw it, instead of displaying it according to the latest version of the Unicode Bidirectional Algorithm or data tables. For example, a webmail application that is displaying an email authored on Windows 8 to a user using Windows 7 may insert directional formatting characters in order to achieve the originally intended display.

Archival systems are advised to store, together with each document, the version of the Unicode Bidirectional Algorithm and data tables used in creating it, so that the document can later be retrieved and displayed accurately. This is especially important for financial and legal documents, since the characters and patterns that changed behavior are common in such documents.

Note that this also affects documents and strings in non-Unicode character sets, such as those encoded in Windows-1256 or ISO/IEC 8859-8, as most modern implementations handle such documents by internally converting them to Unicode."
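
Illustration (not part of the suggested text): the two recommendations above, knowing which data tables are in use and protecting sensitive runs with directional formatting characters, could be sketched roughly as follows in Python. This is a minimal sketch under stated assumptions: the helper names bidi_classes and wrap_ltr are hypothetical, the standard unicodedata module stands in for an implementation's data tables, and the choice of LRE/PDF (rather than marks or higher-level markup) is only one possible option. The embedding is intended to keep the "1/2" run left-to-right regardless of whether SOLIDUS is treated as ES or CS.

```python
import unicodedata

LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def bidi_classes(text, db=unicodedata):
    """Return the Bidi_Class of each character according to the given database.

    Hypothetical helper; `db` is any object exposing bidirectional(),
    such as the unicodedata module or unicodedata.ucd_3_2_0.
    """
    return [db.bidirectional(ch) for ch in text]


def wrap_ltr(run):
    """Wrap a run in LRE ... PDF (hypothetical helper, one possible strategy)."""
    return LRE + run + PDF


# Classes of PLUS SIGN, HYPHEN-MINUS, and SOLIDUS under the interpreter's
# current tables, and under the Unicode 3.2.0 tables that Python also ships.
print(unicodedata.unidata_version, bidi_classes("+-/"))
print("3.2.0", bidi_classes("+-/", unicodedata.ucd_3_2_0))

# First example string from the text, <062A, 0020, 0031, 002F, 0032>,
# with the "1/2" run explicitly embedded.
protected = "\u062A " + wrap_ltr("1/2")
print([f"{ord(ch):04X}" for ch in protected])
```

Comparing the two printed class lists shows whether a given set of tables predates the Unicode 4.0.1 change; an archival system could record unicodedata.unidata_version (or its equivalent) alongside each document.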