20 May, 1994

I'd like to make two clarifications to the Andy Feibus column that discussed Unicode in the April 25, 1994 edition of _Open Systems Today_.

1. Mr. Feibus' statement that "ISO incorporated this work [Unicode] into ISO 10646 . . . ", while true in itself, misses an important aspect of the history behind ISO/IEC 10646-1:1993 and Unicode Version 1.1: the ISO 10646 working group and the Unicode Consortium agreed to merge their respective codes into one code. Let me give you some background.

In 1991, the ISO working group developing 10646 and the Unicode Consortium had the same goal of developing a multi-byte code for the world's characters but were proceeding along divergent paths. In particular, the coding philosophies and implementations were incompatible, so that conversion between the two codes would be, at best, extremely difficult. At the time, it was clear (a) that several vendors were already writing Unicode support into their software and (b) that many countries would require compliance with the future 10646 international standard in their procurements. Consequently, many information system vendors were deeply concerned about the expense of supporting two incompatible multi-byte codes.

I represent the SHARE organization, an association of IBM customers, to the U.S. X3L2 technical standards committee for codes and character sets (for example, ASCII, X3.4:1986). SHARE members had too many difficulties converting between the 100-character 7-bit ASCII and 8-bit EBCDIC codes to want to face the conversion problems inherent in two multi-byte codes with character sets of at least 65,000 characters. Although some people disagreed with my rhetoric, I termed the coming situation "a disaster for the information industry."

What resulted from these customer and vendor concerns was one of the little-known successes in our industry.
Starting in May, 1991, people from the Unicode Consortium and the ISO working group met three times to negotiate an agreement to merge the two codes. After much give and take on both sides, they agreed on a merger of features from both Unicode V1.0 and the ISO/IEC draft international standard 10646:1990. In 1992, the national standards organizations voted to adopt the second version of the ISO/IEC 10646 draft international standard (the merged code) as an international standard. ISO published the standard as ISO/IEC 10646-1:1993, and it is available from ANSI (the American National Standards Institute) in New York City (phone 212/642-4900). Meanwhile, the Unicode Consortium modified its code so that Unicode V1.1 complies with the 2-byte form of the ISO/IEC 10646-1 standard. Since then, the ISO working group and the Unicode Consortium have continued to cooperate on enhancing the standard. Information about the Unicode Consortium may be obtained by calling 408/777-5870 or via e-mail from unicode-inc@unicode.org.

Without the merger, our computers would have wasted countless cycles converting between Unicode and 10646.

The main point, again, is that although the ISO/IEC 10646-1 standard incorporated most of the Unicode V1.0 features (as stated in the column), it also included important features of earlier draft versions of 10646. One was the ability of 10646 to encode up to about 2,000,000,000 characters (with its 4-byte form). Although the Unicode code space (that is, the 2-byte form of 10646), comprising 65,000 characters, is likely sufficient for the commercial market, the larger coding space available with the 4-byte form of 10646 will give bibliographers and scholars access to the characters in rare and dead writing systems. The Unicode Consortium's latest estimate for the number of characters to be encoded is around 250,000 characters. This is a far cry from the 65,000 code-space limitation of the original Unicode 1.0 version.

2. Mr.
Feibus also stated, "Unicode specifies that strings be stored in their natural order". That is true, but he continued with "--for instance, Hebrew from right to left, Latin languages from left to right." The continuation is right in that some writing systems are right to left and others are left to right, but the _ordering_ of characters in the computer is always the same. Unicode and 10646 store character strings in all languages in the same (natural) order from the first character in the string to the last one. However, the rendering process that displays and prints the character string must decide that strings of Hebrew characters are rendered from right to left on the screen or on the paper, and that strings of Latin characters are rendered from left to right. In summary, storage of Unicode and 10646 strings is from first character to last, but the rendering process makes the right-to-left and left-to-right distinctions based on the writing system.

These represent my personal views rather than those of SHARE, the Applied Physics Laboratory or the standards committee.

Edwin Hart
Chairman of the U.S. X3L2 technical standards committee for codes and character sets
The Johns Hopkins University Applied Physics Laboratory
Laurel, MD 20723-6099
Edwin.Hart@jhuapl.edu