Pre-proposal for SCSU updates

From: Doug Ewell
Date: Mon Nov 01 2010 - 15:50:51 CST


    I'd like to try to gauge the community's interest, if any, in some
    possible updates to UTS #6 and the SCSU mechanism, as follows:

    (1) Updating the spec to add dynamic-window offsets 0xA8 through 0xBF,
    to permit encoding the blocks from U+A000 through U+ABFF in single-byte
    mode. This would allow the many small alphabets assigned to this range,
    such as Bamum, Syloti Nagri, and Phags-pa, to be encoded efficiently
    using SCSU. Other offsets could be added as well, such as for Hangul
    Jamo Extended-B.
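
    To make item (1) concrete, here is a rough Python sketch of the
    offset-byte lookup. The first two ranges and the fixed offsets follow
    the table in UTS #6 as I understand it. The 0xA8..0xBF branch is only
    an assumption: 24 bytes times 0x80 code points happens to span exactly
    U+A000 through U+ABFF, so a linear mapping is the obvious guess, but
    the message does not spell out a formula. The function name and the
    "extended" flag are made up for illustration.

```python
# Fixed offsets assigned to bytes 0xF9..0xFF in the existing SCSU table.
FIXED_OFFSETS = {
    0xF9: 0x00C0,  # Latin-1 letters and part of Latin Extended-A
    0xFA: 0x0250,  # IPA Extensions
    0xFB: 0x0370,  # Greek
    0xFC: 0x0530,  # Armenian
    0xFD: 0x3040,  # Hiragana
    0xFE: 0x30A0,  # Katakana
    0xFF: 0xFF60,  # Halfwidth Katakana
}


def window_offset(x, extended=False):
    """Map a dynamic-window offset byte to the window's first code point.

    With extended=True, bytes 0xA8..0xBF use a HYPOTHETICAL mapping onto
    U+A000..U+ABFF; that branch is not part of the published spec.
    """
    if 0x01 <= x <= 0x67:
        return x * 0x80                    # U+0080 .. U+3380
    if 0x68 <= x <= 0xA7:
        return x * 0x80 + 0xAC00           # U+E000 .. U+FF80
    if extended and 0xA8 <= x <= 0xBF:
        # Assumption: linear half-block mapping, 0xA8 -> U+A000.
        return (x - 0xA8) * 0x80 + 0xA000  # U+A000 .. U+AB80
    if x in FIXED_OFFSETS:
        return FIXED_OFFSETS[x]
    raise ValueError(f"offset byte 0x{x:02X} is reserved")
```

    Under that assumed mapping, byte 0xB5 would select the window starting
    at U+A680, which contains the whole Bamum block (U+A6A0..U+A6FF).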

    (2) Updating the spec to assign "reserved" tag bytes 0x0C (single-byte
    mode) and 0xF2 (Unicode mode) as "reset all" commands, similar to 0xFF
    in BOCU-1. This would allow more efficient encoding in some cases, as
    well as providing a possible synchronization mechanism for decoders. As
    an alternative, these unused tag bytes could be released for normal,
    non-reserved use, so they would no longer require escaping.
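
    As a sketch of what a "reset all" in item (2) might do inside a
    decoder: the Python below assumes the command simply restores the
    initial SCSU state (single-byte mode, window 0 active, default
    dynamic-window offsets). That reading is an assumption, as are the
    class and function names; the tag values 0x0C and 0xF2 are the ones
    proposed above.

```python
from dataclasses import dataclass, field

# Initial dynamic-window offsets defined by SCSU for windows 0..7.
DEFAULT_DYNAMIC_WINDOWS = (
    0x0080, 0x00C0, 0x0400, 0x0600, 0x0900, 0x3040, 0x30A0, 0xFF00,
)

# Proposed (not yet assigned) "reset all" tag bytes from this message.
RESET_SINGLE_BYTE = 0x0C
RESET_UNICODE = 0xF2


@dataclass
class DecoderState:
    """Hypothetical container for the mutable state of an SCSU decoder."""
    unicode_mode: bool = False
    active_window: int = 0
    dynamic_windows: list = field(
        default_factory=lambda: list(DEFAULT_DYNAMIC_WINDOWS))

    def reset_all(self):
        # Assumed semantics: return to the state at the start of a stream.
        self.unicode_mode = False
        self.active_window = 0
        self.dynamic_windows = list(DEFAULT_DYNAMIC_WINDOWS)


def is_reset_tag(state, byte):
    """True if `byte` is the proposed reset tag for the current mode.

    Only meaningful when the decoder is expecting a tag or lead byte,
    not while it is reading the argument bytes of another command.
    """
    expected = RESET_UNICODE if state.unicode_mode else RESET_SINGLE_BYTE
    return byte == expected
```

    A decoder that lost track of its state could then scan forward for the
    reset tag and resume from a known state, which is the synchronization
    use mentioned above.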

    (3) Providing an informational section in UTS #6 on "line-safe SCSU," a
    special-purpose SCSU encoding profile in which all state is returned to
    the default at the end of each line, and all lines are terminated with

    I'm aware that many people have been discouraging the use of SCSU
    altogether, on the basis of Web-page security concerns or the reputation
    of SCSU as "difficult to implement." These people will not be affected
    one way or another by any enhancements to SCSU, and I am not focusing on
    them at present.

    Doug Ewell | Thornton, Colorado, USA |
    RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s
