Resolved Public Review Issues

Resolved Public Review Issues 1-99

This page lists recent Public Review Issues numbered 1 - 99 which have been resolved, in reverse order by issue number. The link on the title points to a background document if any is available. For open issues, please see the Public Review Issues page. For later resolved public review issues, please see the next Resolved Issues page.

99 Proposed Draft UTR #33, Unicode Conformance Model 2007.01.30

The UTC has released an updated draft of the technical report describing the conformance model for the Unicode Standard. Review and feedback are welcome.

Resolution: Closed 2007-02-15. The draft will be updated with feedback and published.

98 Ideographic Variation Database Submission 2007.03.15

The Ideographic Variation Database provides a registry for collections of unique variation sequences containing unified ideographs, allowing for standardized interchange according to UTS#37, Ideographic Variation Database. A submission to the Ideographic Variation Database has been received for: "Combined registration of the Adobe-Japan1 collection and of sequences in that collection". Details are in the background document.

Resolution: Closed 2007-03-20. A new submission will be made to incorporate the feedback.
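
As a rough illustration of what a variation sequence looks like in text, the Python sketch below scans a string for a character followed by a variation selector from the Variation Selectors Supplement block (U+E0100..U+E01EF). The helper name and the sample string are assumptions made for illustration. The sketch does not consult the database, so it cannot tell whether a sequence is actually registered, and it does not check that the base character is a unified ideograph.

```python
# Code point range of the Variation Selectors Supplement block.
VS_SUPPLEMENT = range(0xE0100, 0xE01F0)

def find_variation_sequences(text):
    """Return (base, selector) pairs where a selector follows another character."""
    pairs = []
    for prev, cur in zip(text, text[1:]):
        if ord(cur) in VS_SUPPLEMENT and ord(prev) not in VS_SUPPLEMENT:
            pairs.append((prev, cur))
    return pairs

# Hypothetical sample: an ideograph followed by U+E0100, then a second ideograph.
# The sequence is not checked against any registered collection.
sample = "\u845B\U000E0100\u57CE"
for base, selector in find_variation_sequences(sample):
    print(f"U+{ord(base):04X} U+{ord(selector):05X}")
```

A real implementation would compare each pair against the sequences listed in the registered collections rather than accept any base and selector combination.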

97 Proposed Draft UTR #38, A User's Guide to the Unihan Database 2007.05.08

The Unihan database is the repository for the Unicode Consortium’s collective knowledge regarding the CJK Unified Ideographs contained in the Unicode Standard. It contains mapping data and additional information to help implement support for the various languages which use the Han ideographic script. A new proposed draft UTR #38, A User's Guide to the Unihan Database, is available for public review and comment.

Resolution: Closed 2007-05-29. The proposed draft UTR will move forward to become a draft UAX.

96 Allowing Joiner Characters in Identifiers 2007.05.14

The use of format characters in identifiers is problematical because the formatting effects they represent are considered merely stylistic or otherwise out of scope for identifiers. To make matters worse, it's possible to misapply format characters such that users can create strings that look the same but actually contain different characters. For these reasons format characters are normally excluded from Unicode identifiers. The background document discusses a proposal to allow joiners in identifiers in certain contexts.
Background document updated 2007/04/30.

Resolution: Closed 2007-05-29. The text and rules resulting from the PRI will be incorporated into an appendix of UAX #31.
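
The visual-confusion problem described above can be sketched in Python with the standard `unicodedata` module. The example assumes U+200D ZERO WIDTH JOINER as the format character, and the blunt "strip every Cf character" policy is only an illustration, not the contextual rules proposed in the background document.

```python
import unicodedata

plain = "ab"
with_joiner = "a\u200Db"   # U+200D ZERO WIDTH JOINER between the two letters

def strip_format_chars(s):
    """Drop characters whose General Category is Cf (format)."""
    return "".join(ch for ch in s if unicodedata.category(ch) != "Cf")

print(plain == with_joiner)                      # False: different code points
print(unicodedata.category("\u200D"))            # Cf
print(strip_format_chars(with_joiner) == plain)  # True once Cf is removed
```

The two strings usually render identically, which is why identifier syntaxes tend to exclude format characters unless a specific context calls for them.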

95 Stable Normalization Process 2006.10.31

The UTC is considering adding the specification of a "Stable Normalization Process" to UAX #15, Unicode Normalization Forms, and requests public feedback on the proposed specification. For details, see the background document.

Resolution: Closed 2006-11-27. The UTC decided to produce a proposed update for UAX #15 that defines a "normalization process for stable strings" (NPSS), and post for review.
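
For readers unfamiliar with the normalization forms that UAX #15 defines, here is a minimal Python sketch using the standard `unicodedata` module. It shows ordinary NFC normalization only; it is not an implementation of the proposed normalization process for stable strings, and the sample string is an assumption chosen for illustration.

```python
import unicodedata

decomposed = "e\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)

print(composed == "\u00E9")                                # True: composed to U+00E9
print(unicodedata.normalize("NFC", composed) == composed)  # True: already in NFC
```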

94 Proposed Update to UTS #10: Unicode Collation Algorithm 2006.07.31

Draft UTS #10 is available for public review. The main changes are the incorporation of informative material from UTN #9, and updated references to Unicode 5.0 and to UCA 5.0 data tables.
Updated 2006.06.13. The data files for UCA 5.0 are now available for public review:
https://www.unicode.org/Public/UCA/5.0.0/allkeys-5.0.0.txt
https://www.unicode.org/Public/UCA/5.0.0/
The only changes compared to UCA 4.1 are:

The addition of weights for Unicode 5.0 characters

Changed weights for eight Unicode 4.1 characters:
a. to match new lowercase forms
       Ⅎ   U+2132 TURNED CAPITAL F
       Ↄ   U+2183 ROMAN NUMERAL REVERSED ONE HUNDRED
b. in accordance with feedback
       ⵯ   U+2D6F TIFINAGH MODIFIER LETTER LABIALIZATION MARK
c. reconciling multiple numeric variants within a script
       ৴   U+09F4 BENGALI CURRENCY NUMERATOR ONE
       ...
       ৷    U+09F7 BENGALI CURRENCY NUMERATOR FOUR
d. ensuring last position in the script
       ്   U+0D4D MALAYALAM SIGN VIRAMA

Resolution: Closed 2006-07-17, after formal approval of the UCA 5.0 draft by UTC letter ballot.

93 Representation of Malayalam /au/ Vowel in Traditional and Reformed Orthography 2006.05.09

The vowel /au/ in Malayalam is represented differently in the traditional and reformed orthographies. This public review issue relates to the representation of these vowels. Details are provided in the background document.

Resolution: Closed 2006-05-30. After reviewing all feedback, the UTC decided to accept option "A" of the background document and use a different spelling between traditional and reformed orthographies for Malayalam vowel /au/.

92 Proposed Draft UTS #40: BOCU-1 MIME-Compatible Unicode Compression 2006.05.09

This document describes a Unicode compression scheme that is MIME-compatible, directly usable for e-mail, and preserves binary order (for databases and sorted lists). It replaces UTN #6 and adds a formal description of the algorithms, without substantially changing the specification.

Resolution: Closed 2006-05-30. No action has been taken. The UTC thanks reviewers for their feedback.

91 Proposed Update to UAX #9: The Bidirectional Algorithm 2006.05.09

Conformance to the Unicode Bidi Algorithm (UAX #9) has been tightened in the area of bidi mirroring. The list of characters with the Bidi_Mirrored property has also been extended for consistency. Several other editorial clarifications have been made.

Resolution: Closed 2006-05-30. The UAX will be updated and published for Unicode 5.0.

90 Unicode 5.0 Beta 2 2006.05.09

The Unicode Consortium has decided to issue Beta 2 of the Unicode Character Database for Unicode 5.0. This extends the feedback period until May 9, 2006. The relevant public review issues for UAXes have also been extended to the same date. Some information on changes and updates to the UAX #9 beta will be announced soon. Data files will also be updated during this period.
During the extended beta review period for Unicode 5.0, the UTC is seeking feedback on potential errors or inconsistencies in all of the data files. However, please note that some of the character properties will be frozen as of March 1. The freeze will apply to all properties defined in the following files: UnicodeData.txt, Scripts.txt, and EastAsianWidth.txt; to a specific list of properties from PropList.txt (White_Space, Hex_Digit, Diacritic, and Ideographic); and to two derived properties, Numeric_Value and Numeric_Type. Substantive feedback received after March 1 regarding any of those properties will be recorded and taken into consideration in review of future versions of the standard, but will not be reflected in modifications for Unicode 5.0.

The Unicode 5.0 beta data files are available at https://www.unicode.org/Public/5.0.0/ucd/. General information regarding these data files is available at https://www.unicode.org/Public/5.0.0/.

Resolution: Closed 2006-05-30. The UTC thanks reviewers for their feedback.

89 Proposed Update to UTR #23: Unicode Character Property Model 2006.05.09

This proposed update reflects changes to the definitions that are planned for the forthcoming Unicode Version 5.0 and includes a new section on the difference between code point properties and abstract character properties.

Resolution: Closed 2006-05-30. The UTR will be updated and published.

88 Proposed Update to UAX #14: Line Breaking Properties 2006.05.09

The UTC has modified the conformance clauses of UAX #14 and the text they reference. These changes clarify precisely what is tailorable in conformant implementations and what is not. The non-tailorable results are limited to interactions among a small set of well-defined core characters, such as CR, LF, NBSP, SP, and so on, where the semantics of the characters is bound up in how they linebreak.
Please see the background document for details of other changes and items to review.

Resolution: Closed 2006-05-30. The UAX will be updated and published for Unicode 5.0.

87 Proposed Update to UAX #24: Script Names 2006.05.09

This proposed update contains a proposed change in default script value for unassigned characters from Common to a new value Unknown, and a correction for the contents of the Script=Inherited value.

Resolution: Closed 2006-05-30. The UAX will be updated and published for Unicode 5.0.

86 Proposed Update to UAX #15: Unicode Normalization Forms 2006.05.09

There are no substantive changes in this version of UAX #15. Sections were added to clarify stability and versioning issues, and to make some formatting changes for Unicode 5.0.

Resolution: Closed 2006-05-30. The UAX will be updated and published for Unicode 5.0.

85 Proposed Update to UAX #31: Identifier and Pattern Syntax 2006.05.09

Clarifying text has been added for ideographs and the use of additional characters in identifiers.

Resolution: Closed 2006-05-30. The UAX will be updated and published for Unicode 5.0.

84 Proposed Update to UAX #29: Text Boundaries 2006.05.09

A number of changes have been made to simplify implementations and cover edge cases in the rules.

Resolution: Closed 2006-05-30. The UAX will be updated and published for Unicode 5.0.

83 Changing Glyph for U+047C/U+047D Cyrillic Omega with Titlo 2006.08.01

UTC has received information indicating that the glyphs for U+047C and U+047D should be changed. In the accompanying figure below, the current shape is shown on the left. The proposed new shape is shown on the right. UTC will move to implement this change if no information to the contrary is received by the end of the review period.

Resolution: Closed 2006-08-22. Cyrillic Omega with Titlo is left unchanged. The assessment of the UTC is that changing the glyph to be the glyph for "beautiful omega" is inappropriate, and any such character should be encoded separately.

82 Representation of Gurmukhi Double Vowels 2006.01.30

In older Gurmukhi, some texts use two vowel signs on a single consonant; for example, one can find ga with both the oo and u vowel signs. A priori, this can be represented in Unicode using two different sequences. Details of a proposal regarding this situation are found in the background document.

Resolution: Closed 2006-03-03. UTC decided to use the first sequence, top then bottom, for Gurmukhi double vowels.

81 Proposed Update to UAX #34: Unicode Named Character Sequences 2006.05.09

A provisional process for the approval of named character sequences has been added to the text of this UAX. A data file containing provisional named character sequences is now available, separate from the list of approved named character sequences. See: https://www.unicode.org/reports/tr34/tr34-4.html Please review the provisional entries in NamedSequencesProv.txt, as well as the proposed text of the update.

Resolution: Closed 2006-05-30. The UAX will be updated and published for Unicode 5.0.

80 Proposed Update to UAX #9: The Bidirectional Algorithm 2006.01.30

The Unicode Bidi algorithm has allowed for a great deal of flexibility in determining which characters are to be mirrored (see HL6 https://www.unicode.org/reports/tr9/tr9-16.html#HL6). Unfortunately, that means that text that originates with one person may show up with the wrong graphic to another, thus causing the text to be misinterpreted. The proposal is to tighten up conformance by eliminating overriding of bidi mirroring, and at the same time extending the characters with the Bidi_Mirrored property.
The UTC would like public feedback on whether to make this change, and which characters should have the Bidi_Mirrored property. The proposed change is to retain the set of characters currently having Bidi_Mirrored property, and add some additional characters with similar properties. For further information and character lists, see the background document for this issue.

Resolution: Closed 2006-02-25. Some changes were made in the draft as a result of feedback and UTC decisions. A new public review issue will be posted.
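
To see which characters carry the Bidi_Mirrored property in a given runtime, a short Python sketch with the standard `unicodedata` module may help. The helper name and the sample characters are assumptions for illustration, and the values reported depend on the Unicode version bundled with the Python interpreter, not on the draft under review here.

```python
import unicodedata

def is_mirrored(ch):
    """True if the character has Bidi_Mirrored=Yes in the bundled Unicode data."""
    return unicodedata.mirrored(ch) == 1

for ch in "(<a":
    print(f"U+{ord(ch):04X} Bidi_Mirrored={is_mirrored(ch)}")
```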

79 Proposed Updates to UAX #29: Text Boundaries and UAX #31: Identifier and Pattern Syntax 2005.10.28

There are some small changes to these two UAXes. In particular, clarifying text has been added to indicate that identifiers which are intended to represent words of natural languages should take into account some additional characters such as hyphens, apostrophes, and joiners. See:
https://www.unicode.org/reports/tr29/tr29-10.html
https://www.unicode.org/reports/tr31/tr31-6.html

Resolution: Closed 2005-12-08. Some changes were made in the drafts as a result of feedback and UTC decisions. New public review issues will be posted for both UAX #29 and UAX #31.

78 CLDR 1.4 Design Phase 2005.11.07

The design phase for CLDR 1.4 has made a number of structural additions to LDML, including flexible date/time formatting; tailorable text segmentation (e.g. word/line breaks), rule-based number formats, and transforms (transliterations); additional LDML metadata; localized names of measurement systems; and localized calendar quarters. See the latest working draft at https://www.unicode.org/reports/tr35/tr35-6.html.
Feedback on these additions is welcome. Note: it should not be provided via the online Unicode forms; instead, use the CLDR bug reporting page: https://www.unicode.org/cldr/filing_bug_reports.html.

Resolution: Closed 2005-12-09.

77 Proposed Draft UTS #39 and Proposed Update UTR #36 2005.10.28

The sections of UTR #36: Unicode Security Considerations that pertain to security functions have been split off into a new proposed draft UTS #39: Unicode Security Mechanisms. In addition, a section on some of the problems with language-based security has been added to UTR #36. We would appreciate feedback on the proposed changes, and comments on the security issues highlighted in UTR #36. See:
https://www.unicode.org/reports/tr36/tr36-4.html
https://www.unicode.org/reports/tr39/tr39-1.html

Resolution: Closed 2006-02-16. Proposed Draft UTS #39 will be finalized to UTS #39 after incorporation of feedback. Proposed Update UTR #36 will be updated and posted for another round of public feedback.

76 Draft UTS #37, Ideographic Variation Database 2005.10.28

The UTC approved the development of a Unicode Technical Standard establishing a database of variation sequences for Ideographic characters. The UTC recognizes that the needs of various user communities for such variation sequences cannot be accommodated by a single, unified collection of sequences. The purpose of the database is to ensure that multiple collections can coexist without compromising the interchangeability of texts using them. This draft Unicode Technical Standard describes the operation of the database.

Resolution: Closed 2005-12-08. The draft was approved as a UTS.

75 Proposed Update UTR #25, Unicode Support for Mathematics 2007.01.30

UTR #25, Unicode Support for Mathematics, is being updated to account for recent and pending additions to the character repertoire of mathematical characters in the Unicode Standard.
Draft refreshed on 2007-01-30.

Resolution: Closed 2007-02-15. The draft will be published after incorporation of feedback and updates.

74 Change to Default Localization for NaN in CLDR 2005.10.31

There has been a request to change the default localization for a NaN from the character U+FFFD (�) REPLACEMENT CHARACTER to another representation. The NaN floating-point value means "Not a Number", and represents an undefined result of a mathematical operation such as (0 ÷ 0) or (∞ - ∞). Unfortunately, there is no generally accepted mathematical symbol for NaN (e.g., from the American Mathematical Society). The character currently used as the default (root) localization follows Java usage, where it was originally chosen because it is a symbol (thus not an English-specific abbreviation), and has a sense that roughly corresponds to NaN. The CLDR technical committee is somewhat reluctant to make a change, given that this has been in use in Java for many years. If there is a change, possibilities are to revert to the English abbreviation "NaN" or to choose another character such as U+26A0 (⚠) WARNING SIGN. The committee would appreciate comments on this issue.

Resolution: Closed 2005-12-09.
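
The effect of the NaN localization choice can be sketched in Python. The `format_number` function and its `nan_symbol` parameter are hypothetical names introduced for this illustration; they are not part of CLDR or of any library API.

```python
import math

def format_number(value, nan_symbol="\uFFFD"):
    """Hypothetical formatter: substitute a locale-supplied symbol for NaN."""
    if isinstance(value, float) and math.isnan(value):
        return nan_symbol
    return str(value)

nan = float("inf") - float("inf")   # infinity minus infinity is undefined, giving NaN

print(format_number(nan))                    # uses U+FFFD, the default discussed above
print(format_number(nan, nan_symbol="NaN"))  # uses the English abbreviation instead
print(format_number(1.5))                    # ordinary values are unaffected
```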

73 Representative Glyphs for Arabic Characters U+06DF, U+06E0, and U+06E1 2005.08.09

The representative glyphs for several Arabic characters used to annotate the Koran have been reported as being possibly incorrect. The UTC has tentatively decided to revise them as explained in the accompanying document. The UTC invites anyone knowledgeable in their use to provide additional information or recommendations.

Resolution: Closed 2005-08-17. UTC will change the representative glyphs for 3 characters (U+06DF, U+06E0, and U+06E1), but not for U+06E9. UTC intends to document glyphic variations of U+06E9 in a future version of the standard.

72 Stability of the Bidi Mirrored Property 2005.08.09

In a bidirectional context, the images of many characters need to be oriented depending on the writing direction of the text in which they occur. The Bidi Mirrored property defines this behavior, but the model it implements has a few inconsistencies. Ideally, these would be corrected, but doing so will destabilize all documents containing the affected characters. A stability policy would freeze future changes, but should one last round of improvements be carried out? Your input on the range of possible actions is solicited, whether you are a language expert, user, or implementer.

Resolution: Closed 2005-08-17. UTC chose option B of the background document and will issue a Proposed Update UAX #9 to clarify section 6, and a new Public Review Issue will be posted on the specific content of the Bidi Mirroring property.

71 Questions on Malayalam Digits 2005.08.09

It has come to the attention of the UTC that the glyph printed in the standard for U+0D66 MALAYALAM DIGIT ZERO is incorrect, and information is being sought. Three numeric signs have been identified for future encoding, and further information is being sought. Details on these questions are in the accompanying document.

Resolution: Closed 2005-08-17. UTC accepted the glyph change for U+0D66, and also accepted the Malayalam numeric signs for 1/4, 1/2, and 3/4 as well as 10, 100, 1000 for encoding in a future version of the standard. (Details of the encoding status and progress may be followed on the Pipeline page.)

70 Proposed Draft UTS #37, Registration of Ideographic Variation Sequences 2005.08.09

The UTC approved the development of a Unicode Technical Standard establishing a registry of variation sequences for Ideographic characters. The UTC recognizes that the needs of various user communities for such variation sequences cannot be accommodated by a single, unified collection of sequences. The purpose of the registry is to ensure that multiple collections can coexist without compromising the interchangeability of texts using them. This proposed draft Unicode Technical Standard describes the operation of the registry.

Resolution: Closed 2005-08-17. Proposed Draft UTS #37 will be advanced to Draft, and a new PRI will be posted when it is available.

69 Proposed Update UAX #24, Script Names 2005.08.09

This is an initial update for Unicode 5.0.0 which proposes the addition of an appendix of iconic script indicators, derived from the missing-glyph forms used in the Last Resort font. Additional updates will be needed to reflect the planned additions of scripts to 5.0.0. Those will be in a subsequent draft.

Resolution: Closed 2005-08-17. The appendix with a list of iconic indicators will be removed from the UAX.

68 Proposed Update UTS #10 Unicode Collation Algorithm 2005.04.26

The Unicode Technical Committee has released a beta version of UTS #10 Unicode Collation Algorithm (UCA) Version 4.1.0. It is available for public review until April 26. This is a running beta; the data files may be updated during the course of the beta. This beta provides a data table that is synchronized with the repertoire of Unicode 4.1.0. In addition, it includes:

a revised handling of Thai/Lao via contractions

enhancements to sorting and matching, with a new conformance clause

changes to the handling of ignorable characters

guidelines on the use of grapheme joiner

new introductory text on user expectations

changes in weights for a small number of characters.

See: https://www.unicode.org/Public/UCA/4.1.0/ for the data files.

Resolution: Closed 2005-05-06. The proposed update was approved. The draft and ancillary files have been updated and published. The final released revision number is 14, available here: https://www.unicode.org/reports/tr10/tr10-14.html.

67 CLDR Version 1.3 Beta (Updated 2005.04.22) 2005.05.10

The 1.3 version of CLDR is now at beta status, and available via http://unicode.org/cldr/version/1.3.html. The new features include addition of data for timezones, UN M.49 regions, POSIX-format data, LDML meta-data, language/script/territory mappings, new tests, and various other fixes and additions of data, and many extensions to the specification. A new survey tool is being used to vet the additions of data. We encourage people to look at the data and provide feedback, especially on the new POSIX-format data and the changes in the specification. For details on accessing CLDR data, see http://unicode.org/cldr/repository_access.html.
NOTE: Feedback should not be provided via the online Unicode forms; instead, use the CLDR bug reporting method: http://unicode.org/cldr/filing_bug_reports.html.

Resolution: Closed 2005-05-19. The review period has ended.

66 Encoding of Chillu Forms in Malayalam 2005.05.03

The UTC is considering the question of encoding 5 "chillu" forms in Malayalam. A brief explanation is in the above-linked background document. Feedback and detailed information on this issue is being sought. (Note: as with other Public Review Issues, this is not a "vote". Submissions that simply favor one approach or another without giving any evidence either way will not be considered by the committee.)

Resolution: Closed 2005-05-18. The UTC accepted the 5 chillu forms for Malayalam, for encoding in a future version of the standard. Also accepted was a sixth chillu form for "ka".

65 Encoding of Devanagari Eyelash Ra 2005.05.03

The UTC is considering the question of encoding a separate character for the Devanagari eyelash RA. Reviewers may refer to discussion in The Unicode Standard, Version 4.0, pages 226 and 230. There is no separate background document. Feedback and detailed information on this issue is being sought. (Note: as with other Public Review Issues, this is not a "vote". Submissions that simply favor one approach or another without giving any evidence either way will not be considered by the committee.)

Resolution: Closed 2005-05-18. The UTC has not seen pervasive problems with the representation of eyelash ra. Therefore based on concerns for stability, the UTC re-affirms its position that eyelash ra is represented as indicated in Rule R5 in Section 9.1 Devanagari of The Unicode Standard, Version 4.0.

64 Draft UTR #36: Security Considerations for the Implementation of Unicode and Related Technology 2005.05.03

This draft Unicode Technical Report describes security considerations that are important to be aware of when working with Unicode, and provides specific recommendations for dealing with the issues that arise.

Resolution: Closed 2005-05-18. The UTC decided to issue an approved version after resolution of open issues by the newly-formed security subcommittee.

63 POSIX Data for CLDR 2005.03.31

There is a new tool that creates POSIX locale data files from CLDR. It has been used to generate draft POSIX locale data files for public review. The CLDR Technical Committee is seeking review of this data. Details and relevant URLs are in the above-linked background document. Note that the CLDR 1.3 freeze date has also been extended. Please see http://unicode.org/cldr/version/1.3.html.

Resolution: Closed 2005-05-19. The review period has ended.

62 Proposed Update UTS #10 Unicode Collation Algorithm 2005.01.31

A proposed update for UTS #10: Unicode Collation Algorithm (UCA) for Unicode 4.1.0 is available for public review. The main feature is the update of the repertoire to Unicode 4.1.0 (for synchronization, UCA 4.1.0 will be available within the month after Unicode 4.1.0). Other changes include:

Use of ignorable character (especially CGJ) to interrupt contractions

Changing the mechanism for reordering Thai/Lao characters to use contractions

More detail about alternatives for handling Hangul characters

Additional options in Searching and Sorting

The data tables are not yet available, but the changes can be summarized as: assignments for all new 4.1.0 characters, additions of Thai/Lao contractions, plus changes in the weighting of certain Latin characters, some previously ignorable characters, and modifications of certain other Thai characters.

Resolution: Closed 2005-02-14. The proposed update was approved with changes incorporating editorial and other feedback. Proposals for some extensive changes were not accepted. The draft and ancillary files will be updated and published.

61 Proposed Update UAX #15 Unicode Normalization Forms 2005.01.31

A proposed update to UAX #15 for Unicode 4.1.0 is available at the link above. The proposed changes are listed in the Modifications section of the document.

Resolution: Closed 2005-02-14. The proposed update was approved and will be published as part of Unicode 4.1.0. Some editorial feedback will be included; proposals for some extensive changes were not accepted.

60 Proposed Update UAX #9 Bidirectional Algorithm 2005.01.31

A proposed update to UAX #9 for Unicode 4.1.0 is available at the link above. The proposed changes are listed in the Modifications section of the document.

Resolution: Closed 2005-02-14. The proposed update was approved, with some changes incorporating feedback, and will be published as part of Unicode 4.1.0.

59 Disunification of Dandas 2005.05.03

The UTC is considering the question of disunifying the characters U+0964 DEVANAGARI DANDA and U+0965 DEVANAGARI DOUBLE DANDA from their counterparts in several other Indic scripts. Feedback on this issue, including evidence, for or against the disunification, is being sought. Details are in the background document linked above. (Note: as with other Public Review Issues, this is not a "vote". Submissions that simply favor one approach or another without giving any evidence either way will not be considered by the committee.)

Resolution: Closed 2005-05-18. The UTC has not seen pervasive problems caused by the unification of dandas in the core scripts of India. Therefore based on concerns for stability, the UTC re-affirms its position that dandas are unified across these scripts.

58 Characters with cedilla and comma below in Romanian language data 2005.01.31

The CLDR Technical Committee is seeking feedback regarding the relative frequency of use of the characters with comma below and of the characters with cedilla in Romanian language textual material in widespread implementations, such as databases and operating systems, and in published documents and Romanian websites. The purpose for this feedback is to determine which to use in a default set of locale data for Romanian to be specified for CLDR.

The characters in question are:

U+0219 LATIN SMALL LETTER S WITH COMMA BELOW   and
U+021B LATIN SMALL LETTER T WITH COMMA BELOW

versus:

U+015F LATIN SMALL LETTER S WITH CEDILLA   and
U+0163 LATIN SMALL LETTER T WITH CEDILLA

Please accompany your feedback with information on the source of your data and indicate the extent and nature of your experience with Romanian data processing. Note: There is no other background document for this item.

Resolution: Closed 2005-02-22. The resolution was to use the more distinctive characters, since users of the repository can map the characters together if they want.
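
Mapping the two sets of characters together, as the resolution mentions, could look like the following Python sketch. The function name is hypothetical, the direction of the mapping (cedilla to comma below) is an assumption, and the uppercase pairs are added here for completeness even though the issue text lists only the lowercase letters.

```python
# Translation table from the cedilla letters to their comma-below counterparts.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015F": "\u0219",  # s with cedilla -> s with comma below
    "\u0163": "\u021B",  # t with cedilla -> t with comma below
    "\u015E": "\u0218",  # S with cedilla -> S with comma below (not listed in the issue)
    "\u0162": "\u021A",  # T with cedilla -> T with comma below (not listed in the issue)
})

def fold_romanian(text):
    """Hypothetical helper: replace cedilla forms so both spellings compare equal."""
    return text.translate(CEDILLA_TO_COMMA)

print(fold_romanian("\u015Fi \u0163ar\u0103") == "\u0219i \u021Bar\u0103")
```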

57 Changes to Bidi categories of some characters used with Mathematics 2005.01.31

The UTC is considering changing the bidi category of seven compatibility characters from "ET" to "ES":

U+207A SUPERSCRIPT PLUS SIGN
U+208A SUBSCRIPT PLUS SIGN
U+FB29 HEBREW LETTER ALTERNATIVE PLUS SIGN
U+FE62 SMALL PLUS SIGN
U+FE63 SMALL HYPHEN-MINUS
U+FF0B FULLWIDTH PLUS SIGN
U+FF0D FULLWIDTH HYPHEN-MINUS

The UTC is also seeking feedback on the bidi categories of the following characters, and whether to also change these from "ET" to "ES":

U+2212 MINUS SIGN
U+207B SUPERSCRIPT MINUS
U+208B SUBSCRIPT MINUS

All of these characters may be used in connection with mathematical applications. Note: There is no other background document for these proposed changes.

Resolution: Closed 2005-02-14. The changes to all ten characters were approved and will be published as part of Unicode 4.1.0.
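
The bidi category of each of the ten characters can be inspected with Python's standard `unicodedata` module, as in the sketch below. The helper name is an assumption for illustration, and the result reflects the Unicode version bundled with the interpreter; a runtime based on Unicode 4.1.0 or later should report "ES" for all ten.

```python
import unicodedata

# The seven compatibility characters followed by the three minus signs listed above.
CODE_POINTS = [0x207A, 0x208A, 0xFB29, 0xFE62, 0xFE63, 0xFF0B, 0xFF0D,
               0x2212, 0x207B, 0x208B]

def bidi_classes(code_points):
    """Map each code point to its Bidi_Class short name in the bundled data."""
    return {cp: unicodedata.bidirectional(chr(cp)) for cp in code_points}

for cp, cls in bidi_classes(CODE_POINTS).items():
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {cls}")
```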

56 Proposed Update UAX #14 Line Breaking Properties 2005.01.31

This is a proposed update to a previously approved Unicode Standard Annex. It incorporates some changes in Hangul syllable rules, word separators, U+00A0 as a base for combining marks, and other updates. The UTC is seeking public feedback on these changes.

Resolution: Closed 2005-02-14. The proposed update was approved with minor editorial changes and will be published as part of Unicode 4.1.0.

55 Proposed Change to Character Properties for Two Katakana Characters 2004.11.08

The UTC has received a request to change the General Category of two characters. Reports indicate that they should not have the General Category "Connector Punctuation" (gc=Pc) because the characters don't connect other elements; they separate elements. The two characters are:
     U+30FB KATAKANA MIDDLE DOT
     U+FF65 HALFWIDTH KATAKANA MIDDLE DOT

The proposal is to change the General Category of those characters from "Pc" (Connector Punctuation) to "Po" (Other Punctuation). (Note: there is no other background document for this issue.)

Resolution: Closed 2004-11-23. The change in General Category of U+30FB KATAKANA MIDDLE DOT and U+FF65 HALFWIDTH KATAKANA MIDDLE DOT from "Pc" to "Po" was accepted and will be documented in Unicode 4.1.
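
A quick way to confirm the General Category of the two characters in a given runtime is the Python sketch below, using the standard `unicodedata` module. The helper name is an assumption for illustration, and the reported value depends on the Unicode version bundled with the interpreter; any version from 4.1 onward should report "Po".

```python
import unicodedata

def general_category(cp):
    """Return the General Category short name for a code point."""
    return unicodedata.category(chr(cp))

for cp in (0x30FB, 0xFF65):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: gc={general_category(cp)}")
```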

54 Proposed Update UTS #22 Character Mapping Markup Language 2005.01.31

This is a proposed update to a previously approved Unicode Technical Report. It will change to a Unicode Technical Standard, so the update includes a new conformance section. Included in the update are many editorial changes and explicit text about multiple-character mappings.

Resolution: Closed 2005-02-14. The proposed update was approved with minor editorial changes and will be published.

53 Proposed Draft UTR #33 Unicode Conformance Model 2005.05.03

This proposed draft Unicode Technical Report explains the issue of conformance relating to the Unicode Standard so that users better understand the contexts in which products are making claims for support of the standard, and implementers better understand how to meet the formal conformance requirements while satisfying the expectations of their users. It does not alter, augment or override the actual Unicode conformance requirements. Rather it attempts to provide a conceptual framework to make it easier for users and implementers to identify and understand the specific conformance requirements.

Resolution: Closed 2005-05-18. A new draft will be posted with further updates.

52 Proposed Draft UTR #36 Security Considerations 2004.11.08

This draft Unicode Technical Report describes some of the security considerations that should be taken into account by programmers, system analysts, standards-developers, and others when implementing the Unicode Standard and related technologies. The UTC is seeking public feedback on this document.

Resolution: Closed 2004-11-23. The proposed draft will be updated with editorial feedback and advanced to "Draft" status.

51 Proposed Update UAX #29 Text Boundaries 2005.01.31

This is a proposed update to a previously approved Unicode Standard Annex. It contains some important changes in categories for some characters and changes in linebreaking rules. The UTC is seeking public feedback on these changes. Note: The text of UAX #29 has been modified slightly to make it clear that level run and directional run refer to the same thing. Re-posted on 2005-01-17.

Resolution: Closed 2005-02-14. The proposed update was approved and will be published as part of Unicode 4.1.0.

50 Proposed Update UTS #18 Unicode Regular Expressions 2004.11.08

This is a proposed update to a previously approved Unicode Technical Standard. The update includes some new notation, new notes on Compatibility Properties, and other changes. The UTC is seeking public feedback on these changes.

Resolution: Closed 2004-11-23. The proposed update was approved with changes from feedback and will be published.

49 Proposed Update UTS #6 A Standard Compression Scheme for Unicode 2004.11.08

This is a proposed update to a previously approved Unicode Technical Standard. This standard describes a compression scheme (SCSU) mainly intended for use with short to medium length Unicode strings. A number of changes and clarifications have been made in the text, and the UTC is seeking public feedback on these changes.

Resolution: Closed 2004-11-23. The proposed update was approved and will be published.

48 Definition of "Directional Run" 2004.11.08

A definition of "directional run" is proposed for inclusion in UAX #9 The Bidirectional Algorithm. The UTC is seeking public feedback on this definition. See the background document for details.

Resolution: Closed 2004-11-23. The UTC decided to make the definition of "directional run" be the same as "level run" in UAX #9. An updated draft will be posted later.
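
For orientation, UAX #9 describes a level run as a maximal substring of characters that share the same embedding level. The sketch below illustrates that idea only; it assumes the embedding levels have already been resolved by some Bidi implementation, and the function name `level_runs` is hypothetical.

```python
from itertools import groupby

def level_runs(levels):
    """Split a sequence of resolved embedding levels into level runs.

    Returns (start, end, level) triples, with `end` exclusive.
    """
    runs = []
    pos = 0
    for level, group in groupby(levels):
        length = len(list(group))
        runs.append((pos, pos + length, level))
        pos += length
    return runs

# Hypothetical levels for an eight-character paragraph.
print(level_runs([0, 0, 1, 1, 1, 0, 2, 2]))
```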

47 Changes to default collation of Latin in UCA 2004.11.08

In collation, searching, and matching according to the Unicode Collation algorithm, the 10 characters Æ, Ǽ, Ǣ, Đ, Ð, Ħ, Ł, Ŀ, Ø, Ǿ (and their lowercase forms) currently have primary (base letter) differences from the letters A, D, H, L, and O respectively. There is a proposal before the UTC to change these to have secondary (accent) differences from AE, D, H, L, O, respectively. We would welcome feedback on this issue -- pro or con.

Arguments for the change are in the background document. We expect to add the contrary point of view to that document.

Resolution: Closed 2004-11-23. The UTC accepted changes for the ten characters and their lower case counterparts.

46 Proposal for Encoded Representations of Meteg 2004.11.08

In some Biblical Hebrew usage, it is considered necessary to distinguish how the meteg mark positions relative to a vowel point: to the left of the vowel, or to the right; or, in the case of a hataf vowel, between the two components of the hataf vowel. A solution for this has been proposed using control characters, including the zero width joiner and non-joiner characters. This public-review issue is soliciting feedback on this proposed solution.

Resolution: Closed 2004-11-23. The proposal was approved and will be documented in Unicode 4.1.

45 Bidi Category of Narrow No-Break Space 2004.11.08

Should the Bidi category of Narrow No-Break Space (NNBSP, U+202F) be changed from "WS" to "CS", in analogy to No-Break Space U+00A0? The reason for the change is that in all scripts but Mongolian it acts like ordinary NBSP, except for its width. In Mongolian it may be recognized in shaping. (Note, there is no separate background document for this issue.)

Resolution: Closed 2004-11-23. The proposal was approved and the category changed. This will be documented in Unicode 4.1.
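
A quick way to confirm that NNBSP now shares its Bidi class with NBSP is sketched below, assuming Python's standard `unicodedata` module and a bundled database of Unicode 4.1 or later. The variable names are illustrative only.

```python
import unicodedata

NBSP = "\u00a0"   # NO-BREAK SPACE
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

# Map each character name to its Bidi class.
bidi_classes = {
    unicodedata.name(ch): unicodedata.bidirectional(ch)
    for ch in (NBSP, NNBSP)
}
print(bidi_classes)
```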

44 Bidi Category of Fullwidth Solidus 2004.11.08

Unicode 4.0.1 changes the Bidi Category U+002F SOLIDUS from "ES" to "CS" but leaves U+FF0F FULLWIDTH SOLIDUS as category "ES". U+FF0F FULLWIDTH SOLIDUS should probably have the same bidi class as its regular sibling. The UTC proposes to make this change for Unicode 4.1. (Note, there is no separate background document for this issue.)

Resolution: Closed 2004-11-23. The proposal was approved and the category changed. This will be documented in Unicode 4.1.
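
The alignment of the two solidus characters can be spot-checked as follows. This is a sketch that assumes Python's standard `unicodedata` module; `same_bidi_class` is a hypothetical helper, not part of any Unicode API.

```python
import unicodedata

def same_bidi_class(a, b):
    """Return True if two characters have the same Bidi class."""
    return unicodedata.bidirectional(a) == unicodedata.bidirectional(b)

SOLIDUS = "\u002f"
FULLWIDTH_SOLIDUS = "\uff0f"

print(unicodedata.bidirectional(FULLWIDTH_SOLIDUS))
print(same_bidi_class(SOLIDUS, FULLWIDTH_SOLIDUS))
```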

43 Proposed Update UAX #24 Script Names 2004.11.08

This is a proposed update to a previously approved Unicode Standard Annex. This annex provides an assignment of script names to all Unicode code points. This information is useful in mechanisms such as regular expressions and other text processing tasks. The proposed update makes several substantial changes to the previously approved annex.

Resolution: Closed 2004-11-23. The proposed update was approved and will be published.

42 Proposed Draft UAX #34 Unicode Named Character Sequences 2004.11.08

This proposed annex specifies sequences of characters that may be treated as single units, either in particular types of processing, in reference by standards, in listing of repertoires (such as for fonts or keyboards), or in communicating with users.

Resolution: Closed 2004-11-23. The proposed draft will be updated with editorial feedback and advanced to "Draft" status.

41 Encoding of INVISIBLE LETTER 2004.11.08

UTC is seeking feedback regarding a proposal to encode "INVISIBLE LETTER" to serve as an unambiguous base letter for combining marks in isolation. The character properties would be specifically designed to aid in processing. This proposed letter might also be used to correspond to the "INV" letter in ISCII in some conversion scenarios, but its intent isn't exactly the same as that character. See the above-linked document for details.

Resolution: Closed 2004-11-23. The proposal was rejected by UTC.

40 Encoding of Latin Capital and Small Letter "At" 2004.11.08

LATIN CAPITAL LETTER AT and LATIN SMALL LETTER AT are used as orthographic characters in the Koalib language of Sudan. They are typically used for Arabic loan words. Although similar in appearance to COMMERCIAL AT, LATIN SMALL LETTER AT should have different character properties. The main concern is the similarity in appearance of LATIN SMALL LETTER AT to COMMERCIAL AT. There are potential implications for Internet protocols that use @. The question for reviewers is: Should the UTC accept LATIN CAPITAL LETTER AT and LATIN SMALL LETTER AT?

Resolution: Closed 2004-11-23. The UTC concluded that there is no compelling evidence of usage to date. Furthermore, the UTC will not change the properties of the existing @ sign U+0040 to be a letter.

39 Draft Unicode Technical Standard #31 Identifier and Pattern Syntax 2004.11.08

An updated draft of UTS #31 "Identifier and Pattern Syntax" is available at the above link. This draft has new conformance information as well as a new section on Normalization and Case and other changes. This document has implications for programming languages, regular expressions, and scripting languages. An update of this document incorporating minor changes was posted 2004/10/19.

Resolution: Closed 2004-11-23. The draft will be updated with editorial feedback and published.

38 Draft Unicode Technical Report #30 Character Foldings 2004.08.03

An updated draft of UTR #30 "Character Foldings" is available at the above link. This update also provides a new set of draft data files for several types of character foldings. The Unicode Technical Committee especially seeks review of the data files.

Resolution: Closed 2004-06-24. The draft will be updated with editorial feedback and published. (Due to various factors, publication did not actually occur. On August 14, 2008, the UTC decided to rescind its approval for publication and put the draft of UTR #30 into withdrawn status. See UTC action 116-C8.)

37 Clarification of the Use of Zero Width Joiner in Indic Scripts 2004.08.03

There are some inconsistencies in the use of ZERO WIDTH JOINER (ZWJ) in a number of Indic scripts which are outlined in the accompanying review document. This proposal intends to rectify these problems, clarifying how the ZERO WIDTH JOINER is to be applied in scripts, and consolidating common mechanisms for equivalent problems that exist in several scripts. The scope for what is proposed covers Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada and Malayalam.
The question for reviewers is: Should the UTC adopt a model in which ZWJ precedes Virama, as proposed in section 7 of the review document?

Resolution: Closed 2004-08-24. UTC accepted the proposal and will create an Indic conjoining behavior model.

36 Draft Unicode Technical Report #30 Character Foldings 2004.06.08

An updated draft of UTR #30 "Character Foldings" is available at the above link. This update also provides draft data files for four types of character foldings. The Unicode Technical Committee especially seeks review of the data files.

Resolution: Closed 2004-06-23. The draft will be updated with editorial feedback and published.

35 Encoding of LATIN SMALL LETTER C WITH STROKE as a phonetic symbol 2004.06.08

At the February 2004 meeting of the Unicode Technical Committee, a proposal was considered to encode the phonetic symbol LATIN SMALL LETTER C WITH STROKE. Some reservation was expressed on the part of some committee members, however, due to potential legacy encoding issues. A decision was made to give tentative approval of this character, but to prepare a public review issue to elicit feedback on the pros and cons of encoding this character.

Resolution: Closed 2004-06-23. The UTC decided to encode this character. For information on the progress of encoding and balloting, please see the Proposed New Characters: Pipeline Table.

34 Draft UTS #35 Locale Data Markup Language, Version 1.2 2004.09.01

Version 1.2 is now under development, incorporating changes decided on by the CLDR technical committee. The latest working draft of version 1.2 is located at working draft CLDR 1.2. Feedback should be submitted via ~~the CLDR bug database~~, not with the reporting form.

Resolution: Closed 2004-11-09. The final 1.2 version of LDML was released.

33 ~~UTF Conversion Code Update~~ 2004.08.03

The C language source code example for UTF conversions (ConvertUTF.c) has been updated to version 1.2 and is being released for public review and comment. This update includes fixes for several minor bugs. The code can be found at the above link.

Resolution: Closed 2004-08-24. The code will be updated with feedback received and published.

32 Proposed Update UTR #23 Character Property Model 2004.06.08

This is a new draft for the Character Property Model, revising some definitions and extending the discussion of stability as well as override of properties.

Resolution: Closed 2004-06-23. The draft will be updated with editorial feedback and published.

31 Cantonese Romanization 2004.08.03

The sources for the Unihan database use multiple competing romanizations of Cantonese, while the Unihan database uses yet another romanization. We feel that there is no good reason for Unicode to contribute to this confusion, so we plan to adopt a single, standard Cantonese romanization for use throughout the Unihan database.

Resolution: Closed 2004-08-24. UTC decided to go ahead with the change in the next update of the Unihan database.

30 Bengali Khanda Ta 2004.06.08

The description of khanda ta in section 9.2 of Unicode 4.0 and in one of the current Indic FAQs assumed a particular understanding of expected behaviors rather than stating those expectations explicitly. Due to certain wording and an atypical use of ZERO WIDTH JOINER, some implementers have been misled about the behaviors related to khanda ta that were assumed.

In the course of investigating this issue, input was received suggesting that the atypical use of ZERO WIDTH JOINER was problematic, and that a different encoded representation for khanda ta should be adopted.

Alternate representations for khanda ta are described and evaluated in the review document. It is proposed that the existing representation specified in section 9.2 be retained, but that the description in the Standard be revised to remove any ambiguity and potential for misunderstanding.

Resolution: Closed 2004-06-23. The UTC decided to encode the khanda ta as a separate character in a future version of the Unicode Standard. For information on the progress of encoding and balloting, please see the Proposed New Characters: Pipeline Table.

29 Normalization Issue 2004.06.08

There is a problem in the language of the specification of Unicode Standard Annex #15: Unicode Normalization Forms for forms NFC and NFKC. A textual fix is required to make normalization formally self-consistent. The fix will not have an impact on real data found in practice (with the possible exception of test cases for the algorithm itself), because the affected sequences do not constitute well-formed text in any language. Details, cases, and recommendations can be found in the review document.

Resolution: Closed 2004-06-23. The UTC decided to implement the recommendation in the review document, and a draft of Unicode Standard Annex #15: Unicode Normalization Forms incorporating the changes will be updated and published.

28 BIDI Boundary_Neutral Property Value 2004.02.04

The BIDI property value BN is currently aligned with the General Category value Format_Character (Cf), minus the BIDI-specific format characters (LRM, RLM, RLE, LRE, RLO, LRO, PDF). The intent of the BN property is to allow the BIDI algorithm to ignore invisible, irrelevant characters when determining the ordering of the visible characters. The proposal is to align the BN property with the Default_Ignorable_Code_Point property (DICP) instead of Cf, again minus the BIDI-specific characters.

Resolution: Closed 2004-02-12. Change the bidi properties of the following code points for Unicode 4.0.1:
          U+00AD SOFT HYPHEN from ON to BN
          All noncharacters from L to BN
          All unassigned code points with the Default_Ignorable_Code_Point property from L to BN
                (U+2064..U+2069, U+FFF0..U+FFF8, U+E0000, U+E0002..U+E001F,
                U+E0080..U+E00FF, U+E01F0..U+E0FFF)
          Annotation characters U+FFF9..U+FFFB from BN to ON
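
The assigned characters named in this resolution can be inspected with the sketch below, which assumes Python's standard `unicodedata` module. Noncharacters and unassigned code points are left out on purpose: that module returns an empty string for code points it has no record for, so it cannot confirm their default Bidi class. The name `bidi_of` is hypothetical.

```python
import unicodedata

def bidi_of(code_points):
    """Map 'U+XXXX' labels to the Bidi class of each code point."""
    return {
        f"U+{cp:04X}": unicodedata.bidirectional(chr(cp))
        for cp in code_points
    }

# SOFT HYPHEN plus the three interlinear annotation characters.
print(bidi_of([0x00AD, 0xFFF9, 0xFFFA, 0xFFFB]))
```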

27 Joiner/Nonjoiner in Combining Character Sequences 2004.01.27

Unicode 4.0 describes the structure of Khmer syllables, saying that they may contain an interior ZWJ. There is a problem with this that needs to be resolved in 4.0.1, because some of the characters later in the syllable can be combining characters. This paper describes a proposal to fix this problem. As a part of the proposal, a choice has to be made between two alternatives.

Resolution: Closed 2004-02-12. The UTC decided to allow ZWJ and ZWNJ in combining character sequences, but not to change their general category. The interpretation of joiner/nonjoiner between two combining marks is not yet defined. Minimal changes to definitions D14 and D17 of the standard will be made for Unicode 4.0.1. UTC also decided to issue a proposed update of UTS #18 Unicode Regular Expression Guidelines that makes appropriate changes.

26 Update properties for Ethiopic and Tamil non-decimal digits 2004.01.27

Decimal numbers are those used in decimal-radix number systems. In particular, the sequence of the ONE character followed by the TWO character is interpreted as having the value of twelve. We have received feedback that this is not the case for Ethiopic or Tamil. Details of the affected code points are in the above-linked document.

Resolution: Closed 2004-02-12. UTC decided to encode Tamil Digit Zero as "Nd" and the other digits will remain "Nd" reflecting current practice. (This character's status will be tracked in the "Pipeline" list of future additions to the standard.) The Ethiopic digits U+1369 - U+1371 will be changed in Unicode 4.0.1 from general category "Nd" to "No" and the numeric type will be changed to synchronize.
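
The practical effect of the Ethiopic change is that those digits no longer carry a decimal value, while still having a digit and numeric value. The sketch below, assuming Python's standard `unicodedata` module, contrasts one Ethiopic digit with Tamil Digit Zero; `digit_profile` is a hypothetical helper.

```python
import unicodedata

def digit_profile(ch):
    """Return (general category, decimal, digit, numeric) for a character.

    Values that are not defined for the character come back as None.
    """
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

print(digit_profile("\u1369"))  # ETHIOPIC DIGIT ONE
print(digit_profile("\u0be6"))  # TAMIL DIGIT ZERO
```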

25 Proposed Update UTR #17 Character Encoding Model 2004.08.03

This is an updated draft of the Character Encoding Model, reflecting the feedback received until 2004-03-25 and clarifying the description of Character Encoding Schemes.

Resolution: Closed 2004-08-24. The proposed update progressed to final status and will be posted.

24 Proposed Update UAX #9 Bidirectional Algorithm 2003.10.27

A proposed update of Unicode Standard Annex #9 Unicode Bidirectional Algorithm is available at the link above.

Resolution: Closed. This review item has now been included in the Unicode 4.0.1 beta.

23 Terminal Punctuation Characters 2003.10.27

In Unicode 4.0.1, the new property Sentence_Terminal will be added. This consists of characters that terminate a sentence; in particular, a sentence (unless quoted) should not span one of these characters based on UAX #29 (Text Boundaries). The above-linked document provides a comparison of this property with the existing Terminal_Punctuation and Other_Punctuation, so that people can provide feedback as to whether any characters should be moved from one category into another.

Resolution: Closed. "Sentence_Terminal" will be changed to "STerm" in the UCD, Proplist.txt and PropertyAliases.txt. The change will be implemented for Unicode version 4.0.1.

22 Collation Mechanism for Syllabic Scripts 2003.10.27

In UTS #10: Unicode Collation Algorithm, there is discussion of a mechanism for handling syllabic scripts, notably Korean Hangul. The alternative mechanism discussed in the above-linked document is proposed to allow the UCA and tailorings to deal with syllabic collation. The goal is for this mechanism to be very lightweight, and thus easy for implementations to implement without impacting the performance of other characters.

Resolution: Closed. The collation mechanism for syllabic scripts will be documented in an update to UTS #10.

21 Changing U+200B Zero Width Space from Zs to Cf 2003.10.27

There have been persistent problems with usage of the U+200B Zero Width Space (ZWSP). The function of this character is to allow a line break at positions where it normally would not be allowed, and it is thus functionally a format character, which would have a general category of Cf. This behavior is well documented in the Unicode Standard, and the character is not considered a Whitespace character in the Unicode Character Database. However, for historical reasons the general category is still Zs (Space Separator), which causes the character to be misused. ZWSP is also the only Zs character that is not Whitespace. The general category can cause misinterpretation of rule D13 Base character as allowing ZWSP as a base for combining marks.

The proposal is to change the general category of U+200B from Zs to Cf.

Resolution: Closed. The general category of U+200B will be changed from Zs to Cf in Unicode version 4.0.1.
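
The distinction between ZWSP and a true space character can be seen in the sketch below, which assumes Python's standard `unicodedata` module. Note that `str.isspace` reflects Python's own whitespace test rather than the Unicode White_Space property directly, so it is used here only as a rough indicator; `describe` is a hypothetical helper.

```python
import unicodedata

def describe(ch):
    """Return (general category, whether Python treats it as whitespace)."""
    return unicodedata.category(ch), ch.isspace()

ZWSP = "\u200b"   # ZERO WIDTH SPACE
SPACE = "\u0020"  # SPACE

print(describe(ZWSP))
print(describe(SPACE))
```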

20 Draft UTR #31 Identifier and Pattern Syntax 2004.06.08

A draft of Unicode Technical Report #31 Identifier and Pattern Syntax is available at the link above. Note: this new draft was posted January 26, 2004.

Resolution: Closed 2004-06-23. The draft will be updated with editorial feedback and published as a Unicode Technical Report.

19 Proposed Draft UTR #30 Character Foldings 2003.10.27

A proposed draft of Unicode Technical Report #30 Character Foldings is available at the link above. Please provide feedback to the authors by the deadline for comments.

Resolution: Closed. The proposed draft progressed to the status of Draft Unicode Technical Report.

18 Draft UTR #23 The Unicode Character Property Model 2003.10.27

A proposed draft of Unicode Technical Report #23 The Unicode Character Property Model is available at the link above. Changes in the document are marked with yellow formatting. This will be finalized after the next UTC meeting.

Resolution: Closed.

17 UTS #18 Unicode Regular Expressions 2003.10.27

It is proposed to change the status of UTR #18 from a Unicode Technical Report (UTR) to a Unicode Technical Standard (UTS). The draft of the proposed UTS is available at the link above. Changes in the document are marked with yellow formatting. This will be finalized after the next UTC meeting.

Resolution: Closed. The change was made to a UTS.

16 Update to UAX #29 Text Boundaries 2003.10.27

A proposed update for Unicode Standard Annex #29 Text Boundaries is available at the link above. Changes in the document are marked with yellow formatting. This will be finalized and included as part of the Unicode 4.0.1 release.

Resolution: Closed. The UTC decided to include some minor editing to take into account public feedback.

15 Changing General Category of Braille Patterns to "Letter Other" 2003.10.27

The UTC has received requests to change the general category of the Braille characters to be "Letter other" (Lo) rather than "Symbol other" (So), and is seeking comments and information on the Braille processing model and existing implementations to help with this decision.

The Braille pattern symbols are encoded from U+2800 through U+28FF, and are discussed in the Unicode Standard 4.0, chapter 14 section 9. The presumption until now in Unicode has been that the Braille characters are essentially "final form" characters; that the source text would be in other scripts, and these would be used for presentation of that source text. Under that model, the characters would be better characterized as symbols; in particular, they would not be suitable for program identifiers.
The effect of the proposed change would be for implementations to treat the Braille pattern symbols as letters rather than symbols for various textual processes. There is a particular interaction with the proposed XML 1.1 categorizations for element names that the committee is concerned with, and is especially interested in feedback regarding related issues.

Resolution: Closed. The UTC decided not to change the general category of the Braille characters. However, the Bidi category of the Braille characters was changed to be strong left-to-right (L), as a result of this review.
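
Both parts of this outcome (general category unchanged, Bidi class now L) can be checked across the whole Braille block. The sketch assumes Python's standard `unicodedata` module; the identifier check uses Python's own identifier rules, which is only one example of an identifier syntax.

```python
import unicodedata

# Collect the distinct (general category, Bidi class) pairs in U+2800..U+28FF.
braille_properties = {
    (unicodedata.category(chr(cp)), unicodedata.bidirectional(chr(cp)))
    for cp in range(0x2800, 0x2900)
}
print(braille_properties)

# A symbol (So) is not accepted as a Python identifier.
print("\u2801".isidentifier())
```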

14 Unicode Collation Algorithm 4.0.0 Beta 2003.08.26

The primary goal of this release of the Unicode Collation Algorithm is to synchronize the repertoire of strings for collation (sorting) with the repertoire of Unicode 4.0. For future versions of the Unicode Standard that add characters, there will also be versions of the UCA tables with synchronized repertoire. A small number of additional changes have been made for consistency in treatment of new and old characters; however, other changes await working with SC22/WG2 so that future versions of ISO 14651 and UCA can be synchronized. The relevant data file is this version of allkeys.txt.

Resolution: Closed.

13 Unicode 4.0.1 Beta 2004.01.27

The beta period for Unicode 4.0.1 is open. Detailed information is available on the 4.0.1 beta page. This release also includes three proposed updates to Unicode Standard Annexes (UAXes):

Proposed Update UAX #9 Bidirectional Algorithm
Proposed Update UAX #11 East Asian Width
Proposed Update UAX #29 Text Boundaries

The above list of proposed updates, the Unicode 4.0.1 data files and the 4.0.1 beta page may be updated during the beta period. The purpose of the update to UAX #11 is to clarify the concept of Ambiguous width.

In addition, feedback is welcome on the UAXes that do not currently have proposed updates posted.

Resolution: Closed 2004-02-12.

12 Terminal Punctuation Characters 2003.08.18

In Unicode 4.0.1, the new property Sentence_Terminal is being added. This property is to be used in the default sentence boundaries in UAX #29 (Text Boundaries), instead of a list in the body of that document (under the heading "Term"). The Unicode Technical Committee is seeking feedback on the common usage of certain punctuation characters; especially feedback from those familiar with non-Latin writing systems, including Arabic, Armenian, Syriac, Devanagari, Myanmar, and so on.

Resolution: Closed. In addition to the characters discussed in the above document, the following characters are also included in Sentence_Terminal:
     U+05C3 HEBREW PUNCTUATION SOF PASUQ
     U+0F08 TIBETAN MARK SBRUL SHAD
     U+0F0D TIBETAN MARK SHAD
     U+0F0E TIBETAN MARK NYIS SHAD
     U+0F0F TIBETAN MARK TSHEG SHAD
     U+0F10 TIBETAN MARK NYIS TSHEG SHAD
     U+0F11 TIBETAN MARK RIN CHEN SPUNGS SHAD
     U+0F12 TIBETAN MARK RGYA GRAM SHAD

11 Soft-Dotted Property 2003.08.18

The Unicode Standard has the principle that if an accent is applied to an i or j, the base character loses its dot. Such characters are called "soft-dotted". The UTC proposes to extend this property to a number of characters that do not currently have the property. The accompanying document lists the characters.

Resolution: Closed. The UTC decided to resolve the issue by adding the list of proposed additional soft-dotted characters (in the above document), excluding "ij" small ligature U+0133.

10 Interlinear Annotation Characters 2003.08.15

Change the General Category for the Interlinear Annotation Characters from Cf to Po (Punctuation Other), and thereby change the status to not be Default_Ignorable_Code_Points. In addition to the document linked above, some explanation from the standard about Default Ignorable codepoints is available.

Resolution: Closed. The UTC decided to resolve this issue not by changing the general category, but by making them non-default-ignorable, and using an exclusion list. This also calls for adding a documentary note that not all characters with dotted-box glyphs have the general category Cf.

9 Bengali Reph and Ya-Phalaa 2003.10.27

Resolve an ambiguity with regard to handling of reph and ya-phalaa in Bengali implementation by using ZWNJ between Ra and Halant (Virama).

Resolution: Closed. After consideration of public review comments, the UTC decided to adopt the solution suggested in the attached resolution document.

8 Math digits 2003.06.01

It is proposed to give the mathematical digits (U+1D7C9 .. U+1D7FF) the general category of "No" rather than "Nd", and to delete the value of field 6 from those characters in the UCD (UnicodeData.txt file; note: the first field is numbered 0).
This is in recognition of the fact that these digits are most commonly not used as part of decimal numbers, but are used as variables or other mathematical symbols.
This proposal is intended to change their Numeric_Type from nt=de (Decimal) to nt=di (Digit), thereby aligning them with the superscript and subscript digits which were just changed, as special cases, and distinguishing them from the various sets of decimal digits per se (nt=de and gc=Nd). As a result of the proposed change, category Nd would be reserved for decimal digits used in ordinary decimal numbers.

Resolution: Closed. The UTC sees no reason to change the numeric values assigned to the Math Digits 1D7C9..1D7FF.
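
Because the mathematical digits kept their decimal values, software that accepts any decimal digit will still parse them as numbers. The sketch below assumes Python's standard `unicodedata` module and uses two characters from within the range cited above; the constant names are illustrative.

```python
import unicodedata

BOLD_ZERO = "\U0001d7ce"  # MATHEMATICAL BOLD DIGIT ZERO
MONO_NINE = "\U0001d7ff"  # MATHEMATICAL MONOSPACE DIGIT NINE

for ch in (BOLD_ZERO, MONO_NINE):
    print(unicodedata.name(ch), unicodedata.category(ch), unicodedata.decimal(ch))

# Python's int() accepts any characters with a decimal digit value.
parsed = int("\U0001d7d0\U0001d7d1")  # bold digits two and three
print(parsed)
```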

7 Tailored normalization forms 2003.06.01

The UTC is considering allowing limited tailoring of normalization forms. This would involve excluding certain specified sets of characters from decomposition, notably the CJK compatibility characters (which are actually canonical--not compatibility--decomposables).
Possible advantages are that it allows the graphic variations to be preserved; possible disadvantages are interoperability problems with different variants of normalization forms. Two documents discussing this issue are posted HERE and HERE.

Resolution: Closed. In view of the many problems which turned up when attempting to design a tailoring mechanism for normalization, the UTC has decided not to add a mechanism for tailored normalizations.
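
The behaviour that tailoring would have suppressed is easy to observe: a CJK compatibility ideograph has a canonical decomposition, so even NFC replaces it with the unified ideograph. The sketch below assumes Python's standard `unicodedata` module and uses U+F900 as one example.

```python
import unicodedata

COMPAT = "\uf900"  # CJK COMPATIBILITY IDEOGRAPH-F900

# A canonical mapping has no "<tag>" prefix in the decomposition field.
mapping = unicodedata.decomposition(COMPAT)
print(mapping)

# Both canonical forms replace the compatibility ideograph.
nfc = unicodedata.normalize("NFC", COMPAT)
nfd = unicodedata.normalize("NFD", COMPAT)
print(f"U+{ord(nfc):04X}", f"U+{ord(nfd):04X}")
```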

6 Unicode 4.0 Beta data 2003.03.21

The ~~beta review~~ of the Unicode 4.0 data files is open for public comment. We strongly encourage implementers to download these files and test them with their programs, well before the end of the beta period. The comment period ends March 21, 2003. However, comments that are not editorial will need to be reviewed by the Unicode Technical Committee (UTC). Comments received by March 3, 2003 will be in time to be reviewed at the next meeting of the UTC.

Resolution: Closed.

5 Object Replacement Char 2003.06.01

Whether to treat U+FFFC Object Replacement Character and the Interlinear Annotation Characters as "default ignorable" or to have a default visible representation.

Resolution: Closed. The UTC determined that the Object Replacement Character and the interlinear annotation characters should not be given the Default_Ignorable property.

4 Sharp S collation weight 2003.02.14

The default weighting in the UCA for the following character should have a tertiary difference from "ss" instead of a secondary difference.
    U+00DF (ß) LATIN SMALL LETTER SHARP S

Resolution: Accepted. This change will be incorporated into a subsequent revision of UTS #10, the collation standard. However, DIN takes a different approach so we are attempting to contact them to get more information.

3 Connecting Characters 2002.11.05

Add language indicating that glyphs for the following should normally be designed to connect.
    U+2013 (–) EN DASH
    U+2014 (—) EM DASH
[Related Internal Document L2/02-277]

Resolution: Rejected. UTC now considers it inadvisable to add such language, given widespread implementation.

2 Khmer character deprecation 2002.11.05

Deprecate the following characters:
    U+17A3 KHMER INDEPENDENT VOWEL QAQ
    U+17D3 KHMER SIGN BATHAMASAT
Mark the following as being discouraged:
    U+17B4 KHMER VOWEL INHERENT AQ
    U+17B5 KHMER VOWEL INHERENT AA
    U+17A4 KHMER INDEPENDENT VOWEL QAA
    U+17D8 KHMER SIGN BEYYAL

Resolution: Accepted. This change is to be incorporated into Unicode 4.0.

1 Lang tag deprecation 2003.02.14

Deprecate the Plane 14 Language Tags

Resolution: Closed. There is no change in status; additional clarification has been added to Unicode 4.0 text.