Ideographic Variation Database

PRI 108: Combined registration of the Adobe-Japan1 collection and of sequences in that collection

Introduction

A submission for the "Combined registration of the Adobe-Japan1 collection and of sequences in that collection" has been received by the IVD registrar. This submission is currently under review according to the procedures of UTS#37, Ideographic Variation Database, with an expected close date of 2007-11-25. This submission is a revision of PRI 98.

At the end of the review period, the submission was incorporated into the 2007-12-14 version of the IVD, with minor adjustments, as 14,647 registered IVSes.

This page remains available for archival purposes.

Review instructions

Reviewers are encouraged to comment on any aspect of the submissions, but more particularly on:

whether the intent of a proposed collection is appropriately described

whether the glyphic subset corresponding to a proposed sequence is indeed a glyphic subset of the base character for the sequence

whether the proposed sequences are congruent with the scope of their collection, or whether a new collection may be more appropriate

All comments should be sent via the reporting form and will be forwarded to the submitter. The content of the submission may be adjusted during the review period to account for the comments received.

Submission details

The content of this section has been provided by the submitter.

Submitter

Name and address of registrant: Adobe Systems Incorporated, 345 Park Avenue, San Jose, CA 95110-2704 USA

Names and email addresses of representatives: Ken Lunde (lunde@adobe.com) & Eric Muller (emuller@adobe.com)

URL of the web site describing the collection: Adobe Tech Note #5078

Suggested identifier for the collection: Adobe-Japan1

Pattern for the sequence identifiers: CID\+[0-9]+
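The identifier pattern above is a regular expression. As a minimal sketch, assuming the pattern is meant to match an identifier in full (the function name `is_valid_sequence_id` is hypothetical and not part of the submission), a candidate identifier could be checked like this:

```python
import re

# Pattern for the sequence identifiers, as given in the submission.
SEQUENCE_ID_PATTERN = re.compile(r"CID\+[0-9]+")

def is_valid_sequence_id(identifier: str) -> bool:
    """Return True if the whole identifier matches the pattern."""
    # Using fullmatch is an assumption: it rejects leading or trailing text.
    return SEQUENCE_ID_PATTERN.fullmatch(identifier) is not None

print(is_valid_sequence_id("CID+15387"))  # True
print(is_valid_sequence_id("CID15387"))   # False, the "+" is missing
```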

The Adobe-Japan1 Character Collection

The Adobe-Japan1 Character Collection is widely used as the basis for the glyph complement of fonts for the Japanese market. Successive versions of the collection are identified by appending a supplement number, e.g. Adobe-Japan1-6 (the most current supplement).

Adobe-Japan1-6 enumerates 23,058 glyphs, specifically CIDs 0 through 23057, and contains nearly 15,000 kanji (also known as ideographs). The exact number of kanji is 14,664; they occupy the following CIDs and CID ranges: 656, 1125-7477, 7633-7886, 7961-8004, 8266-8267, 8284, 8285, 8359-8717, 13320-15443, 16779-20316, and 21071-23057.
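The kanji count can be checked against the listed CIDs and CID ranges. The following is only an illustrative sketch; the names `parse_ranges` and `is_kanji_cid` are hypothetical helpers introduced here, and the range list is copied from the paragraph above.

```python
# CIDs and CID ranges of the Adobe-Japan1-6 kanji, copied from the text.
KANJI_CID_RANGES = (
    "656, 1125-7477, 7633-7886, 7961-8004, 8266-8267, 8284, 8285, "
    "8359-8717, 13320-15443, 16779-20316, 21071-23057"
)

def parse_ranges(spec: str) -> list[range]:
    """Turn '656, 1125-7477' into [range(656, 657), range(1125, 7478)]."""
    ranges = []
    for item in spec.split(","):
        first, _, last = item.strip().partition("-")
        # A single CID has no "-", so it is both the start and the end.
        ranges.append(range(int(first), int(last or first) + 1))
    return ranges

kanji_ranges = parse_ranges(KANJI_CID_RANGES)
kanji_count = sum(len(r) for r in kanji_ranges)
print(kanji_count)  # 14664

def is_kanji_cid(cid: int) -> bool:
    """Return True if the CID falls in one of the listed kanji ranges."""
    return any(cid in r for r in kanji_ranges)
```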

Of the 14,664 kanji in Adobe-Japan1-6, the following twenty-seven are unencoded and lack a parent form, meaning that they cannot be handled through the use of IVSes: 13763, 13782, 14145, 14174, 14278, 15429, 15431, 15434, 20068-20071, 20088, 20096-20097, 20125, 20141, 20149, 20153, 20156, 20176, 20180, 20194, 20204, 20247, 20256 and 20260.
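As a quick sanity check, the list above can be expanded into individual CIDs to confirm that it names twenty-seven kanji. This is a sketch only; `expand_cids` and `lacks_parent_form` are hypothetical names, and the data is copied from the paragraph above.

```python
# CIDs of the kanji that are unencoded and lack a parent form, from the text.
UNENCODED_KANJI = (
    "13763, 13782, 14145, 14174, 14278, 15429, 15431, 15434, 20068-20071, "
    "20088, 20096-20097, 20125, 20141, 20149, 20153, 20156, 20176, 20180, "
    "20194, 20204, 20247, 20256, 20260"
)

def expand_cids(spec: str) -> set[int]:
    """Expand a list such as '15434, 20068-20071' into a set of CIDs."""
    cids: set[int] = set()
    for item in spec.split(","):
        first, _, last = item.strip().partition("-")
        cids.update(range(int(first), int(last or first) + 1))
    return cids

unencoded = expand_cids(UNENCODED_KANJI)
print(len(unencoded))  # 27

def lacks_parent_form(cid: int) -> bool:
    """Return True for a kanji CID that cannot be given an IVS."""
    return cid in unencoded
```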

The Adobe-Japan1-6 character collection is defined in Adobe Tech Note #5078.

Adobe-Japan1 IVD Collection & Assignments

The purpose of the proposed Adobe-Japan1 IVD collection is to support in Unicode plain text the distinctions which are made by the Adobe-Japan1 Character collection.

This submission covers the content of Adobe-Japan1-6. Future supplements of the Adobe-Japan1 Character Collection, if any, will be addressed by corresponding future submissions to the IVD.

The proposed Adobe-Japan1 IVS collection and the proposed Adobe-Japan1 IVS sequences are provided as two separate files. The second file contains 14,639 lines (excluding the header lines).

Note that all Adobe-Japan1-6 kanji, except the twenty-seven pointed out above, are given IVS assignments, including those that have only one form assigned. This is to ensure that each Adobe-Japan1-6 kanji can be uniquely and explicitly identified without referencing its default (IVS-less) encoding, and because kanji that are variants of such kanji may be added in future Adobe-Japan1 supplements. As an example, observe the single line for the kanji assigned to U+3405 in the sequences file: 3405 E0100; Adobe-Japan1; CID+15387
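To illustrate how such a line can be read, here is a minimal sketch that splits the example line into its fields and builds the corresponding two-code-point string. The field layout is inferred from the example line itself; `IvsEntry` and `parse_sequence_line` are hypothetical names, not part of any published tool.

```python
from typing import NamedTuple

class IvsEntry(NamedTuple):
    base: int        # code point of the base character
    selector: int    # code point of the variation selector
    collection: str  # collection identifier, e.g. "Adobe-Japan1"
    identifier: str  # sequence identifier, e.g. "CID+15387"

def parse_sequence_line(line: str) -> IvsEntry:
    """Parse a line of the form '3405 E0100; Adobe-Japan1; CID+15387'."""
    sequence, collection, identifier = (f.strip() for f in line.split(";"))
    base_hex, selector_hex = sequence.split()
    return IvsEntry(int(base_hex, 16), int(selector_hex, 16),
                    collection, identifier)

entry = parse_sequence_line("3405 E0100; Adobe-Japan1; CID+15387")
# The IVS in plain text is the base character followed by the selector.
text = chr(entry.base) + chr(entry.selector)
print(entry.identifier, [f"U+{ord(c):04X}" for c in text])
# prints: CID+15387 ['U+3405', 'U+E0100']
```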

Also note that the highest variation selector used is U+E010E. It is used with U+9089, meaning that fifteen kanji are assigned to that base character and are distinguished through the use of IVSes. Below are all fifteen lines for U+9089:

9089 E0100; Adobe-Japan1; CID+6930
9089 E0101; Adobe-Japan1; CID+13407
9089 E0102; Adobe-Japan1; CID+14241
9089 E0103; Adobe-Japan1; CID+14242
9089 E0104; Adobe-Japan1; CID+14243
9089 E0105; Adobe-Japan1; CID+14244
9089 E0106; Adobe-Japan1; CID+14245
9089 E0107; Adobe-Japan1; CID+14246
9089 E0108; Adobe-Japan1; CID+14247
9089 E0109; Adobe-Japan1; CID+14248
9089 E010A; Adobe-Japan1; CID+14249
9089 E010B; Adobe-Japan1; CID+14250
9089 E010C; Adobe-Japan1; CID+14251
9089 E010D; Adobe-Japan1; CID+14252
9089 E010E; Adobe-Japan1; CID+20233

Adobe-Japan1 CIDs are represented as decimal integers.
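The difference between the hexadecimal code points and the decimal CIDs matters when the lines are processed. As a sketch under the assumption that each line has the three semicolon-separated fields seen above (the helper name `selector_and_cid` is hypothetical), the fifteen lines for U+9089 can be read as follows:

```python
# The fifteen lines for U+9089, copied from the listing above.
LINES = """\
9089 E0100; Adobe-Japan1; CID+6930
9089 E0101; Adobe-Japan1; CID+13407
9089 E0102; Adobe-Japan1; CID+14241
9089 E0103; Adobe-Japan1; CID+14242
9089 E0104; Adobe-Japan1; CID+14243
9089 E0105; Adobe-Japan1; CID+14244
9089 E0106; Adobe-Japan1; CID+14245
9089 E0107; Adobe-Japan1; CID+14246
9089 E0108; Adobe-Japan1; CID+14247
9089 E0109; Adobe-Japan1; CID+14248
9089 E010A; Adobe-Japan1; CID+14249
9089 E010B; Adobe-Japan1; CID+14250
9089 E010C; Adobe-Japan1; CID+14251
9089 E010D; Adobe-Japan1; CID+14252
9089 E010E; Adobe-Japan1; CID+20233
""".splitlines()

def selector_and_cid(line: str) -> tuple[int, int]:
    """Return (variation selector code point, CID) for one line."""
    sequence, _collection, identifier = (f.strip() for f in line.split(";"))
    selector = int(sequence.split()[1], 16)  # code points are hexadecimal
    cid = int(identifier.split("+", 1)[1])   # CIDs are decimal
    return selector, cid

cid_by_selector = dict(selector_and_cid(line) for line in LINES)
highest = max(cid_by_selector)
print(len(cid_by_selector), f"U+{highest:X}", cid_by_selector[highest])
# prints: 15 U+E010E 20233
```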

Representative Glyph Charts

Representative glyphs for the submitted sequences are available in PDF format, in two sets of charts. Both sets show the sequences indexed by their base character, in code point order. The complete charts (4.4MB) show all the submitted sequences. The partial charts (776KB) show only the characters for which multiple sequences are submitted.

Comparison with the previous submission

This submission is a revision of PRI 98. Here is a summary of the changes:

sequences for the following twenty CIDs have been removed: 13763, 13782, 14145, 14174, 14278, 20088, 20096, 20097, 20125, 20141, 20149, 20153, 20156, 20176, 20180, 20194, 20204, 20247, 20256, 20260.
a sequence has been added for CID 12869 (on U+6CE8).
sequences for the following fourteen CIDs have been changed to use a different base character: 13647, 13691, 13747, 13922, 14061, 14109, 14188, 14189, 20128, 20271, 20290, 7725, 7872, 8542.
sequences for the following eleven CIDs have been changed as a result of the other changes, to use the first available variation selector: 1229, 2554, 3281, 6163, 13645, 13833, 13967, 14146, 14175, 14228, 20114.
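The submission does not spell out how "the first available variation selector" is chosen. One plausible reading, sketched here purely as an assumption, is the lowest selector in the range U+E0100 through U+E01EF that is not yet used with the same base character. The function name `first_available_selector` is hypothetical.

```python
FIRST_IVS_SELECTOR = 0xE0100  # VARIATION SELECTOR-17, first one used above
LAST_IVS_SELECTOR = 0xE01EF   # VARIATION SELECTOR-256

def first_available_selector(used: set[int]) -> int:
    """Return the lowest selector not yet used with a base character.

    Assumption: "first available" means lowest unused code point.
    """
    for selector in range(FIRST_IVS_SELECTOR, LAST_IVS_SELECTOR + 1):
        if selector not in used:
            return selector
    raise ValueError("no variation selector left for this base character")

# If a sequence using U+E0101 is removed, the next assignment reuses it.
print(f"U+{first_available_selector({0xE0100, 0xE0102}):X}")  # U+E0101
```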

Comments received

This page may be updated from time to time to inform reviewers of some of the comments received.