From: Uriah Eisenstein (firstname.lastname@example.org)
Date: Mon May 10 2010 - 07:32:25 CDT
Thank you for the detailed answer, Mr. Freytag. I will then consider
submitting at least an initial proposal (it will probably take a few weeks).
I'll try to contact participants in some projects which make use of
character decompositions, although I need to think about whether such character
fragments would be useful in themselves for exchange of information, rather
than functioning as convenient building components for other characters.
Is there anywhere I could find the justifications for adding the CJK Radicals
Supplement characters, or were these incorporated into Unicode as part of
Also, are the IDSs used internally by the IRG available anywhere public? I
know these are not an official part of the Unicode standard, but they would
make a nice use case :)
On Sat, May 8, 2010 at 11:40 PM, Asmus Freytag <email@example.com> wrote:
> On 5/8/2010 11:44 AM, Uriah Eisenstein wrote:
>> I've gone through the policies of submitting new characters and scripts
>> and they don't look encouraging :) But neither do they seem to reject the
>> idea of character fragments out of hand, as opposed to the reverse case -
>> characters which can be expressed using existing characters and combining
>> marks. In fact, the CJK Radicals Supplement block and the Hangul Jamo both
>> contain character fragments, in a way. But I suppose these already existed
>> in national standards rather than being introduced by Unicode.
>> In any case, examples I've seen of proposals cite experts and provide font
>> makers, neither of whom I have contact with. So I guess I'll drop it for
>> now, and hope that if someone takes it up I'll see it on the mailing list.
> While a font is ultimately required for a proposal to become adopted, it
> shouldn't be a bar to formally raising the issue for initial consideration.
> Once something is considered potentially acceptable, there's enough time to
> come up with fonts (for the purpose of printing charts) before the
> committees need to vote on final approval. Proposals can take years from
> initial consideration to publication....
> Your suggestion was that these fragments need to be enumerated for various
> purposes in software and that having a standard enumeration is beneficial.
> If you can document and support that assertion, I would encourage you to put
> it on record.
> Doing so would allow a discussion of whether a standard enumeration is
> indeed useful enough to incur the cost of standardization.
> In some ways, this would not be a run-of-the-mill character encoding
> proposal, because you are not asserting that these fragments need encoding
> for the purpose of directly expressing text. While that is the primary
> purpose of character encoding, there are purposes that are ancillary to
> this, that a universal character encoding such as Unicode must encompass.
> There is certainly some precedent for character codes that aren't limited
> to the primary purpose I mentioned, but, because they don't represent a
> standard situation, one needs to carefully argue why such uses need to be
> covered by standardization and if so, why doing that as character codes is
> That is different from the more usual task to document that an entity
> occurs in written or printed documents.
> The problem is, unless you actually put down all the details in a coherent
> proposal it's hard to judge correctly what the situation is. When you raise
> the question informally, all anyone can tell you is that an exceptional
> request is one that needs exceptional justification, which, while certainly
> correct, doesn't exactly help you or anyone to evaluate whether your
> proposal would meet the required level and type of justification.
>> On Sun, May 2, 2010 at 3:06 PM, Uriah Eisenstein <
>> firstname.lastname@example.org <mailto:email@example.com>> wrote:
>> Not exactly, but I suppose such Hanzi fragments could be used for
>> similar purposes - e.g. looking up characters by components, where
>> the available components may include non-character fragments. Some
>> fragments may be useful for IME purposes, but probably not all.
>> On Sat, May 1, 2010 at 8:57 PM, Edward Cherlin <firstname.lastname@example.org
>> <mailto:email@example.com>> wrote:
>> 2010/4/28 John H. Jenkins <firstname.lastname@example.org
>> > No. You could certainly write up a proposal and submit it to the UTC.
>> > Should the UTC feel the idea has merit, it would then move it on to WG2
>> > and/or the IRG.
>> > The main problem here is that there is a very strong desire to limit
>> > ideograph encoding to attested and documentable forms. Anything which does
>> > not exist in actual texts is not likely to be well-regarded.
>> I had the idea some years ago of writing up a proposal to encode the
>> hanzi fragments used in Cangjie Shurufa IMEs. These fragments are used
>> extensively in dozens of howto books on keyboarding in Cangjie. This
>> includes the pieces (mostly real characters, with some radicals) used
>> on keyboard labels, and the common forms they stand for. I didn't get
>> any interest from the Cangjie development community or the authors of
>> a book on Cangjie that I have, so I abandoned the idea.
>> Uriah, is this the sort of thing you have in mind?
>> > Similarly, the UTC has a strong preference not to encode anything which
>> > isn't in actual use. Proposals to encode characters because they would be
>> > useful if encoded even though they aren't actually being used right now are
>> > generally looked on with disfavor.
>> > On Apr 28, 2010, at 12:03 PM, Uriah Eisenstein wrote:
>> > Hello,
>> > My question is about common components of CJK Ideographs which are not
>> > encoded as independent Han characters (and perhaps indeed aren't). A good
>> > example is the right-hand part of the character 漢 itself: it is a distinct
>> > component appearing in multiple other characters, but is not encoded to the
>> > best of my knowledge. The same goes for the top part of 鳥 and 島, the
>> > surrounding part of 與 and 興 and several others. My question is whether there
>> > are any plans or discussions for encoding these fragments in
>> > (I haven't found anything about this in mailing list archives; I did find
>> > statements that Unicode does not intend to provide any decomposition data of
>> > Han characters :) And for good reasons. However, such fragments may well be
>> > useful for third-party software dealing with 漢字 glyph generation, lookup by
>> > components etc.)
>> > Thanks,
>> > Uriah Eisenstein
>> Edward Mokurai (默雷/धर्ममेघशब्दगर्ज/دھرممیگھشبدگر ج) Cherlin
>> Silent Thunder is my name, and Children are my nation.
>> The Cosmos is my dwelling place, the Truth my destination.