Re: Ideographic Description

From: Kevin Bracey (kevin.bracey@pacemicro.com)
Date: Wed Sep 08 1999 - 11:53:04 EDT


Here goes my attempt at answering this; this'll show everyone whether I
was awake or not during the conference :)

In message <9909081503.AA20801@unicode.org>
          Marco.Cimarosti@icl.com wrote:
          
>
> 1) What is IDS really for? Why has this feature been introduced in
> ISO-10646?
>

It's to allow you some way to describe the character you want even though
it has not (yet) been encoded properly in the UCS.
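
To make that concrete, here is a rough sketch of how an IDS reads as data: it is a prefix-notation string, with an operator followed by its operands. The sketch assumes the twelve Ideographic Description Characters U+2FF0..U+2FFB, of which U+2FF2 and U+2FF3 take three operands and the rest take two. The names `parse_ids` and `is_valid_ids` are made up for illustration, and treating every non-operator character as an acceptable component is a simplification on my part, not what the standard says.

```python
# Ideographic Description Characters assumed here: U+2FF0..U+2FFB.
# U+2FF2 and U+2FF3 take three operands; the others take two.
TERNARY = {0x2FF2, 0x2FF3}


def idc_arity(ch):
    """Return the operand count if ch is an IDC, else 0."""
    cp = ord(ch)
    if 0x2FF0 <= cp <= 0x2FFB:
        return 3 if cp in TERNARY else 2
    return 0


def parse_ids(s, pos=0):
    """Parse one IDS starting at s[pos]; return (tree, next_pos).

    Simplification: any character that is not an IDC is accepted
    as a component, without checking that it is an ideograph or radical.
    """
    if pos >= len(s):
        raise ValueError("truncated IDS")
    ch = s[pos]
    arity = idc_arity(ch)
    if arity == 0:
        return ch, pos + 1
    pos += 1
    parts = []
    for _ in range(arity):
        part, pos = parse_ids(s, pos)
        parts.append(part)
    return (ch, *parts), pos


def is_valid_ids(s):
    """True if s is exactly one well-formed IDS under the assumptions above."""
    try:
        _, end = parse_ids(s)
    except ValueError:
        return False
    return end == len(s)
```

For example, `is_valid_ids("\u2FF0\u6C35\u6BCF")` accepts "left-to-right, U+6C35, U+6BCF", while `is_valid_ids("\u2FF0\u6C35")` rejects a sequence whose operator is missing an operand. That particular sequence describes a shape that is already encoded, so it is only useful as an illustration of the syntax.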

> 2) Will these additions be integrated in Unicode as well?

Yes, in Unicode 3.0.
>
> 3) Document [1] explicitly states that an IDS "describes the ideograph
> in the abstract form. It is not interpreted as a composed character and
> does not have any rendering implication." -- OK: pretty rendering of IDSs
> is not *required* of conformant applications, but is it *forbidden*?

It's not required, and it's not forbidden. A composed ideograph of the
correct form would just be a kind of ligature, and implementations are free
to represent sequences of glyphs however they like. For a particular
application you might have a font that contains the glyph and substitutes it
in, as a ligature, for an IDS describing it.
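
As a rough sketch of that ligature idea, assuming nothing about any real font format: the table `LIGATURES`, the glyph name `glyph.sea` and the function `shape` below are all hypothetical. The renderer looks for the longest sequence it has a glyph for and otherwise falls back to showing the characters one at a time.

```python
# Hypothetical ligature table: IDS string -> glyph name in some font.
LIGATURES = {
    "\u2FF0\u6C35\u6BCF": "glyph.sea",
}


def shape(text, ligatures=LIGATURES):
    """Greedy longest-match substitution.

    Sequences found in the table become a single glyph name; every other
    character passes through on its own as a "U+XXXX" placeholder.
    """
    out = []
    i = 0
    max_len = max((len(key) for key in ligatures), default=0)
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            glyph = ligatures.get(text[i:i + n])
            if glyph is not None:
                out.append(glyph)
                i += n
                break
        else:
            # No ligature starts here: emit the character unchanged.
            out.append("U+%04X" % ord(text[i]))
            i += 1
    return out
```

An IDS with no entry in the table is simply drawn as its operator and components in sequence, which is consistent with pretty rendering being neither required nor forbidden.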

>
> 4) Would it be conformant to use an IDS in place of a character already
> encoded within CJK Unified Ideographs?

No. Actually, it's the sort of conformance requirement that would be
impossible to enforce, but it would certainly be frowned upon mightily.

>
> 5) What if one only uses Description Components (DC) from the new
> "Kangxi Radicals" and "CJK Radicals Supplement": would it be possible to
> build valid IDSs for *all* the encoded CJK Unified Ideographs using only
> these elements?

Good question. I'm pretty certain the answer is that you could encode most
of the ideographs, but far from all. Certainly many of the fine details of
the characters would be lost.

>
> 6) Some of the Kangxi radicals (especially those with stroke number >=
> 10) could be expressed with an IDS, using simpler components. Would it be
> considered conformant to make an IDS that "decomposes" a Kangxi radical?

No.

>
> 7) Will ISO/IEC ever publish a list of IDSs for existing CJK Unified
> Ideographs? (I.e. a sort of decomposition mapping file)?
>

I hope not.

> As you may have guessed by my inappropriate terminology, I have absolutely
> no liaison with any standards body or committee (oh, well, I have been a
> private member of Unicode, but just for one year), and I discovered these
> documents only by chance. However, the possibility of implementing an IDS
> renderer sounds very appealing to me, because it reminds me of an idea -
> that I have been cherishing for a long time - of an 8-bit character set
> for CJK characters, that only encodes the smallest possible set of basic
> "components".
>

Nice in theory, crummy in practice. Each character would need so many
component parts that you would end up with something about 3 times as large
as the standard UTF-16 representation, even at 1 byte per component. The
renderer would also just end up needing a map from each IDS to the correct
glyph, because you're not going to get typographically acceptable results
by generating the glyphs algorithmically from the IDS.
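
The size argument can be put as back-of-envelope arithmetic. This is only a sketch under stated assumptions: every element (component or operator) costs 1 byte, only two-operand operators are used, and the character being replaced takes 2 bytes in UTF-16. With two-operand operators, n components need n - 1 operators, so 2n - 1 elements in all. The function name `ids_bytes` is made up for the example.

```python
UTF16_BYTES = 2  # one ideograph in the Basic Multilingual Plane


def ids_bytes(components, bytes_per_element=1):
    """Size of a description built from `components` basic components.

    Assumes two-operand operators only, so n components need n - 1
    operators, giving 2n - 1 elements in total.
    """
    if components < 1:
        raise ValueError("need at least one component")
    return (2 * components - 1) * bytes_per_element


def size_ratio(components):
    """How many times larger the description is than the UTF-16 form."""
    return ids_bytes(components) / UTF16_BYTES
```

On these assumptions the description is already bigger than the UTF-16 form at two components, and it passes 3 times the size somewhere between three and four components. Three-operand operators would shave a byte here and there, but not enough to change the picture.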

-- 
Kevin Bracey, Senior Software Engineer
Pace Micro Technology plc                     Tel: +44 (0) 1223 518566
645 Newmarket Road                            Fax: +44 (0) 1223 518526
Cambridge, CB5 8PB, United Kingdom            WWW: http://www.acorn.co.uk/


