Re: tips on writing character proposal

From: Asmus Freytag <>
Date: Wed, 09 Nov 2011 22:00:45 -0800

On 11/9/2011 6:08 PM, Mark E. Shoulson wrote:
> On 11/09/2011 03:58 PM, Larson, Timothy E. wrote:
>> Hello!
>> I'm new here, but have already read some of the online documentation
>> for proposing new characters. I'm still a bit unsure how to go about
>> it. Or even who can do it. Can individuals submit ideas, or do you
>> need to be the representative of some agency or group? How much
>> supporting background information is deemed sufficient? Where do I
>> find details (more than just the pipeline table) of current pending
>> proposals?
> There are others here who will throw even more cold water on some of
> these ideas, but I can suggest that you read
> for some ideas
> about what is encodable and what isn't. You'll probably find plenty
> of exceptions, but it's a start.


Before you get totally discouraged, I'd like to point out that there are
few "open and shut" cases in character encoding. The chances of getting
your proposed characters encoded improve the better the use case and the
better the documented examples of actual use (usually in print or in
examples that "should" be convertible to print). The fact that you think a
character is "missing" is evidence that there's at least one potential user.

Your task, in writing a proposal, would be to document that you are not
alone (far from it) and that these symbols are used in text(s) on equal
footing with other symbols. Doing the research and writing a proposal
does take some work, and critics will be hovering to point out all
shortcomings. But that should help improve your proposal.
>> Here are my ideas in very abbreviated form. If these are
>> non-starters from the beginning, I'd as soon know it sooner rather
>> than later.
>> These first several self-descriptive shapes are simply things I've
>> seen suggested and wished for online for some time.
> These might well be non-starters. Think about the first question
> you'd be asked: Why should these be encoded? Is there any reason we
> should be considering these symbols "plain text" that need to be
> encoded as such? Or is it just because they're common simple
> geometric symbols? While it is true that a lot of simple geometric
> symbols have been encoded, it generally has not been *because* they
> are simple geometric symbols, but rather because they were encoded in
> some other standard once before, or because they are used as plain
> text in some settings.

Before you take this as a definitive answer, let me offer a different
opinion.

A common usage of these symbols in text is in "non-verbal" speech
bubbles in cartoons. While these bubbles may look hand-drawn, they are
very often actually typeset, the one exception being just those strings
of symbols.

Since, in the examples that I am thinking of, they are presented as
text and their layout (on a line) is in no way different than text
presentation, it's not possible to simply rule these out categorically.

When symbols, however arbitrary, can be demonstrated as being used as
part of writing, there's no good rationale to refuse their encoding.
Doing so would simply send the message that arbitrary symbols are fine
if they occur in just a subset of (more formal, e.g. mathematical) texts
or on electronic platforms, but not elsewhere. That seems in violation
of precedent and in violation of the universal scope of the standard.

Now, you may not find examples of all types of spiral. Unless logically
required by formal notation, I would, in that case, propose only those
that can be found in use. "Completion of the set" can be an argument
in favor of encoding, but not everything is a member of a set worth

>> The next several are a response to a perceived deficiency in
>> standardization of religious symbols. I suggest starting these
>> cultural symbols at 2BC0 to distinguish them from the
>> generic/geometric symbols earlier in the block. Very brief
>> description/background given.
>> 2BC0 ICHTHYS ="Jesus fish", symbol used by ancient Christians for
>> identification, denotes non-denominational and inter-denominational
>> Christianity in modern times
>> 2BC1 TRIQUETRA =three-lobed vesicae piscis, used in Christianity
>> and ancient/modern paganism
>> 2BC2 MENORAH =7-branched temple lamp, ancient symbol of Judaism
>> 2BC3 HANUKIAH =9-branched Hanukkah lamp
> Apply the same question. What makes these symbols plain text? To be
> sure, there are other religious symbols in Unicode, particularly in
> the MISCELLANEOUS SYMBOLS and DINGBATS blocks, but those are mainly
> there because they were formerly encoded in, say, Zapf Dingbats, or
> are commonly used as map symbols. (You might actually be able to find
> some support for these, though, but don't ask me where.)

I think these are great research candidates. I concur with the skeptics
here that the mere existence of a symbol (with well established function
and appearance) is not normally sufficient for encoding. You do need to
demonstrate that each is used in text and, at the minimum, that the
range of usage is the same as for other like symbols, but preferably you
have a "smoking gun" of some example where they show up in running text.



Your objection
> It's a very common mistake, in coming to Unicode, to think "Oh, it
> would be *so great* if these things were encoded!" But Unicode isn't
> about encoding what would be neat to encode. It's about encoding
> _text_, (including things that have been encoded before).
is well taken, but I think these are not ipso facto specious
suggestions. At least not when you compare them to "invented" symbols by
various proponents. On the contrary, they seem rather focused and sober
suggestions to extend some of the existing sub-sets of symbols by
characters that one can expect to be analogous in usage. I'm entirely
comfortable with considering these worthy of a measured response and
detailed discussion rather than an off-hand dismissal.

For a formal decision there will have to be a bit of research, and a
formal proposal. That goes without saying.

Received on Thu Nov 10 2011 - 00:07:30 CST
