Re: Repertoire, encoding, and representation (Was: Charsets + encoding + codesets)

From: Andrea Vine
Date: Thu Oct 09 1997 - 19:49:24 EDT

Keld Jørn Simonsen wrote:
> Andrea Vine writes:
> > Glenn, Keld, Ken, Yves, et al,
> >
> > I have a question for all of you regarding this terminology debate. What is the goal
> > here? Are you trying to nail down precision in the terms for use in future
> > documents?
> Yes, that was at least my intention.
> > The reason I ask this is that I'm trying to understand the goal of the documents. I
> > have in the past assumed that they are written to explain the standards to the people
> > who need to implement them. Having attempted to read a few, I think my assumption to
> > be incorrect.
> >
> > Terminology precision to the degree being discussed here will not change the
> > readability of the standards documents for me. Simple, straightforward prose and
> > specific examples will.
> >
> > The clarity of the writing can be damaged when trying to be precise. There is a
> > point of diminishing return when readability is sacrificed for precision. The less
> > readable a standard is, the more likely implementers won't be able to follow it.
> Well, there may be a difference between what is discussed beforehand
> between experts discussing terminology, and then how it
> is expressed in the actual standards text.
> Keld

Hmm, "may be" and "experts". The actual standards texts being discussed are already
written. There may or may not be a difference in what's discussed on this list vs.
what gets written into the next revision of the standards being discussed, and

What is the sense of "expert" here? An expert on Unicode implementation? An expert
on standards creation? An expert on words? I'm just asking you (in the plural
sense) to remember what happens when someone is trying to read the standard. True,
some of us are bookish types who insist on understanding the nature of the
terminology. But I suspect the vast majority of the readers and implementers use an
approach along the lines of "It says here 'blah blah blah character set.' I know what
a character set is, so I don't have to look it up under Standard #nnn." In other
words, the precise terminology is more useful for judging an implementation than for
creating one.

Are the standards written to facilitate conformant implementation, or to facilitate
the judging of conformance? The difference lies in the audience.

What prompted me to make this observation is not just the experience of slogging
through (pardon the slang) standards, but looking at the terms being discussed. Will
defining the precise meaning of character, character set, encoding, coded character
set, character encoding, etc., clarify the standard? To whom?

There is an added level of complexity in that for many of the implementers I know,
English is a second language. When I read a technical description in a second
language, precision to the level being discussed here is lost on me. I imagine that
is the experience of many. I realize this is not the ideal situation, but it is
reality, at least around Silicon Valley.

