Definition of Script etc.

From: Christopher Fynn
Date: Sun May 30 2004 - 14:29:33 CDT

    Sorry about the garbled Subject line in my previous post. I don't know
    how that happened, as the original in my Sent folder looks OK.

    Christopher Fynn wrote:

    > John Hudson wrote:
    > ....
    >> I have been thinking today that part of the reason for the debate is
    >> that Unicode has a singular concept of 'script', a bucket into which
    >> variously shaped concepts of writing systems must be put or rejected.
    >> I don't think there is anything conceptually wrong with the idea that
    >> specific instances of a single script might be separately encoded if
    >> there is a need or desire to distinguish them in plain text. It just
    >> happens that Unicode has only one word that can be applied to such
    >> instances, and that is 'script'. It seems clear to me now that what
    >> Unicode calls a script needn't necessarily be what semiticists, or
    >> anyone else, calls a script. A functional Unicode definition of
    >> script might be formed as: a finite collection of characters that can
    >> be distinguished in plain text from other collections of characters.
    > John
    > "Script" is already defined in ISO 10646 as:
    > <<4.35 script: A set of graphic characters used for the written form
    > of one or more languages.>>
    > and "graphic character" is defined as:
    > << 4.20 graphic character: A character, other than a control function,
    > that has a visual representation normally handwritten, printed, or
    > displayed.>>
    > So I guess if any further definition of "script" is necessary it
    > should be based on this.
    > Further, the (draft?) ISO 15924 standard uses the same definition
    > << 3.7 script A set of graphic characters used for the written
    > form of one or more languages. (ISO/IEC 10646-1) (fr 3.6 écriture) >>
    > but adds an extra note:
    > << NOTE 1: A script, as opposed to an arbitrary subset of
    > characters, is defined in distinction to other scripts; in
    > general, readers of one script may be unable to read the
    > glyphs of another script easily, even where there is a
    > historic relation between them (see 3.9). >>
    > [ 3.9 script variant
    > A particular form of one script which is so
    > distinctive a rendering as to almost be considered
    > a unique script in itself. (fr 3.9 variante d'écriture) ]
    > With regard to historic & archaic scripts TUS itself states
    > "The overall capacity for more than a million characters is more than
    > sufficient for all known character encoding requirements, including
    > full coverage of all minority and historic scripts of the world."
    > (1.0)
    > and
    > "As the universal character encoding scheme, the Unicode Standard must
    > also respond to scholarly needs. To preserve world cultural heritage,
    > important archaic scripts are encoded as proposals are developed."
    > (1.1.2)
    > So there is a clear statement of purpose to give full coverage to
    > *all* minority and historic scripts and to encode "important" archaic
    > scripts.
    > In 1.2 "Design Goals" TUS states:
    > "The primary goal of the development effort for the Unicode Standard
    > was to remedy two serious problems common to most multilingual
    > computer programs. The first problem was the overloading of the font
    > mechanism when encoding characters."
    > Telling people who propose a script that they can "just use a
    > different font" could very easily contradict this stated goal.
    >> There are very real issues of software implementation, font
    >> development, collation, text indexing and searching, etc. that arise
    >> from encoding multiple instances of what some users consider a single
    >> script, whether users in general opt to make the distinction in plain
    >> text or not, by using the separate character collections or unifying
    >> text in a single character collection and making the distinction at a
    >> higher level. I'm beginning to think that our time would be better
    >> spent thinking about those issues.
    > These are of course real issues - particularly collation, text
    > indexing, searching and - where a written language occurs in several
    > scripts - the ability to display text encoded in one script with
    > glyphs of another. Establishing standard, straightforward and widely
    > supported means to deal with these issues is a worthy goal. In many
    > cases the solutions for these problems are in fact already specified
    > or pretty clear - and, relatively speaking, these are reasonably
    > straightforward to implement.
    > Their absence - or lack of support - should not be a reason to
    > reject a script proposal on the grounds that "it will cause
    > difficulties" - this is the sort of argument put forward by PR
    > China when they submitted their proposal for a host of precomposed
    > Tibetan characters. When Indic scripts were first encoded a whole
    > software infrastructure and font/rendering technologies which were not
    > then available in common desktop operating systems was assumed - and
    > it has taken a decade for this encoding to be anything like widely
    > supported on a practical level. The solutions for these problems are
    > already specified or pretty clear - and, relatively speaking,
    > reasonably straightforward to implement.
    > IMO, in the long term, encoding of archaic scripts is going to benefit
    > the whole scholarly community. When children discover all kinds of
    > scripts on their computers they are going to become curious and play
    > with them and some of them will be inspired to go out and find out
    > more about these scripts. Some of these will develop a serious
    > interest and a few will end up being the Palaeographers, Semiticists,
    > Sanskritists and so on of tomorrow.
    > - Chris
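
One of the issues raised above, searching text in a language that is written in more than one script, can be illustrated with a small folding table. The following is only a rough sketch and is not taken from the message. It assumes the Phoenician letters sit at U+10900..U+10915 in the same order as the 22 non-final Hebrew letters, and the names `fold` and `matches` are hypothetical helpers invented for the illustration.

```python
# Sketch: fold Phoenician letters and Hebrew final forms onto the
# non-final Hebrew letters, so one search key matches either script.
# The mapping table and function names are illustrative assumptions.

# The 22 non-final Hebrew letters, alef through tav, as code points.
HEBREW_BASE = [
    0x05D0, 0x05D1, 0x05D2, 0x05D3, 0x05D4, 0x05D5, 0x05D6, 0x05D7,
    0x05D8, 0x05D9, 0x05DB, 0x05DC, 0x05DE, 0x05E0, 0x05E1, 0x05E2,
    0x05E4, 0x05E6, 0x05E7, 0x05E8, 0x05E9, 0x05EA,
]

# Assumed one-to-one correspondence: U+10900 + i maps to HEBREW_BASE[i].
PHOENICIAN_TO_HEBREW = {0x10900 + i: cp for i, cp in enumerate(HEBREW_BASE)}

# Hebrew final forms folded to their non-final counterparts.
HEBREW_FINALS = {
    0x05DA: 0x05DB,  # final kaf   -> kaf
    0x05DD: 0x05DE,  # final mem   -> mem
    0x05DF: 0x05E0,  # final nun   -> nun
    0x05E3: 0x05E4,  # final pe    -> pe
    0x05E5: 0x05E6,  # final tsadi -> tsadi
}

FOLD_TABLE = {**PHOENICIAN_TO_HEBREW, **HEBREW_FINALS}


def fold(text: str) -> str:
    """Return a search key with both scripts folded to non-final Hebrew."""
    return text.translate(FOLD_TABLE)


def matches(query: str, text: str) -> bool:
    """True if the folded query occurs anywhere in the folded text."""
    return fold(query) in fold(text)
```

With a table like this the text itself stays encoded in whichever script its author chose. The unification happens only in the search key, which is the "higher level" handling that the quoted discussion refers to.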

    This archive was generated by hypermail 2.1.5 : Sun May 30 2004 - 14:33:45 CDT