From: Christopher Fynn (cfynn@gmx.net)
Date: Sun May 30 2004 - 14:29:33 CDT
Sorry about the garbled Subject line in my previous post. I don't know
how that happened, as the original in my Sent folder looks OK.
Christopher Fynn wrote:
> John Hudson wrote:
>
> ....
>
>> I have been thinking today that part of the reason for the debate is
>> that Unicode has a singular concept of 'script', a bucket into which
>> variously shaped concepts of writing systems must be put or rejected.
>> I don't think there is anything conceptually wrong with the idea that
>> specific instances of a single script might be separately encoded if
>> there is a need or desire to distinguish them in plain text. It just
>> happens that Unicode has only one word that can be applied to such
>> instances, and that is 'script'. It seems clear to me now that what
>> Unicode calls a script needn't necessarily be what semiticists, or
>> anyone else, calls a script. A functional Unicode definition of
>> script might be formed as: a finite collection of characters that can
>> be distinguished in plain text from other collections of characters.
>
>
>
>
> John
>
> "Script" is already defined in ISO 10646 as:
>
> <<4.35 script: A set of graphic characters used for the written form
> of one or more languages.>>
>
> and "graphic character" is defined as:
>
> << 4.20 graphic character: A character, other than a control function,
> that has a visual representation normally handwritten, printed, or
> displayed.>>
>
> So I guess if any further definition of "script" is necessary it
> should be based on this.
>
> Further the (draft?) ISO 15924 standard uses the same definition
>
> << 3.7 script
>   A set of graphic characters used for the written form of one or
>   more languages. (ISO/IEC 10646-1) (fr 3.6 écriture)>>
>
> but adds an extra note:
>
> << NOTE 1: A script, as opposed to an arbitrary subset of characters,
> is defined in distinction to other scripts; in general, readers of
> one script may be unable to read the glyphs of another script easily,
> even where there is a historic relation between them (see 3.9).>>
>
> [ 3.9 script variant
>   A particular form of one script which is so distinctive a
>   rendering as to almost be considered a unique script in itself.
>   (fr 3.9 variante d’écriture)]
>
> With regard to historic & archaic scripts, TUS itself states:
> "The overall capacity for more than a million characters is more than
> sufficient for all known character encoding requirements, including
> full coverage of all minority and historic scripts of the world."
> (1.0)
>
> and
>
> "As the universal character encoding scheme, the Unicode Standard must
> also respond to scholarly needs. To preserve world cultural heritage,
> important archaic scripts are encoded as proposals are developed."
> (1.1.2)
>
> So there is a clear statement of purpose to give full coverage to
> *all* minority and historic scripts and to encode "important" archaic
> scripts.
>
> In 1.2 "Design Goals" TUS states:
> "The primary goal of the development effort for the Unicode Standard
> was to remedy two serious problems common to most multilingual
> computer programs. The first problem was the overloading of the font
> mechanism when encoding characters."
>
> Telling people who propose a script that they can "just use a
> different font" could very easily contradict this stated goal.
>
>> There are very real issues of software implementation, font
>> development, collation, text indexing and searching, etc. that arise
>> from encoding multiple instances of what some users consider a single
>> script, whether users in general opt to make the distinction in plain
>> text or not, by using the separate character collections or unifying
>> text in a single character collection and making the distinction at a
>> higher level. I'm beginning to think that our time would be better
>> spent thinking about those issues.
>>
> These are of course real issues - particularly collation, text
> indexing, searching and - where a written language occurs in several
> scripts - the ability to display text encoded in one script with
> glyphs of another. Establishing standard, straightforward and widely
> supported means to deal with these issues is a worthy goal. In many
> cases the solutions for these problems are in fact already specified
> or pretty clear - and, relatively speaking, these are reasonably
> straightforward to implement.
>
> Their absence - or lack of support - should not be a reason to
> reject a script proposal on the grounds that "it will cause
> difficulties" - this is the sort of argument put forward by PR
> China when they submitted their proposal for a host of precomposed
> Tibetan characters. When Indic scripts were first encoded, a whole
> software infrastructure and font/rendering technologies which were not
> then available in common desktop operating systems were assumed - and
> it has taken a decade for this encoding to be anything like widely
> supported on a practical level.
>
> IMO, in the long term, encoding of archaic scripts is going to benefit
> the whole scholarly community. When children discover all kinds of
> scripts on their computers they are going to become curious and play
> with them and some of them will be inspired to go out and find out
> more about these scripts. Some of these will develop a serious
> interest and a few will end up being the Palaeographers, Semiticists,
> Sanskritists and so on of tomorrow.
>
> - Chris
>
>
>