From: Erkki Kolehmainen (erkki.kolehmainen@kotus.fi)
Date: Fri May 20 2005 - 09:07:23 CDT
When trying to follow this thread, I fail to understand why Mr. Morfin
keeps on insisting that a script could and should be defined as a closed
set (other than e.g. as a snapshot of what has been defined in a given
version of ISO 10646 _and_ Unicode). We may and do still encounter new
Latin characters (and more so: new characters of other scripts), many of
which will undoubtedly lead to a discussion on whether they are truly
new - yet to be encoded - characters or glyph variations or whatever. -
Nevertheless, a rose is a rose.
Incidentally, as there is no absolute truth, it gets to be truly
impossible to agree on what characters (of a given script) are used for
a given language. We honestly tried that a few years ago in a CEN
Workshop for the languages of Europe (which we couldn't agree on, either).
Erkki I. Kolehmainen
JFC (Jefsey) Morfin wrote:
> At 17:56 18/05/2005, Peter Constable wrote:
>
>> > But when you have orthogonal things to relate, as you want to in
>> > several documents, you need to have a relational system.
>>
>> I don't disagree; I was only objecting to the critique that ISO 15924 is
>> faulty because it isn't relational in the sense you refer to.
>
>
> ???
> Here I am lost. I do not see how ISO 15924 could be faulty. It is a
> list. There are billions of lists. They are what they are: lists. They can
> be piles of names, numbers, poems, etc.
>
> Now where there is a problem is when you want to use some of their items
> without having given them a meaning first, which means "item=definition".
>
>> > As a computer you go by binary stuff. If a computer is to relate French and
>> > Latn, it must have a binary element it can compare using a program.
>> >
>> > Now, a person with a bit of logic will do the same.
>> >
>> > As long as you do not tell me what is in Latn, I cannot tell you if Latn
>> > applies to French.
>>
>> If you want to dumb down to a level of not assuming anything that's not
>> stated explicitly, then you are right. I doubt there are many here who
>> operate that way on a regular basis.
>
>
> This is the difference between poets and programmers. We both live in the
> same world, but the poet believes his dream is enough, the programmer
> knows that he has to declare the things before using them - and that
> gives him a lot of possibilities. You are a smart guy but you are used:
> your document quoted in ISO 639-4 is used to support an erroneous
> proposition I do not think you really support if you analyse it.
>
> Please consider carefully what happens in real life. Your case and mine.
>
> 1. your case. You think you can assume things which have not been stated
> explicitly. I have no problem with that, but in programming (or physics,
> or mathematics) it is named a constant. This means this is not even a
> default, it means that this is something (even if you do not realise it)
> you assume as universal, created in. This is good for a few universal
> constants like the speed of light, etc. But others are actually
> common understandings, i.e. part of a culture. The more there are, the
> less all of them are assumed the same by everyone. You say I do not need
> to define "Latn". Maybe, when you are talking with Mark, Phillip and
> Michael. But if I ask a Unicode list "what is Latn" (this is what I did),
> the responses are numerous and confused. You believed you could assume
> there is only one (subjective, precise, intuitive, etc., I do not know)
> obvious response. There is none: there is a controversy. Otherwise the
> thread would be closed. The result is that you are mired trying to define
> that single meaning you assumed.
>
> 2. my case. I know that Michael did his homework and Latn is a good
> name for a script. I know that a script is a "set of graphic characters
> used for the written form of one or several languages" (even if I have
> some problems with that definition). This does not tell me what is the
> set for "Latn". So I can ask the Unicode list about Latn and French, and
> get from some the components which should be in the set, there are
> people in here quite knowledgeable. What is interesting is that I can do
> the list as a partition of the ISO 10646 global character set (actually
> I cannot, but nearly). This is interesting because I can now work on
> several defined alternatives. Where you fought to try to define a
> concept, I have different names lists, perfectly clear to others, stable
> and workable. I can give them variant numbers, discuss them, tune them
> ... and say whether or not one of them supports French. And whether the
> same/other ones support other languages. To do that I am not going to
> argue for hours about the particular case of that letter in that
> language, etc. I am going to look at the norms associated to that
> language and at the associated alphabet. If it matches one of the
> variants, that variant is correct.
>
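The variant-list comparison described in point 2 above can be sketched as follows. This is only an illustration under stated assumptions, not code from the thread: the variant names, the three candidate character sets, and the French alphabet used here are all hypothetical choices, not an agreed definition of "Latn" or an authoritative norm for French.

```python
import string

# Hypothetical candidate definitions of "Latn". Each variant is an explicit,
# named set of characters, so it can be compared mechanically.
ASCII_LETTERS = set(string.ascii_letters)
# Letters in U+00C0..U+00FF (isalpha() excludes the multiplication and division signs).
LATIN1_LETTERS = {chr(cp) for cp in range(0xC0, 0x100) if chr(cp).isalpha()}

LATN_VARIANTS = {
    "latn-1": ASCII_LETTERS,
    "latn-2": ASCII_LETTERS | LATIN1_LETTERS,
    "latn-3": ASCII_LETTERS | LATIN1_LETTERS | {"\u0152", "\u0153", "\u0178"},
}

# Assumed alphabet for French, for illustration only.
_lower = set("abcdefghijklmnopqrstuvwxyz" "àâæçéèêëîïôœùûüÿ")
FRENCH_ALPHABET = _lower | {c.upper() for c in _lower}


def supports(variant: set, alphabet: set) -> bool:
    """A variant supports a language if it contains the whole alphabet."""
    return alphabet <= variant


def missing(variant: set, alphabet: set) -> list:
    """Characters of the alphabet that the variant lacks, in code point order."""
    return sorted(alphabet - variant)


if __name__ == "__main__":
    for name, chars in LATN_VARIANTS.items():
        print(name, supports(chars, FRENCH_ALPHABET), missing(chars, FRENCH_ALPHABET))
```

With explicit sets, the question "does this variant support French" becomes a subset test, and the disagreement moves to which alphabet and which variant to declare.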
> You are going to tell me "which norms"? This is the main problem in the
> whole Davis Addison / Constable logic (quotes of the ISO 639-4 draft):
> you also assume that you also do not need to consider the norms (except
> orthography (why?)). The fact that norms may not be documented does not
> mean that they do not exist (grammar, semantics, styles, level of
> complexity, etc.). I accept this is less worked in English than in French (actually it
> is quite worked in English/American, but you do not realise it: consider
> the lingual obligations put by the DoD to its contractors, consider the
> very concept of "Basic English") - but you confused language with a norm
> set with "computer languages".
>
> (By the way I do not find the English mathematical word equivalent to
> "normé", meaning the norms of which have been declared or identified -
> not decided/invented).
>
> The very key element missing (and for the time being) killing all the
> logic of your ISO/IETF proposition is to forget about the way people
> speak - their best common relational practices. The normating rules set
> associated to languages, structuring its reference system (I abbreviate
> as "referent"). This is what permits a computer to understand, correct,
> and talk them. Then you have the "style", the way they use it, to fully
> qualify the language - with possible iterative/reciprocal influences.
>
> Now, I fully accept that you can document a language/page in using
> intuitive/subjective/assumed descriptors only, but for humans who will
> post-assume (with occasional misunderstanding most probably) what you
> meant to say - what you have yourself assumed. But you cannot for
> applications. And the risk of confusions/conflicts will probably be very
> high if those humans are from a different culture. This is why there is
> a major difference between an informative and a normative description,
> the very first point to discuss about terms and definitions/purposes.
> The very first question to raise in point ISO 639-4/4.8 and in the xx.txt
> Drafts.
>
> Even - and maybe them first - your "end users" (I understand as "out of
> the reach of SDOs") can understand that: I tend to observe it is more
> difficult for experts who are more involved in their stuff. But this is
> a discussion we already had.
>
>> > And please do not quote Unicode Character Set as a middle reference.
>> > It is not an ISO Standard, and it does not fully support French.
>>
>> Not fully support French? Funny thing, then, that no national body of a
>> country with a significant Francophone population has been bringing
>> their request to WG2, and that no member of the Unicode Consortium
>> selling products to Francophone markets has been requesting changes
>> needed to meet the requirements of those markets. I'm curious to know
>> what the lacuna might be.
>
>
> Funny thing, that you are so unaware of Microsoft products and clients.
> You did not know about non-ASCII programming environments, now that.
> Please ask your people from Word the compromise they found to support
> question marks at the end of a sentence, obliging all of us to rephrase
> if Word wants to put it at the beginning of the next line :-) Best that
> the Unicode lacks, but not perfect.
>
> The story about the horses' asses, and the comment about a different
> origin, were interesting: the negative comment did not realise that his
> more modern origin was the ponies' asses in British mines. Unicode or ISO
> or ECMA, etc. is not the origin: origin is the characters and the
> people. But the story also shows the hysteresis. We will not rebuild
> Unicode, it is a step ahead which will stay. But there will be other steps.
>
> You give the response: "members of the Unicode Consortium selling". We
> take Unicode as a good commercial effort by a private company cartel
> with due commercial motivations. Even if IAB lacks understanding about
> languages, they understand R&D funding and results. They have perfectly
> qualified Unicode (as most of the current efforts) in RFC 3869. We/I
> share that view.
>
> What is interesting, however, is that the work I do on CRCs shows us that
> we can canonically, in a language and koine/autonym independent way use
> and correct the Unicode lacks ... provided a few more common sense
> practice and concepts are included in ISO 639-4, IETF Drafts and global
> network culture. Please consider OSI if you know it. It was the
> international network second generation: it was specified in four or six
> languages and the technology was totally multilingual as being bit
> oriented. What OSI did, we should be able to make it better.
>
> Take care.
> jfc
>
>
This archive was generated by hypermail 2.1.5 : Fri May 20 2005 - 09:08:08 CDT