Re: what is Latn?

From: Erkki Kolehmainen
Date: Fri May 20 2005 - 09:07:23 CDT

    When trying to follow this thread, I fail to understand why Mr. Morfin
    keeps on insisting that a script could and should be defined as a closed
    set (other than e.g. as a snapshot of what has been defined in a given
    version of ISO 10646 _and_ Unicode). We may and do still encounter new
    Latin characters (and more so: new characters of other scripts), many of
    which will undoubtedly lead to a discussion on whether they are truly
    new - yet to be encoded - characters or glyph variations or whatever.
    Nevertheless, a rose is a rose.

    Incidentally, as there is no absolute truth, it gets to be truly
    impossible to agree on what characters (of a given script) are used for
    a given language. We honestly tried that a few years ago in a CEN
    Workshop for the languages of Europe (which we couldn't agree on, either).

    Erkki I. Kolehmainen

    JFC (Jefsey) Morfin wrote:

    > At 17:56 18/05/2005, Peter Constable wrote:
    >> > But when you have orthogonal things to relate, as you want to in
    >> > several documents, you need to have a relational system.
    >> I don't disagree; I was only objecting to the critique that ISO 15924 is
    >> faulty because it isn't relational in the sense you refer to.
    > ???
    > Here I am lost. I do not see how ISO 15924 could be faulty. It is a
    > list. There are billions of lists. They are what they are: lists. They can
    > be piles of names, numbers, poems, etc.
    > Now where there is a problem is when you want to use some of their items
    > without having given them a meaning first, which means "item=definition".
    >> > As a computer you go by binary stuff. If a computer is to relate
    >> > French and Latn, it must have binary elements it can compare using
    >> > a program.
    >> >
    >> > Now, a person with a bit of logic will do the same.
    >> >
    >> > As long as you do not tell me what is in Latn, I cannot tell you
    >> > if Latn applies to French.
    >> If you want to dumb down to a level of not assuming anything that's not
    >> stated explicitly, then you are right. I doubt there are many here who
    >> operate that way on a regular basis.
    > This is the difference between poets and programmers. We both live in the
    > same world, but the poet believes his dream is enough; the programmer
    > knows that he has to declare things before using them - and that
    > gives him a lot of possibilities. You are a smart guy but you are used:
    > your document quoted in ISO 639-4 is used to support an erroneous
    > proposition I do not think you really support if you analyse it.
    > Please consider carefully what happens in real life. Your case and mine.
    > 1. your case. You think you can assume things which have not been stated
    > explicitly. I have no problem with that, but in programming (or physics,
    > or mathematics) this is called a constant. This means it is not even a
    > default; it means that this is something (even if you do not realise it)
    > you assume to be universal, built in. This is good for a few universal
    > constants like the speed of light, etc. But others are actually
    > common understandings, i.e. part of a culture. The more of them there
    > are, the less all of them are assumed the same by everyone. You say I do
    > not need to define "Latn". Maybe not when you are talking with Mark,
    > Phillip and Michael. But if I ask a Unicode list "what is Latn" (this is
    > what I did), the responses are numerous and confused. You believed you
    > could assume there is only one response: subjective, precise, intuitive,
    > etc., I do not know, but one obvious response. There is none: there is a
    > controversy. Otherwise the thread would be closed. The result is that you
    > are bogged down trying to define that single meaning you assumed.
    > 2. my case. I know that Michael did his homework and Latn is a good
    > name for a script. I know that a script is a "set of graphic characters
    > used for the written form of one or several languages" (even if I have
    > some problems with that definition). This does not tell me what the
    > set for "Latn" is. So I can ask the Unicode list about Latn and French,
    > and get from some members the components which should be in the set;
    > there are quite knowledgeable people here. What is interesting is that I
    > can build the list as a partition of the ISO 10646 global character set
    > (actually I cannot, but nearly). This is interesting because I can now
    > work on several defined alternatives. Where you fought to try to define
    > a concept, I have different named lists, perfectly clear to others,
    > stable and workable. I can give them variant numbers, discuss them, tune
    > them ... and say whether or not one of them supports French, and whether
    > the same or other ones support other languages. To do that I am not
    > going to argue for hours about the particular case of that letter in
    > that language, etc. I am going to look at the norms associated with that
    > language and at the associated alphabet. If it matches one of the
    > variants, that variant is correct.
    > You are going to ask me "which norms?" This is the main problem in the
    > whole Davis Addison / Constable logic (quotes of the ISO 639-4 draft):
    > you also assume that you do not need to consider the norms (except
    > orthography (why?)). Norms do not cease to exist just because they may
    > not be documented (grammar, semantics, styles, level of complexity,
    > etc.). I accept this is less worked out in English than in French
    > (actually it is quite worked out in English/American, but you do not
    > realise it: consider the linguistic obligations imposed by the DoD on
    > its contractors, consider the very concept of "Basic English") - but you
    > confused language with a norm set with "computer languages".
    > (By the way, I cannot find the English mathematical equivalent of
    > "normé", meaning that the norms have been declared or identified -
    > not decided/invented).
    > The key missing element, which for the time being kills all the
    > logic of your ISO/IETF proposition, is that it forgets about the way
    > people speak - their best common relational practices: the normative
    > rule set associated with a language, structuring its reference system
    > (I abbreviate as "referent"). This is what permits a computer to
    > understand, correct, and speak languages. Then you have the "style", the
    > way people use it, to fully qualify the language - with possible
    > iterative/reciprocal influences.
    > Now, I fully accept that you can document a language/page using
    > intuitive/subjective/assumed descriptors only, but only for humans, who
    > will post-assume (most probably with occasional misunderstandings) what
    > you meant to say - what you have yourself assumed. But you cannot for
    > applications. And the risk of confusion/conflict will probably be very
    > high if those humans are from a different culture. This is why there is
    > a major difference between an informative and a normative description,
    > the very first point to discuss about terms and definitions/purposes,
    > and the very first question to raise on point ISO 639-4/4.8 and in the
    > xx.txt Drafts.
    > Even your "end users" (which I understand as "out of the reach of
    > SDOs"), and maybe they first, can understand that: I tend to observe it
    > is more difficult for experts, who are more involved in their own stuff.
    > But this is a discussion we already had.
    >> > And please do not quote Unicode Character Set as a middle reference.
    >> > It is not an ISO Standard, and it does not fully support French.
    >> Not fully support French? Funny thing, then, that no national body of a
    >> country with a significant Francophone population has been bringing
    >> their request to WG2, and that no member of the Unicode Consortium
    >> selling products to Francophone markets has been requesting changes
    >> needed to meet the requirements of those markets. I'm curious to know
    >> what the lacuna might be.
    > Funny thing, that you are so unaware of Microsoft products and clients.
    > You did not know about non-ASCII programming environments, and now this.
    > Please ask your Word people about the compromise they found to support
    > question marks at the end of a sentence, obliging all of us to rephrase
    > if Word wants to put it at the beginning of the next line :-) Best that
    > the Unicode lacks, but not perfect.
    > The story about the horses' asses, and the comment about a different
    > origin, were interesting: the negative commenter did not realise that
    > his more modern origin was the ponies' asses in British mines. Unicode or ISO
    > or ECMA, etc. is not the origin: origin is the characters and the
    > people. But the story also shows the hysteresis. We will not rebuild
    > Unicode, it is a step ahead which will stay. But there will be other steps.
    > You give the response: "members of the Unicode Consortium selling". We
    > take Unicode as a good commercial effort by a private company cartel
    > with due commercial motivations. Even if IAB lacks understanding about
    > languages, they understand R&D funding and results. They have perfectly
    > qualified Unicode (as most of the current efforts) in RFC 3869. We/I
    > share that view.
    > What is interesting, however, is that the work I do on CRCs shows us how
    > we can canonically, in a language- and koine/autonym-independent way,
    > use Unicode and correct its gaps ... provided a few more common-sense
    > practices and concepts are included in ISO 639-4, the IETF Drafts and
    > global network culture. Please consider OSI if you know it. It was the
    > second generation of international networking: it was specified in four
    > or six languages, and the technology was totally multilingual because it
    > was bit oriented. What OSI did, we should be able to do better.
    > Take care.
    > jfc

    This archive was generated by hypermail 2.1.5 : Fri May 20 2005 - 09:08:08 CDT