RE: what is Latn?

From: JFC (Jefsey) Morfin
Date: Wed May 18 2005 - 20:03:10 CDT


    At 17:56 18/05/2005, Peter Constable wrote:
    > > But when you have orthogonal things to relate, as you want to in
    > > several documents, you need to have a relational system.
    >I don't disagree; I was only objecting to the critique that ISO 15924 is
    >faulty because it isn't relational in the sense you refer to.

    Here I am lost. I do not see how ISO 15924 could be faulty. It is a list.
    There are billions of lists. They are what they are: lists. They can be
    piles of names, numbers, poems, etc.

    The problem arises when you want to use some of their items without having
    first given them a meaning, that is, an "item=definition" pair.

    > > As a computer you go by binary stuff. If a computer is to relate
    > > French and Latn, it must have binary element it can compare using a
    > > program.
    > >
    > > Now, a person with a bit of logic will do the same.
    > >
    > > As long as you do not tell me what is in Latn, I cannot tell you if it
    > > applies to French.
    >If you want to dumb down to a level of not assuming anything that's not
    >stated explicitly, then you are right. I doubt there are many here who
    >operate that way on a regular basis.

    This is the difference between poets and programmers. We both live in the
    same world, but the poet believes his dream is enough, while the programmer
    knows that he has to declare things before using them - and that gives him
    a lot of possibilities. You are a smart guy but you are being used: your
    document quoted in ISO 639-4 is used to support an erroneous proposition
    that I do not think you would really support if you analysed it.

    Please consider carefully what happens in real life. Your case and mine.

    1. your case. You think you can assume things which have not been stated
    explicitly. I have no problem with that, but in programming (or physics, or
    mathematics) that is called a constant. This means it is not even a
    default; it is something you assume (even if you do not realise it) to be
    universal, built in. This is fine for a few universal constants like the
    speed of light, etc. But others are actually common understandings, i.e.
    part of a culture. The more of them there are, the less likely it is that
    everyone assumes all of them the same way. You say I do not need to define
    "Latn". Maybe not, when you are talking with Mark, Phillip and Michael. But
    if I ask a Unicode list "what is Latn" (this is what I did), the responses
    are numerous and confused. You believed you could assume there is only one
    (subjective, precise, intuitive, etc., I do not know) obvious response.
    There is none: there is a controversy. Otherwise the thread would be
    closed. The result is that you are bogged down trying to define that single
    meaning you assumed.

    2. my case. I know that Michael did his homework and that Latn is a good
    name for a script. I know that a script is a "set of graphic characters
    used for the written form of one or several languages" (even if I have some
    problems with that definition). This does not tell me what the set for
    "Latn" is. So I can ask the Unicode list about Latn and French, and get
    from some people the components which should be in the set; there are
    people here who are quite knowledgeable. What is interesting is that I can
    build the list as a partition of the ISO 10646 global character set
    (actually I cannot, but nearly). This is interesting because I can now work
    on several defined alternatives. Where you fought to define a concept, I
    have different named lists, perfectly clear to others, stable and workable.
    I can give them variant numbers, discuss them, tune them ... and say
    whether or not one of them supports French, and whether the same or other
    ones support other languages. To do that I am not going to argue for hours
    about the particular case of that letter in that language, etc. I am going
    to look at the norms associated with that language and at the associated
    alphabet. If it matches one of the variants, that variant is correct.
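
    The approach in point 2 can be sketched in a few lines of Python. This is
    only an illustration under loud assumptions: the variant names ("Latn-1",
    "Latn-2", "Latn-3"), their code point ranges, and the lowercase French
    alphabet used here are all hypothetical choices made for the example, not
    definitions taken from ISO 15924, ISO 10646 or any French norm.

```python
import unicodedata

# Hypothetical variant lists for "Latn", each declared as inclusive code point
# ranges. These are assumptions for illustration, not standard definitions.
VARIANT_RANGES = {
    "Latn-1": [(0x0041, 0x005A), (0x0061, 0x007A)],
    "Latn-2": [(0x0041, 0x005A), (0x0061, 0x007A), (0x00C0, 0x00FF)],
    "Latn-3": [(0x0041, 0x005A), (0x0061, 0x007A), (0x00C0, 0x00FF),
               (0x0100, 0x017F)],
}

# Assumed lowercase alphabet for French, normalized to precomposed form (NFC)
# so that each accented letter is a single code point.
FRENCH_ALPHABET = set(
    unicodedata.normalize("NFC", "abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ")
)


def variant_set(name):
    """Expand a declared variant into the explicit set of letters it contains."""
    return {
        chr(cp)
        for low, high in VARIANT_RANGES[name]
        for cp in range(low, high + 1)
        if chr(cp).isalpha()  # skip signs such as the multiplication sign
    }


def missing(name, alphabet):
    """Return the letters of the alphabet that the variant does not contain."""
    return alphabet - variant_set(name)


def supporting_variants(alphabet):
    """List the variants whose declared set covers the whole alphabet."""
    return [name for name in VARIANT_RANGES if not missing(name, alphabet)]


if __name__ == "__main__":
    for name in VARIANT_RANGES:
        gap = missing(name, FRENCH_ALPHABET)
        status = "supports French" if not gap else "lacks " + "".join(sorted(gap))
        print(name, status)
```

    With these assumed lists, the question "does this variant support French?"
    becomes a plain subset test instead of a debate: "Latn-2" falls short only
    on "œ", and "Latn-3" is the first variant that covers the whole assumed
    alphabet.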

    You are going to ask me "which norms?" This is the main problem in the
    whole Davis Addison / Constable logic (quotes of the ISO 639-4 draft): you
    also assume that you do not need to consider the norms (except orthography
    (why?)). It is not because norms may not be documented that they do not
    exist (grammar, semantics, styles, level of complexity, etc.). I accept
    that this is less worked out in English than in French (actually it is
    quite worked out in English/American, but you do not realise it: consider
    the language obligations imposed by the DoD on its contractors, consider
    the very concept of "Basic English") - but you confuse a language that has
    a norm set with "computer languages".

    (By the way, I cannot find the English mathematical term equivalent to
    "normé", meaning the norms of which have been declared or identified - not

    The key missing element, which (for the time being) kills all the logic of
    your ISO/IETF proposition, is that it forgets about the way people speak -
    their best common relational practices: the normative rule set associated
    with a language, structuring its reference system (which I abbreviate as
    "referent"). This is what permits a computer to understand, correct, and
    speak languages. Then you have the "style", the way people use the
    language, to fully qualify it - with possible iterative/reciprocal
    influences.

    Now, I fully accept that you can document a language/page using
    intuitive/subjective/assumed descriptors only, but only for humans, who
    will post-assume (most probably with occasional misunderstandings) what you
    meant to say - what you have yourself assumed. You cannot do so for
    applications. (And the risk of confusion/conflict will probably be very
    high if those humans are from a different culture.) This is why there is a
    major difference between an informative and a normative description, the
    very first point to discuss about terms and definitions/purposes, and the
    very first question to raise on point ISO 639-4/4.8 and in the xx.txt
    Drafts.

    Even your "end users" (which I understand as "out of the reach of SDOs") -
    and maybe them first - can understand that: I tend to observe that it is
    more difficult for experts, who are more involved in their own stuff. But
    this is a discussion we have already had.

    > > And please do not quote Unicode Character Set as a middle reference.
    > > It is not an ISO Standard, and it does not fully support French.
    >Not fully support French? Funny thing, then, that no national body of a
    >country with a significant Francophone population has been bringing
    >their request to WG2, and that no member of the Unicode Consortium
    >selling products to Francophone markets has been requesting changes
    >needed to meet the requirements of those markets. I'm curious to know
    >what the lacuna might be.

    Funny thing that you are so unaware of Microsoft products and clients. You
    did not know about non-ASCII programming environments, and now this. Please
    ask your Word people about the compromise they found to support question
    marks at the end of a sentence, which obliges all of us to rephrase when
    Word wants to put the question mark at the beginning of the next line :-)
    Better than the Unicode shortcomings, but not perfect.

    The story about the horses' asses, and the comment about a different
    origin, were interesting: the negative comment did not realise that his
    more modern origin was the ponies' asses in British mines. Unicode or ISO
    or ECMA, etc. is not the origin: the origin is the characters and the
    people. But the story also shows the hysteresis. We will not rebuild
    Unicode; it is a step forward which will stay. But there will be other
    steps.

    You give the response: "members of the Unicode Consortium selling". We take
    Unicode as a good commercial effort by a private company cartel with due
    commercial motivations. Even if IAB lacks understanding about languages,
    they understand R&D funding and results. They have perfectly qualified
    Unicode (as most of the current efforts) in RFC 3869. We/I share that view.

    What is interesting, however, is that the work I do on CRCs shows us how we
    can canonically, in a language and koine/autonym independent way, use
    Unicode and correct its shortcomings ... provided a few more common sense
    practices and concepts are included in ISO 639-4, the IETF Drafts and
    global network culture. Please consider OSI, if you know it. It was the
    second generation of international networking: it was specified in four or
    six languages and the technology was totally multilingual, being bit
    oriented. What OSI did, we should be able to do better.

    Take care.
