Misuse of 8th bit [Was: My Querry]

From: Antoine Leca (Antoine10646@leca-marti.org)
Date: Thu Nov 25 2004 - 04:19:45 CST

    On Wednesday, November 24th, 2004 22:16Z, Asmus Freytag wrote:
    >
    > I'm not seeing a lot in this thread that adds to the store of
    > knowledge on this issue, but I see a number of statements that are
    > easily misconstrued or misapplied, including the thoroughly
    > discredited practice of storing information in the high
    > bit, when piping seven-bit data through eight-bit pathways. The
    > problem with that approach, of course, is that the assumption
    > that there were never going to be 8-bit data in these same pipes
    > proved fatally wrong.

    Since I was the person who introduced this theme into the thread, I feel
    there is an important point that should be highlighted here. The
    "thoroughly discredited practice of storing information in the high bit"
    is, like the Y2K problem, a bad consequence of past practices. The only
    difference is that we do not have a hard time limit for solving it.

    The practice itself disappeared quite a long time ago (as I wrote, I used
    it myself back in 1980, and perhaps also in 1984 in a Forth interpreter
    that overused this "feature"), and nobody in their right mind would even
    consider the idea today.
    (OK, this is too strong; one can certainly show me examples of present-day
    uses, probably more in the U.S.A. than elsewhere, just as I was able to
    encounter projects /designed/ in 1998 with years stored as 2 digits, and
    then collating dates on YYMMDD.)

    However, what is a real problem right now is the still widespread idea
    that this feature is still abundant, and that the data should therefore be
    "*corrected*": that one should use toascii() and similar mechanisms, which
    take the /supposedly corrupt/ input and turn it into "good, compliant
    7-bit US-ASCII", as some of the answers made to me pointed out.
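
    As a minimal illustration (a sketch assuming only a POSIX-style
    toascii(), which is typically implemented as (c & 0x7F)), here is what
    such a "correction" does to perfectly valid UTF-8 input:

        /* "café" in UTF-8; 0xC3 0xA9 is the two-byte sequence for é */
        #include <stdio.h>
        #include <ctype.h>

        int main(void)
        {
            unsigned char s[] = { 'c', 'a', 'f', 0xC3, 0xA9, 0 };

            for (int i = 0; s[i]; i++)              /* the "cleaning" pass */
                s[i] = (unsigned char)toascii(s[i]);

            printf("%s\n", (char *)s);              /* prints "cafC)" */
            return 0;
        }

    The two bytes of the é are not restored to anything sensible; they are
    silently replaced by the unrelated ASCII characters "C" and ")".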

    It should now be obvious that a program that *keeps* any parity
    information received on a telecommunication line and passes it unmodified
    to the next DTE is less of a problem, with respect to possible UTF-8 data,
    than the equivalent program that unconditionally *removes* the 8th bit.
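
    To make the comparison concrete, here is a minimal sketch of the two
    "equivalent programs" seen as byte pumps (the STRIP macro is purely an
    illustrative device, not anybody's actual code):

        #include <stdio.h>

        int main(void)
        {
            int c;
            while ((c = getchar()) != EOF) {
        #ifdef STRIP
                c &= 0x7F;   /* "cleans" the data: valid UTF-8 in, mojibake out */
        #endif
                putchar(c);  /* otherwise UTF-8 (or a stray parity bit) is
                                passed through untouched, for a later stage
                                to deal with */
            }
            return 0;
        }

    Piped through the pass-through variant, UTF-8 text comes out exactly as
    it went in; piped through the -DSTRIP variant, every multibyte character
    is irreversibly destroyed.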

    The crude reality is that the problem you refer to above really comes
    from these castrating practices, NOT from the now-retired programs of the
    '70s that, for economy, reused the 8th bit to store other information
    along the pipeline.
    And I note that nobody in this thread advocated USING the 8th bit.
    However, I saw remarks about possible PREVIOUS uses of it (and these
    remarks were accompanied by the relevant "I remember" and "it reminds me"
    markers, which suggest advice from experienced people toward newbies
    rather than easily misconstrued or misapplied statements).
    On the other hand, I also saw references to practices of /discarding/ the
    8th bit when one receives "USASCII" data (some of which might even be
    misconstrued to make one believe it is normative to do so); these latter
    references did not come with the same "I remember" markers, quite the
    contrary; and present practices of Internet mail will quickly show that
    such discarding is still in use.

    In other words, I believe the practice of /storing/ data in the 8th bit is
    effectively discredited. What we really need today is to ALSO discredit
    the practice of /removing/ information from the 8th bit.

    Antoine


