> > It's *much* easier -- and, in the long term, safer -- for them to
> > select from the extensive inventory of characters available in Unicode
> > and to avoid using ASCII punctuation characters with redefined
> > word-building semantics.
> I don't get what you are saying here, why should people be limited to
> ASCII punctuation characters? With GNU libc you can declare your own set
> of punctuation characters in the locale, and they can be any 10646
> character. Or are you referring to the specific locale syntax from
> POSIX/TR 14652?
I think what Peter is saying is that, although you CAN create an orthography
that uses any combination of stuff, it is a bad idea to ignore the Unicode
character properties and use whatever comes to hand (like punctuation
Yes, you can write a single C program (or Java program, or what-have-you)
that knows how to process your text. But you still face the enormous amount
of software that *doesn't* understand it. In the ideal world, you can use your
orthography in Microsoft Word (or StarOffice if you prefer) and not have the
grammar checker destroy your text automagically. Building an orthography
that recognizes this makes more sense than having to write "TengvaWord" and
"TengvaExcel" and "TengvaMail" and so on.
Programs that run on platforms that have user-defined locales can get some
of this by providing a locale to use (and switching to it), but there is
always the risk that a programmer has taken a "shortcut" somewhere and is
looking for @ or ! or whatever (for example, if I type an @ in certain
programs surrounded by ASCII text, the program will convert it to a mailto:
In short, the question isn't whether something is or isn't possible, but
rather whether something is or isn't a good idea or desirable. If you've had
an orthography since 1937 based on locally available typewriters, then you
probably won't want to change. If you have NO writing tradition, you're
better off avoiding unnecessary headaches, IMO.
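
To make the word-building point concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the word is made up, the names `words`, `ascii_spelling` and `letter_spelling` are hypothetical, and Python's `re` and `unicodedata` modules stand in for "software that only knows the Unicode character properties". The same word is spelled once with ASCII `!` pressed into service as a letter, and once with U+01C3 LATIN LETTER RETROFLEX CLICK, which looks similar but is classified as a letter.

```python
import re
import unicodedata

# Made-up word in a hypothetical orthography, spelled two ways.
ascii_spelling = "ta!ku"        # U+0021 EXCLAMATION MARK reused as a letter
letter_spelling = "ta\u01c3ku"  # U+01C3 LATIN LETTER RETROFLEX CLICK

def words(text):
    """Split text into words the way much generic software does:
    runs of characters that Unicode treats as word characters."""
    return re.findall(r"\w+", text)

# General category: "Po" is punctuation, "Lo" is a letter.
print(unicodedata.category("!"))       # Po
print(unicodedata.category("\u01c3"))  # Lo

# The punctuation spelling is cut in two; the letter spelling survives.
print(words(ascii_spelling))   # ['ta', 'ku']
print(words(letter_spelling))  # one word: the whole string
```

Nothing here depends on a locale or on special-purpose code: the second spelling works because the character already carries the right property, which is the argument for picking from the existing inventory.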