Transcribing old documents into Unicode compatible document files.

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Sat May 03 2003 - 08:12:32 EDT

Next message: Theodore H. Smith: "Implementing on UTF8: toUpper(), toFold(), normalisation, collation, etc"

Previous message: John Cowan: "Re: U+006E U+0308 spotted in the wild"
Next in thread: Doug Ewell: "Re: Transcribing old documents into Unicode compatible document files."
Reply: Doug Ewell: "Re: Transcribing old documents into Unicode compatible document files."
Reply: John Hudson: "Re: Transcribing old documents into Unicode compatible document files."
Maybe reply: William Overington: "Re: Transcribing old documents into Unicode compatible document files."
Maybe reply: William Overington: "Re: Transcribing old documents into Unicode compatible document files."
Maybe reply: John Hudson: "Re: Transcribing old documents into Unicode compatible document files."

I am currently designing and producing a font which I hope will have various
uses, including transcribing old documents into Unicode-compatible computer
document files.

The latest development version, Quest text 052, is now available on the web
page below. It is currently the most recent item on that page and appears
near the end. Compared with the previously available Quest text 048 font,
this version adds a number of characters from regular Unicode and some more
Private Use Area ligatures.

http://www.users.globalnet.co.uk/~ngo/font7001.htm

I have in mind the possibility that someone could transcribe a written or
printed document. I am wondering what happens when the transcriber finds a
character which is neither in regular Unicode nor in any Private Use Area
encoding which he or she may be using. Certainly, later, the person could
sit down, think about the newly found character and decide whether to devise
a new character or to proceed otherwise as he or she thinks fit, perhaps
after discussion with other people. However, at the time, perhaps using a
laptop computer in a library setting, the person has to find a solution
promptly.

It is certainly possible to use digit characters, but that could lead to
confusion, particularly if numbers are used in the original document. So I
am wondering whether it would be a good idea, or whether it has already been
done by anyone, to design a number of constructed glyphs which deliberately
have no established meaning yet are available as temporary characters with
which to produce a Unicode-compatible computer document file. These would
initially be encoded in the Private Use Area, but perhaps they might be
promoted to regular Unicode at a later date, as Private Use Meaning
characters. The general shape of each glyph would then be formally defined
and the character would have a formal name, yet it would have no fixed
meaning: it would be intended for temporary use to represent an unknown
character in a data-recoverable manner. For example, maybe a block of
sixteen such characters could be defined. I am thinking of the glyphs having
a general resemblance to Latin alphabet characters, yet being abstract as to
meaning.
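
As a rough illustration of what "data-recoverable" could mean in practice,
here is a minimal Python sketch. It assumes a block of sixteen placeholder
code points starting at U+E700; that starting point, the class name
PlaceholderLog and its methods are all hypothetical choices made for the
example, not an agreed encoding or an existing tool. Each unknown character
met during transcription is given the next free placeholder together with a
note, and the placeholder can be swapped for a real character once a
decision has been reached.

```python
# Hypothetical block of sixteen code points in the BMP Private Use Area.
# U+E700 is an arbitrary starting point chosen for illustration only.
PLACEHOLDER_BASE = 0xE700
PLACEHOLDER_COUNT = 16


class PlaceholderLog:
    """Hands out temporary placeholder characters and records what each stands for."""

    def __init__(self):
        # placeholder character -> the transcriber's note about the unknown sign
        self.notes = {}

    def new_placeholder(self, description):
        """Return the next unused placeholder and remember its description."""
        index = len(self.notes)
        if index >= PLACEHOLDER_COUNT:
            raise ValueError("all placeholder characters are in use")
        char = chr(PLACEHOLDER_BASE + index)
        self.notes[char] = description
        return char

    def resolve(self, text, decisions):
        """Replace each decided placeholder; leave undecided ones untouched."""
        return "".join(decisions.get(ch, ch) for ch in text)


log = PlaceholderLog()
unknown = log.new_placeholder("folio 3, line 12: d-like sign with a stroke")
draft = "qui" + unknown + "em"

# Later, after discussion, the transcriber decides what the sign should be.
# The choice of U+0111 here is invented purely for the example.
final = log.resolve(draft, {unknown: "\u0111"})
```

Because each placeholder stands for exactly one unknown sign within a given
transcription, the later substitution loses nothing, which is the property
that plain digits cannot guarantee when the document itself contains numbers.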

I have had a look at the ConScript Unicode Registry at
http://www.evertype.com/standards/csur/index.html but, as far as I can find
at present, nothing there seems to be of the required nature.

Another possibility would be to include some of the regular Unicode
geometric shapes in the font so that they could be used for the purpose as
needed. Or perhaps the circled number characters are the answer.
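
If existing characters were used in this way, the same risk of confusion
would arise as with digits whenever the original document itself contains
them. The following Python sketch, offered only as an illustration, lists
the circled numbers (U+2460 to U+2473) or geometric shapes (U+25A0 to
U+25FF) which do not yet occur in a transcription and so could serve as
placeholders. The function name free_placeholders and the sample text are
invented for the example.

```python
import unicodedata

# Candidate ranges taken from regular Unicode.
CIRCLED_NUMBERS = range(0x2460, 0x2474)   # CIRCLED DIGIT ONE .. CIRCLED NUMBER TWENTY
GEOMETRIC_SHAPES = range(0x25A0, 0x2600)  # the Geometric Shapes block


def free_placeholders(text, candidates):
    """Return the candidate characters that do not already occur in text."""
    used = set(text)
    return [chr(cp) for cp in candidates if chr(cp) not in used]


# Invented sample: the transcription already uses one circled digit and one square.
sample = "Item \u2460 cost 3 shillings \u25a0"

free_circled = free_placeholders(sample, CIRCLED_NUMBERS)
free_shapes = free_placeholders(sample, GEOMETRIC_SHAPES)

# Show the first few safe choices with their formal Unicode names.
for ch in free_circled[:3]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```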

However, I thought that I would mention the topic here in the hope of
finding out how people who encounter an unknown character while transcribing
documents into a computer system proceed at present, and how they would like
to proceed in the future.

William Overington

3 May 2003

This archive was generated by hypermail 2.1.5 : Sat May 03 2003 - 09:13:01 EDT