Re: UTF-8: Michael takes the plunge

From: Georges Martin (georges.martin@deboeck.be)
Date: Tue Apr 06 1999 - 09:13:50 EDT


>I need to make an HTML document with UTF-8 encoding. I'm on a Mac running
>Mac OS 8.5, and to my knowledge I have no tools which will allow me to just
>format UTF-8, so I will have to type in the double-byte characters by
>hand.

The following two scripts re-encode the text selected in BBEdit and put the result either in a new document or on the clipboard.

By default, they convert from Mac OS Roman to UTF-8, but if you hold down the Command key while running a script, you can change the default source and target encodings.
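
To make the default conversion concrete, here is a minimal sketch of the same re-encoding step in Python rather than AppleScript. It is not part of the scripts below. It assumes that Python's built-in "mac_roman" codec (also reachable under the alias "macintosh") corresponds to the encoding that TEC calls "macintosh"; the function name reencode is hypothetical.

```python
def reencode(data: bytes,
             from_encoding: str = "macintosh",
             to_encoding: str = "utf-8") -> bytes:
    """Decode bytes from the source encoding, then encode them in the target one."""
    text = data.decode(from_encoding)   # bytes -> Unicode text
    return text.encode(to_encoding)     # Unicode text -> bytes


if __name__ == "__main__":
    # "café" in Mac OS Roman: the letter é is the single byte 0x8E.
    mac_bytes = b"caf\x8e"
    print(reencode(mac_bytes))  # b'caf\xc3\xa9'
```

In this example the ASCII letters pass through unchanged, while the é becomes the two bytes 0xC3 0xA9. That byte-level expansion is what would otherwise have to be typed in by hand.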

You need the following Scripting Additions:

- "TEC" for the commands "TECGetEncodings" and "TECConvertText":

  <http://www.scriptweb.com/osaxen/tec.html>

- "Jon's Commands" for the commands "keys pressed" and "set the clipboard to":

  <http://www.scriptweb.com/osaxen/jons_commands.html>

- the "Choose item" addition, included with the System.

Just drop the additions onto the System Folder and put the two compiled scripts in the "BBEdit Scripts" folder.

Georges Martin

PS: For some reason I don't understand, the "native" encoding, i.e. "macintosh", is not returned by the TECGetEncodings command. Hence the "choose item ((TECGetEncodings) & "macintosh")".

+-----------------------------------------------------------------------+

-- Re-encode to new file

property fromEncoding : "macintosh"
property toEncoding : "UTF-8"
property prefsKey : "Command"

tell application "BBEdit 4.5"
        
        -- Holding down the Command key lets you choose other source and target encodings.
        if (the (keys pressed) contains prefsKey) then
                
                -- "macintosh" is appended because TECGetEncodings does not return it.
                set fromEncoding to choose item ((TECGetEncodings) & "macintosh") with prompt "Select the Source Encoding:" default item fromEncoding
                
                set toEncoding to choose item ((TECGetEncodings) & "macintosh") with prompt "Select the Target Encoding:" default item toEncoding
                
        end if
        
        try
                set theSelection to the selected text of first window
                make new document with properties {contents:(TECConvertText theSelection fromCode fromEncoding toCode toEncoding)}
                
        on error theErrorMsg
                
                beep
                display dialog theErrorMsg buttons {"OK"} default button 1
                return
                
        end try
        
end tell

+-----------------------------------------------------------------------+

-- Re-encode to clipboard

property fromEncoding : "macintosh"
property toEncoding : "UTF-8"
property prefsKey : "Command"

tell application "BBEdit 4.5"
        
        -- Holding down the Command key lets you choose other source and target encodings.
        if (the (keys pressed) contains prefsKey) then
                
                -- "macintosh" is appended because TECGetEncodings does not return it.
                set fromEncoding to choose item ((TECGetEncodings) & "macintosh") with prompt "Select the Source Encoding:" default item fromEncoding
                
                set toEncoding to choose item ((TECGetEncodings) & "macintosh") with prompt "Select the Target Encoding:" default item toEncoding
                
        end if
        
        try
                set theSelection to the selected text of first window
                set the clipboard to (TECConvertText theSelection fromCode fromEncoding toCode toEncoding)
                
        on error theErrorMsg
                
                beep
                display dialog theErrorMsg buttons {"OK"} default button 1
                return
                
        end try
        
end tell

+-----------------------------------------------------------------------+

+-----------------------------------------------------------------------+
 Georges Martin <mailto:georges.martin@deboeck.be>
+-----------------------------------------------------------------------+
 Groupe De Boeck s.a. phone +32 10 48.25.72
 Fond Jean Paques 4 fax +32 10 48.25.19
 B-1348 Louvain-la-Neuve (Belgium) icq 32724889
+-----------------------------------------------------------------------+


