|
|
Page 1 of 1
|
[ 5 posts ] |
|
| Author |
Message |
|
ravi99249
|
Post subject: New/extra character gets introduced on opening file in Unix Posted: Mon Jun 18, 2012 2:14 pm |
|
Joined: Mon Jun 18, 2012 6:47 am Posts: 2
|
|
I created a text file in Windows using Notepad. It had only one character (¬). The ASCII code of this character is 172. I saved the file in UNICODE encoding. When I copied this file to a UNIX system and opened it in the vi editor, it showed 2 characters in the file (Â¬). The ASCII code of the newly introduced character is 194.
I do not want the new character to appear in Unix. Am I doing something wrong here, or am I supposed to do things differently?
Please advise.
Thanks
Ravi
|
|
|
asmus
|
Post subject: Re: New/extra character gets introduced on opening file in U Posted: Mon Jun 18, 2012 5:52 pm |
|
 |
| Unicode Guru |
Joined: Tue Dec 01, 2009 2:49 pm Posts: 172
|
|
What you are seeing is the UTF-8 form of Unicode, where this character has two bytes.
If your vi is set to work in Latin-1 instead of UTF-8, you would mistakenly see two "characters".
If you need to create files to work with that editor, you need to save them in the Windows ANSI encoding (code page 1252); alternatively, you need to change your editor to accept UTF-8. I cannot tell you how to do that, because I don't own a Unix system.
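The two-byte effect described above can be reproduced in a few lines of Python. This is only an illustrative sketch: the variable names are made up for the example, and it assumes the editor is decoding the bytes as Latin-1, as suggested in this post.

```python
# The character from the original post: NOT SIGN, code point U+00AC (decimal 172).
ch = "\u00ac"

utf8_bytes = ch.encode("utf-8")      # UTF-8 needs two bytes for this character
cp1252_bytes = ch.encode("cp1252")   # Windows code page 1252 needs one byte

# Assumption: the editor decodes the UTF-8 bytes as Latin-1.
# Each byte then becomes its own character.
misread = utf8_bytes.decode("latin-1")

print(list(utf8_bytes))              # [194, 172]
print(list(cp1252_bytes))            # [172]
print([ord(c) for c in misread])     # [194, 172]
```

The extra value 194 is the character Â (U+00C2), which matches the "new" character reported in the first post.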
|
|
|
vanisaac
|
Post subject: Re: New/extra character gets introduced on opening file in U Posted: Mon Jun 18, 2012 6:50 pm |
|
Joined: Mon Feb 01, 2010 6:18 pm Posts: 76
|
So basically, vi is interpreting the incoming file as ANSI, rather than UTF-8, so it runs an ANSI to UTF-8 conversion on an already UTF-8 stream. As such, the single character ¬, which is encoded in UTF-8 as 0xC2 0xAC, is instead taken as the two characters U+00C2 and U+00AC. Unix and Notepad are known not to play well together (http://blogs.msdn.com/b/michkap/archive ... 57028.aspx). Try the freeware Notepad++ (http://notepad-plus-plus.org/), which allows you to explicitly specify your encoding form, Byte-Order-Mark, and End-of-Line preferences. While changing your vi settings can help with this particular problem, editing plain text in Notepad for Unix consumption is generally to be avoided on spec.
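To see which of those choices a given file was actually saved with, its raw bytes can be inspected. The Python sketch below is a rough aid, not a full encoding detector: `describe_text_bytes` is a hypothetical helper name, it checks only the three byte-order marks listed, and the line-ending counts are meaningful only for single-byte-compatible encodings such as UTF-8 or ANSI.

```python
import codecs


def describe_text_bytes(data: bytes) -> dict:
    """Report the byte-order mark and line-ending style found in raw file bytes."""
    # Only these three marks are checked; anything else is reported as None.
    boms = [
        ("UTF-8", codecs.BOM_UTF8),
        ("UTF-16 LE", codecs.BOM_UTF16_LE),
        ("UTF-16 BE", codecs.BOM_UTF16_BE),
    ]
    bom = next((name for name, mark in boms if data.startswith(mark)), None)

    # Windows-style line endings are CR LF; Unix-style are a bare LF.
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf
    return {"bom": bom, "crlf_lines": crlf, "lf_only_lines": lf_only}


# Example: a UTF-8 file with a byte-order mark and Windows line endings.
sample = codecs.BOM_UTF8 + "a\u00acb\r\n".encode("utf-8")
print(describe_text_bytes(sample))
# {'bom': 'UTF-8', 'crlf_lines': 1, 'lf_only_lines': 0}
```

For a real file, the bytes would come from `open(path, "rb").read()`, where `path` is whatever file is being checked.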
|
|
|
ravi99249
|
Post subject: Re: New/extra character gets introduced on opening file in U Posted: Tue Jun 19, 2012 8:05 am |
|
Joined: Mon Jun 18, 2012 6:47 am Posts: 2
|
|
Thank you for your inputs. I do not have control over the editor the team uses.
The character '¬' is used as a separator character in a SQL script file. But the moment someone edits this file in an editor on Unix and saves it back, the additional character gets introduced.
Specifically, the problem occurs in vi and SQL*Plus on Unix.
Is there something I can instruct the Unix users that prevents the additional character from getting into the file?
|
|
|
vanisaac
|
Post subject: Re: New/extra character gets introduced on opening file in U Posted: Tue Jun 19, 2012 12:23 pm |
|
Joined: Mon Feb 01, 2010 6:18 pm Posts: 76
|
|
You need to tell them to alter the settings of vi (like Asmus, I don't use it, so I don't know how to do this) so that it reads the input files as Unicode, not as ANSI. It is running a conversion on the files that it shouldn't be doing. The only other solution is to have a sanitizing app that users can run that replaces all occurrences of Â¬ with ¬, and removes any occurrences of Ã‚Â, which will happen if a file gets edited by vi, sent back to Notepad, edited there, and sent back to vi, as each Â and ¬ will get expanded to Ã‚ and Â¬.
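A sanitizing pass of this kind might look roughly like the Python sketch below. It is a sketch under stated assumptions, not a tested tool: the names `SEPARATOR`, `expanded_forms` and `sanitize` are invented for the example, the wrong decoding is assumed to be Windows code page 1252 ("ANSI"), and only up to three rounds of expansion are undone. It also assumes the expanded sequences never occur legitimately in the script.

```python
SEPARATOR = "\u00ac"  # the NOT SIGN used as the separator in the SQL script


def expanded_forms(ch: str, rounds: int) -> list:
    """Return what `ch` turns into after 1..rounds wrong round trips.

    Assumption: each round trip writes UTF-8 and reads it back as cp1252.
    """
    forms = []
    current = ch
    for _ in range(rounds):
        current = current.encode("utf-8").decode("cp1252")
        forms.append(current)
    return forms


def sanitize(text: str, rounds: int = 3) -> str:
    """Replace every expanded form of the separator with the separator itself."""
    # Longest (most-expanded) form first, because each shorter form
    # is a suffix of the longer ones.
    for form in reversed(expanded_forms(SEPARATOR, rounds)):
        text = text.replace(form, SEPARATOR)
    return text


once = "SELECT 1 \u00c2\u00ac SELECT 2"               # expanded one time
twice = "SELECT 1 \u00c3\u201a\u00c2\u00ac SELECT 2"  # expanded two times
print(sanitize(once) == "SELECT 1 \u00ac SELECT 2")   # True
print(sanitize(twice) == "SELECT 1 \u00ac SELECT 2")  # True
```

Text that already contains only the plain separator passes through unchanged, so the pass should be safe to run repeatedly on the same file.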
|
|