Disclaimer: All information collected here has been verified only for the most recent supported operating systems (currently Scientific Linux 5 and Windows XP) and for the most recent versions of centrally installed software (usually RPMs under Linux and Netinstall packages under XP).
Why UTF8
The reasons are best explained in the many articles available on the Internet, such as the UTF8 and Unicode FAQ.
Using UTF8
See chapter 4.3, "Language support, UTF8", in SL5_User_Information for information on the current setup and how to change it. Basically, to get full UTF-8 support you should create a file ~/.i18n containing the line LANG=en_US.UTF-8. After logging out and in again (or in a new window), UTF-8 support will be active.
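The step described above can be sketched as a small shell function. This is only an illustration: the helper name write_i18n is hypothetical, and the refusal to overwrite an existing file is an added precaution, not part of the documented setup.

```shell
# Sketch only: write the locale line from the text above to a per-user file.
# "write_i18n" is a hypothetical helper name, not an existing tool.
write_i18n() {
    target="${1:-$HOME/.i18n}"   # default path as given in the text
    # Refuse to overwrite an existing file so current settings are not lost.
    if [ -e "$target" ]; then
        echo "$target already exists, not touching it" >&2
        return 1
    fi
    printf 'LANG=en_US.UTF-8\n' > "$target"
}

# Usage:
#   write_i18n        (creates ~/.i18n; log out and in again afterwards)
```

If ~/.i18n already exists, edit it by hand instead so that other settings in the file are preserved.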
UTF8 support in programs
To test the UTF-8 awareness of different programs, you can download an excellent UTF-8 demo file.
Terminal programs
Test the programs by issuing the command
cat UTF-8-demo.txt
in the terminal.
xterm
- you have to use a UTF-8 locale such as en_US.UTF-8 (see above)
- you have to select an ISO10646-1 font for the xterm
Both can be achieved with the command
LANG=en_US.UTF-8 xterm -fn '-Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO10646-1' &
konsole and gnome-terminal
- if you use a UTF-8 locale such as en_US.UTF-8 (see above), it works out of the box; otherwise you have to change the encoding to UTF-8 in the settings menu (konsole) or the terminal menu (gnome-terminal)
Surprisingly, the most complete display of the test file was obtained using xterm with the given font; gnome-terminal was only slightly worse, and konsole produced output with visible deficiencies.
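Before comparing terminals, it can help to confirm that the shell inside the terminal really runs in a UTF-8 locale. The following is a minimal sketch; the helper name in_utf8_locale is hypothetical, and it relies only on the standard locale command.

```shell
# Sketch: report whether the current shell runs in a UTF-8 locale.
# "in_utf8_locale" is a hypothetical helper name.
in_utf8_locale() {
    # "locale charmap" prints the character map of the active locale.
    [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}

# Usage:
#   in_utf8_locale && cat UTF-8-demo.txt
```

A failed check points to the locale setup rather than to the terminal program or its font.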
Printing
Our printing system, CUPS, can handle UTF-8. Unfortunately, raw text cannot be printed directly: it either has to pass through a CUPS filter named texttops, which is not UTF-8 aware, or be converted by the user to a PDF or PS document. The best choice is to put the text into ooffice and then use its print dialog. Please make sure the printer properties are correct (A4 page format). Due to a bug in ooffice, do not attempt to print twice from the same ooffice window. Other text-to-PS converters such as a2ps and mp (used in pine only) are not UTF-8 ready.
Mail readers
- test the mail readers by sending the UTF-8-demo.txt file as an attachment to yourself
pine
- very incomplete UTF-8 support, not suitable for proper display of mails with varying encodings, not recommended
alpine
- you have to use a UTF-8-capable terminal and a UTF-8 locale (see above)
- as you can verify with the test file, UTF-8 support is fairly good but still far from being complete
thunderbird
- versions 1.5.0.12 and above should work out of the box
- If you are not able to view UTF-8 encoded mails and, e.g., latin1 encoded mails properly, inspect two thunderbird options: the "Apply the default character encoding to all incoming messages" option must be unchecked in both the settings dialog and the folder-specific dialogs.
Other software
file
The file command indicates whether a given file is a text file containing Unicode characters:
> file myfile
myfile: UTF-8 Unicode text
A Perl implementation of the algorithm used in file can be found in /products/scripts/isutf
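Where only a yes/no answer is needed in a script, a stricter validity check can be sketched with iconv. This is not the algorithm used by file or by the isutf script; it is a substitute technique that simply asks iconv to decode the whole file as UTF-8. The helper name is_utf8 is hypothetical.

```shell
# Sketch: succeed if the file decodes cleanly as UTF-8.
# "is_utf8" is a hypothetical helper; it is not the isutf script mentioned above.
is_utf8() {
    # iconv exits with a non-zero status when it meets an invalid byte sequence.
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Usage:
#   is_utf8 myfile && echo "myfile is valid UTF-8"
```

Note that a pure ASCII file also passes this check, since ASCII is a subset of UTF-8, whereas file reports such a file as ASCII text.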
less
- to display UTF-8 text, the environment variable LESSCHARSET must be set to utf-8 and you have to use a UTF-8-capable terminal (see above)
- a test with less UTF-8-demo.txt shows the same results as the terminal tests above
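The variable mentioned above can be set as follows for a Bourne-style shell. This is a sketch for the current session only; where to make the setting permanent (for example in a shell startup file) is left open here.

```shell
# Set the character set for less in the current shell session.
LESSCHARSET=utf-8
export LESSCHARSET

# Then view the demo file in a UTF-8-capable terminal:
#   less UTF-8-demo.txt
```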
Unsolved problems
- printing UTF8 from (al)pine