Table E-1 lists the suggested charset(s) for a number of languages. Servlets that generate multilingual output use a charset to determine which character encoding the servlet's PrintWriter uses. By default, the PrintWriter uses the ISO-8859-1 (Latin-1) charset, which is appropriate for most Western European languages. To specify a different charset, pass the charset value to the setContentType() method before the servlet retrieves its PrintWriter. For example:
res.setContentType("text/html; charset=Shift_JIS");  // A Japanese charset
PrintWriter out = res.getWriter();  // Writes Shift_JIS Japanese
Note that not all web browsers support all charsets or have the fonts available to display all characters, although at a minimum all clients support ISO-8859-1. Also, the UTF-8 charset can represent all Unicode characters and can be considered a viable alternative for any language.
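The choice between a language-specific charset and UTF-8 can be made programmatically. The following is a minimal sketch, not part of the servlet API: the class `CharsetChooser` and its methods are hypothetical names, and it assumes a JDK that includes the `java.nio.charset` package (which postdates the JDK versions discussed in this appendix). It checks whether a preferred charset can encode the text to be sent and otherwise falls back to UTF-8.

```java
import java.nio.charset.Charset;

// Hypothetical helper: picks a charset for a response body.
class CharsetChooser {

    // Returns the preferred charset name if it can encode every character
    // of the text; otherwise returns "UTF-8".
    public static String chooseCharset(String text, String preferred) {
        Charset cs = Charset.forName(preferred);
        if (cs.canEncode() && cs.newEncoder().canEncode(text)) {
            return preferred;
        }
        return "UTF-8";
    }

    // Builds a value suitable for passing to setContentType().
    public static String contentType(String text, String preferred) {
        return "text/html; charset=" + chooseCharset(text, preferred);
    }

    public static void main(String[] args) {
        // French text with c-cedilla: representable in Latin-1.
        System.out.println(contentType("Bonjour, \u00e7a va?", "ISO-8859-1"));
        // Japanese hiragana: not representable in Latin-1, so UTF-8 is chosen.
        System.out.println(contentType("\u3053\u3093\u306b\u3061\u306f", "ISO-8859-1"));
    }
}
```

In a servlet, the returned string would be passed to res.setContentType() before the call to res.getWriter(), as in the example above.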
| Language | Language Code | Suggested Charsets |
|---|---|---|
| Albanian | sq | ISO-8859-2 |
| Arabic | ar | ISO-8859-6 |
| Bulgarian | bg | ISO-8859-5 |
| Byelorussian | be | ISO-8859-5 |
| Catalan (Spanish) | ca | ISO-8859-1 |
| Chinese (Simplified/Mainland) | zh | GB2312 |
| Chinese (Traditional/Taiwan) | zh (country TW) | Big5 |
| Croatian | hr | ISO-8859-2 |
| Czech | cs | ISO-8859-2 |
| Danish | da | ISO-8859-1 |
| Dutch | nl | ISO-8859-1 |
| English | en | ISO-8859-1 |
| Estonian | et | ISO-8859-1 |
| Finnish | fi | ISO-8859-1 |
| French | fr | ISO-8859-1 |
| German | de | ISO-8859-1 |
| Greek | el | ISO-8859-7 |
| Hebrew | he (formerly iw) | ISO-8859-8 |
| Hungarian | hu | ISO-8859-2 |
| Icelandic | is | ISO-8859-1 |
| Italian | it | ISO-8859-1 |
| Japanese | ja | Shift_JIS, ISO-2022-JP, EUC-JP[1] |
| Korean | ko | EUC-KR[2] |
| Latvian, Lettish | lv | ISO-8859-2 |
| Lithuanian | lt | ISO-8859-2 |
| Macedonian | mk | ISO-8859-5 |
| Norwegian | no | ISO-8859-1 |
| Polish | pl | ISO-8859-2 |
| Portuguese | pt | ISO-8859-1 |
| Romanian | ro | ISO-8859-2 |
| Russian | ru | ISO-8859-5, KOI8-R |
| Serbian | sr | ISO-8859-5, KOI8-R |
| Serbo-Croatian | sh | ISO-8859-5, ISO-8859-2, KOI8-R |
| Slovak | sk | ISO-8859-2 |
| Slovenian | sl | ISO-8859-2 |
| Spanish | es | ISO-8859-1 |
| Swedish | sv | ISO-8859-1 |
| Turkish | tr | ISO-8859-9 |
| Ukrainian | uk | |
[1] First supported in JDK 1.1.6. Earlier versions of the JDK know the EUC-JP character set by the name EUCJIS, so for portability you can set the character set to EUC-JP and manually construct an EUCJIS PrintWriter.
[2] First supported in JDK 1.1.6. Earlier versions of the JDK know the EUC-KR character set by the name KSC_5601, so for portability you can set the character set to EUC-KR and manually construct a KSC_5601 PrintWriter.
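
The portability workaround described in the footnotes can be sketched as follows. This is an illustrative sketch only: the class `FallbackWriter` and its `open()` method are hypothetical names, and a `ByteArrayOutputStream` stands in for the stream a servlet would obtain from res.getOutputStream(). The helper tries each encoding name in order and builds a PrintWriter with the first one the JVM recognizes, so the standard name can be tried before the older JDK-specific name.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
import java.util.Arrays;

// Hypothetical helper: builds a PrintWriter using the first supported encoding name.
class FallbackWriter {

    // Tries the given encoding names in order. Throws IllegalArgumentException
    // if the JVM supports none of them.
    public static PrintWriter open(OutputStream out, String... names) {
        for (String name : names) {
            try {
                // autoflush=true flushes on println(); print() still needs flush()
                return new PrintWriter(new OutputStreamWriter(out, name), true);
            } catch (UnsupportedEncodingException e) {
                // This JVM does not know the name; try the next candidate.
            }
        }
        throw new IllegalArgumentException(
            "No supported encoding among: " + Arrays.toString(names));
    }

    public static void main(String[] args) {
        // Stand-in for res.getOutputStream() in a real servlet.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();

        // Standard name first, then the name used by earlier JDKs.
        PrintWriter out = open(buf, "EUC-JP", "EUCJIS");
        out.print("\u3042"); // HIRAGANA LETTER A
        out.flush();
        System.out.println(buf.size() + " bytes written");
    }
}
```

In a servlet, the content type would still be set with the standard name (for example, "text/html; charset=EUC-JP") so that the browser sees the name it expects, regardless of which name the JVM accepted.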

Copyright © 2001 O'Reilly & Associates. All rights reserved.