Using XML with Non-Latin Characters
Can XML use non-Latin characters?
✍: FYICENTER.com
Yes. The XML Specification explicitly says XML uses ISO 10646, the international standard character repertoire which covers most known languages. Unicode is an identical repertoire, and the two standards track each other. The spec says (2.2): All XML processors must accept the UTF-8 and UTF-16 encodings of ISO 10646. There is a Unicode FAQ at http://www.unicode.org/faq/FAQ.
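As a rough illustration of the requirement quoted above, the following sketch feeds the same non-Latin document to a parser once as UTF-8 and once as UTF-16. It assumes Python's standard xml.etree.ElementTree module as the XML processor; the `<greeting>` element and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content: two Chinese characters (U+4F60 U+597D).
text = "\u4f60\u597d"

# Same document, differing only in the declared encoding.
template = '<?xml version="1.0" encoding="{enc}"?><greeting>' + text + "</greeting>"

# Serialize to bytes in each encoding. Python's "utf-16" codec
# prepends a byte order mark, which the parser uses for detection.
utf8_doc = template.format(enc="UTF-8").encode("utf-8")
utf16_doc = template.format(enc="UTF-16").encode("utf-16")

# Both byte streams should parse to the same character data.
from_utf8 = ET.fromstring(utf8_doc).text
from_utf16 = ET.fromstring(utf16_doc).text
```

The two byte strings differ in length and content, but the parsed text is the same in both cases, because the encoding only affects how the characters are stored.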
UTF-8 is an encoding of Unicode into 8-bit units: the first 128 code points are encoded exactly as in ASCII, and every other Unicode character is encoded as a sequence of 2 to 4 bytes (the original definition allowed sequences of up to 6 bytes, but RFC 3629 limits UTF-8 to 4). UTF-8 in its single-octet form is therefore the same as ISO 646 IRV (ASCII), so you can continue to use ASCII for English or other languages using the Latin alphabet without diacritics. Note that UTF-8 is incompatible with ISO 8859-1 (ISO Latin-1) above code point 127 decimal (the end of ASCII).
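The byte-level behaviour of UTF-8 can be checked with a short sketch. It uses only Python's built-in str.encode; the sample characters are arbitrary choices for illustration.

```python
# Sample characters from different ranges (arbitrary choices):
# "A" (U+0041), e-acute (U+00E9), a CJK ideograph (U+4E2D),
# and an emoji outside the Basic Multilingual Plane (U+1F600).
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

# Number of bytes each character occupies in UTF-8.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]

# Pure ASCII text produces identical bytes in ASCII and UTF-8.
ascii_bytes = "XML".encode("ascii")
utf8_bytes = "XML".encode("utf-8")

# Above code point 127 the two encodings diverge:
# ISO 8859-1 stores e-acute as one byte, UTF-8 as two.
latin1_e = "\u00e9".encode("latin-1")
utf8_e = "\u00e9".encode("utf-8")
```

A file saved as ISO 8859-1 and read as UTF-8 (or the reverse) will therefore show garbled or invalid characters wherever it contains accented letters, while its plain ASCII portion reads correctly either way.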
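UTF-16 is an encoding of Unicode into 16-bit units. Characters up to U+FFFF take one unit (two bytes), and characters above U+FFFF take a pair of units (four bytes), which lets UTF-16 represent all 17 Unicode planes: the Basic Multilingual Plane plus 16 supplementary planes. UTF-16 is incompatible with ASCII because even ASCII characters occupy two 8-bit bytes.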
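A minimal sketch of the same kind for UTF-16, again using only Python's built-in codecs. The helper name `utf16_code_units` is hypothetical, and the big-endian variant "utf-16-be" is chosen only so that no byte order mark is added to the output.

```python
def utf16_code_units(ch: str) -> int:
    """Return how many 16-bit units a character needs in UTF-16."""
    # Each code unit is two bytes; "utf-16-be" adds no byte order mark.
    return len(ch.encode("utf-16-be")) // 2

# A character at or below U+FFFF fits in a single 16-bit unit.
bmp_units = utf16_code_units("\u4e2d")

# A character above U+FFFF needs two units (four bytes).
supplementary_units = utf16_code_units("\U0001F600")

# Even a plain ASCII letter becomes two bytes, so the result
# is not byte-compatible with ASCII.
ascii_as_utf16 = "A".encode("utf-16-be")
ascii_as_ascii = "A".encode("ascii")
```

The extra zero byte in front of every ASCII letter is why a UTF-16 file cannot be read by tools that expect ASCII.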
2007-04-11