|
UTF-32 stands for Unicode Transformation Format, 32 bits; it is also known as UCS-4. It is a protocol for encoding Unicode code points that uses exactly 32 bits per code point. This makes UTF-32 a fixed-length encoding, in contrast to all other ''Unicode transformation formats'', which are variable-length encodings. The UTF-32 form of a code point is a direct representation of that code point's numerical value.〔SIL, ''Mapping code points to Unicode encoding forms'', §1: UTF-32〕

The main advantage of UTF-32 over variable-length encodings is that Unicode code points are directly indexable: examining the ''n''th code point is a constant-time operation.〔http://www.ibm.com/developerworks/xml/library/x-utf8/〕 In contrast, a variable-length code requires sequential access to find the ''n''th code point. This makes UTF-32 a simple replacement in code that uses integers to index characters out of strings, as was commonly done for ASCII.

The main disadvantage of UTF-32 is that it is space-inefficient, using four bytes per code point. Non-BMP characters are so rare in most texts that they can be ignored for sizing purposes, so UTF-32 is up to twice the size of UTF-16 and up to four times the size of UTF-8.

==History==

The original ISO 10646 standard defines a 31-bit ''encoding form'' called UCS-4, in which each encoded character in the Universal Character Set (UCS) is represented by a 32-bit ''code value'' in the ''code space'' of integers between 0 and hexadecimal 7FFFFFFF. Because only 17 planes are actually in use, all current code points lie between 0 and 0x10FFFF. UTF-32 is a subset of UCS-4 that uses only this range. Since the Principles and Procedures document of JTC1/SC2/WG2 states that all future assignments of characters will be constrained to the BMP or the first 14 supplementary planes, UTF-32 will be able to represent all Unicode characters.
Accordingly, UCS-4 and UTF-32 are now identical, except that the UTF-32 standard carries additional Unicode semantics.

Excerpt source: Wikipedia, the free encyclopedia.
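
The constant-time indexing and the size trade-off described above can be illustrated with a minimal Python sketch. The helper name `code_point_at` and the sample string are assumptions made for this illustration only; the sketch assumes a big-endian buffer without a byte order mark.

```python
import struct


def code_point_at(buf: bytes, n: int) -> int:
    """Return the n-th code point (0-based) of a UTF-32BE buffer.

    Hypothetical helper for illustration: every code point occupies
    exactly 4 bytes, so the byte offset is simply 4 * n and no scan
    over the preceding text is needed.
    """
    (value,) = struct.unpack_from(">I", buf, 4 * n)
    return value


# Sample text (an assumption for this sketch): one ASCII letter, two other
# BMP characters, and one non-BMP character (U+1F600).
text = "a\u00e9\u20ac\U0001F600"

# "utf-32-be" is big-endian and writes no byte order mark.
utf32 = text.encode("utf-32-be")

# Direct indexing: the fourth code point is read without touching the rest.
last = code_point_at(utf32, 3)

# Encoded sizes in bytes for the same text in the three encoding forms.
sizes = {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}
```

For this four-character string, UTF-32 needs 16 bytes, while UTF-8 and UTF-16 each need 10. A text made up only of ASCII characters would show the full fourfold difference against UTF-8.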
|