View topic as text - Chinese XML Forum - a professional XML technology discussion board (http://bbs.xml.org.cn/index.asp) -- 『 Web Architecture 』 (http://bbs.xml.org.cn/list.asp?boardid=66) ---- Unicode 5.0 released - supports the latest version of GB 18030 (http://bbs.xml.org.cn/dispbbs.asp?boardid=66&rootid=&id=37603) |
-- Author: admin -- Posted: 9/5/2006 2:06:00 PM -- Unicode 5.0 released - supports the latest version of GB 18030

See: http://www.unicode.org/versions/Unicode5.0.0/

Unicode 5.0.0

Unicode 5.0.0 is a [URL=http://www.unicode.org/versions/]major version[/URL] of the Unicode Standard and supersedes all previous versions.

The publication of the book, The Unicode Standard, Version 5.0, is pending and is expected in the fourth quarter of 2006. However, all of the [URL=http://www.unicode.org/Public/5.0.0/ucd/]online data files[/URL] for version 5.0 of the [URL=http://www.unicode.org/ucd/]Unicode Character Database[/URL] are stable and final. To give developers an opportunity to implement Unicode 5.0 as soon as possible, these data files have been released ahead of the publication of the text of the standard. The text of the Unicode Standard Annexes for Version 5.0 is currently in copy edit; online versions of these will also be available in the fourth quarter of 2006. The Unicode Standard Annexes will also be published in the book.

Version 5.0.0 of the Unicode Standard consists of the publication The Unicode Standard, Version 5.0 plus the Unicode Character Database, Version 5.0.0. The book gives the general principles, requirements for conformance, and guidelines for implementers, followed by character code charts and names and the text of all of the Unicode Standard Annexes. To order The Unicode Standard, Version 5.0, see the [URL=http://www.unicode.org/book/bookform.html]online order form[/URL].

A complete specification of the contributory files for Unicode 5.0.0 is found on [URL=http://www.unicode.org/versions/components-5.0.0.html]the Components page[/URL].

Version 5.0.0 of the Unicode Standard should be referenced as:

The Unicode Consortium. The Unicode Standard, Version 5.0.0, defined by: The Unicode Standard, Version 5.0 (Boston, MA, Addison-Wesley, 2007. ISBN 0-321-48091-0)

Online Edition

Final character code charts for Version 5.0 will be available online soon.
What's New in Version 5.0

For stability of protocols on the Internet and elsewhere, Unicode 5.0 also makes changes to guarantee case-folding stability. Unicode 5.0 incorporates all the changes introduced in Unicode 4.1, including full interoperability with the most recent versions of GB 18030, JIS X 0213, and HKSCS, and support for stable identifiers and pattern syntax characters. Unicode 5.0 revises and improves property values and behavioral specifications in areas such as character, word, line, and sentence segmentation, and tightens conformance requirements on Bidi implementations (used for Arabic and Hebrew). The text is significantly revised for clarity and completeness, especially for Unicode conformance.

Unicode 5.0 covers the full repertoire of ISO/IEC 10646:2003, including Amendments 1 and 2, which add characters required for some languages of India, for mathematicians, for minority languages, and for academic use.

The Unicode Standard is closely connected with other Unicode software globalization standards in such key areas as collation (used for sorting, searching, and matching), character set conversion, regular expressions, and the interchange and registration of locale data for the world's languages and local cultural conventions [[URL=http://www.unicode.org/cldr/]CLDR[/URL]]. It has been further significantly augmented by several new Unicode Technical Standards that provide recommendations and data to assist in secure implementation of Unicode, and to establish the registration mechanism for Ideographic Variation Sequences needed by the publishing industry for Chinese and Japanese.

Other major additions to Version 5.0 since Version 4.0 are discussed in the sections below.

New Characters

The new character additions were to both the BMP and the SMP (Plane 1). The following table shows the allocation of code points in Unicode 5.0.0.
For more information on the specific characters, see the file [URL=http://www.unicode.org/Public/UNIDATA/DerivedAge.txt]DerivedAge.txt[/URL] in the [URL=http://www.unicode.org/ucd/]Unicode Character Database[/URL].

Graphic 98,884

The character repertoire corresponds to ISO/IEC 10646:2003 plus Amendment 1, Amendment 2, and four Sindhi characters from Amendment 3. For more details of character counts, see Appendix D, Changes from Unicode Version 4.0.

Unicode Character Database

Scripts. Unassigned code points were given a new Script property value of "Zzzz": this may require some change in code using this property. Three Mongolian punctuation marks and two archaic letters changed script value.

Bidirectional Behavior. The list of characters with the Bidi_Mirrored property was made consistent for brackets and quotation marks, in preparation for new constraints on bidi mirroring. The Bidi_Class property for five archaic characters was changed to L.

Numeric Properties. The archaic character U+10341 GOTHIC LETTER NINETY was given the numeric value 90.

Conformance

Chapter 3, Conformance, was substantially improved by incorporating much of the Unicode Property Model, enhancing the treatment of combining characters, and further clarifying canonical ordering behavior through the addition of clearly defined principles. Additionally, conformance clauses and definitions were renumbered for overall readability and clarity of the text. Significant clarifications or modifications to character behavior include those listed below:

Stability of Cased Letters. If uppercase characters are added in cased scripts, the corresponding lowercase characters will be added as well, so that case folding is stable. |
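A few of the property values mentioned in the announcement can be spot-checked from a script. The following is only a minimal sketch, assuming Python 3 and its standard `unicodedata` module; that module ships a newer version of the Unicode Character Database than 5.0.0, so it illustrates the properties rather than reproducing the 5.0.0 data files. The helper name `same_identifier` is hypothetical and not part of the announcement.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this Python build.
# It is newer than 5.0.0, so treat everything below as a spot check only.
print("UCD version:", unicodedata.unidata_version)

# Numeric Properties: U+10341 GOTHIC LETTER NINETY carries the value 90.
gothic_ninety = "\U00010341"
print(unicodedata.name(gothic_ninety), unicodedata.numeric(gothic_ninety))

# Bidirectional Behavior: Bidi_Mirrored is set for brackets such as "(",
# and not for ordinary letters such as "a".
print(unicodedata.mirrored("("), unicodedata.mirrored("a"))


def same_identifier(a: str, b: str) -> bool:
    """Hypothetical helper: caseless comparison using full case folding."""
    return a.casefold() == b.casefold()


# Case folding: differently cased spellings compare equal after folding.
print(same_identifier("Straße", "STRASSE"))
```

Protocols that compare identifiers caselessly depend on the folded form staying the same across Unicode versions, which is the purpose of the case-folding stability guarantee. The Script property (including the "Zzzz" value) is not exposed by `unicodedata`, so it is not covered here.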
-- Author: byltd -- Posted: 1/14/2007 9:49:00 AM -- Is there a Chinese version? |