Unicode 5.0 was released in 2006. The previous version was Unicode 4.1 and the current version is Unicode 5.1.
All the gory details can be found at http://www.unicode.org/versions/Unicode5.0.0/
Unicode 5.0 adds 1369 new characters, most of which are in the 9 new code blocks: NKo, Balinese, Latin Extended-C, Latin Extended-D, Phags-pa, Phoenician, Cuneiform, Cuneiform Numbers and Punctuation, and Counting Rod Numerals.
The changes from 4.1 include the following.
New Code Blocks
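9 new code blocks were added in 5.0. Each line below gives the code point range, the block name, and the number of characters assigned out of the total number of code points in the block.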
U+07C0 to U+07FF NKo 59/64
U+1B00 to U+1B7F Balinese 121/128
U+2C60 to U+2C7F Latin Extended-C 17/32
U+A720 to U+A7FF Latin Extended-D 2/224
U+A840 to U+A87F Phags-pa 56/64
U+10900 to U+1091F Phoenician 27/32
U+12000 to U+123FF Cuneiform 879/1024
U+12400 to U+1247F Cuneiform Numbers and Punctuation 103/128
U+1D360 to U+1D37F Counting Rod Numerals 18/32
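The totals quoted in the introduction can be cross-checked against this table. The following is a minimal Python sketch, with the table transcribed by hand into a list called NEW_BLOCKS (a name chosen here purely for illustration). It recomputes each block's size from its range, adds up the assigned characters, and then adds the 87 characters that the New Characters section reports outside the new blocks.

```python
# Each entry: (first code point, last code point, block name, assigned characters),
# transcribed by hand from the table above.
NEW_BLOCKS = [
    (0x07C0, 0x07FF, "NKo", 59),
    (0x1B00, 0x1B7F, "Balinese", 121),
    (0x2C60, 0x2C7F, "Latin Extended-C", 17),
    (0xA720, 0xA7FF, "Latin Extended-D", 2),
    (0xA840, 0xA87F, "Phags-pa", 56),
    (0x10900, 0x1091F, "Phoenician", 27),
    (0x12000, 0x123FF, "Cuneiform", 879),
    (0x12400, 0x1247F, "Cuneiform Numbers and Punctuation", 103),
    (0x1D360, 0x1D37F, "Counting Rod Numerals", 18),
]

# A block's size is the number of code points in its inclusive range.
block_sizes = [last - first + 1 for first, last, _name, _assigned in NEW_BLOCKS]

# Characters assigned inside the new blocks.
assigned_in_new_blocks = sum(assigned for _first, _last, _name, assigned in NEW_BLOCKS)

# 87 is the count of new characters outside the new blocks, as stated in this document.
OUTSIDE_NEW_BLOCKS = 87
total_new = assigned_in_new_blocks + OUTSIDE_NEW_BLOCKS

print(block_sizes)
print(assigned_in_new_blocks, total_new)
```

The sum works out to 1282 characters in the new blocks, and 1282 + 87 = 1369 agrees with the figure given in the introduction.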
New Characters
Excluding those in the new code blocks, there were 87 new characters added in Unicode 5.0.
Number of characters in each General Category:
Letter, Uppercase Lu : 14
Letter, Lowercase Ll : 18
Letter, Modifier Lm : 4
Letter, Other Lo : 4
Mark, Non-Spacing Mn : 16
Symbol, Math Sm : 10
Symbol, Other So : 21
Number of characters in each Bidirectional Category:
Left To Right L : 36
Non Spacing Mark NSM : 16
Other Neutral ON : 35
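The two tallies above are related. Going by the per-character listing in this document, every Lu, Ll and Lo character is listed as L, every Mn as NSM, and every Lm, Sm and So as ON. The short Python sketch below (dictionary names are illustrative) checks that both tallies add up to 87 and that this grouping reproduces the bidirectional counts. The grouping is an observation about these 87 characters only, not a general Unicode rule.

```python
# Counts copied from the two tallies above.
general_category = {"Lu": 14, "Ll": 18, "Lm": 4, "Lo": 4, "Mn": 16, "Sm": 10, "So": 21}
bidi_category = {"L": 36, "NSM": 16, "ON": 35}

# Grouping observed in this document's listing (an assumption that holds
# for these 87 characters, not for Unicode as a whole).
bidi_from_general = {
    "L": ("Lu", "Ll", "Lo"),
    "NSM": ("Mn",),
    "ON": ("Lm", "Sm", "So"),
}

# Rebuild the bidirectional tally from the General Category tally.
derived = {
    bidi: sum(general_category[gc] for gc in categories)
    for bidi, categories in bidi_from_general.items()
}

print(sum(general_category.values()), sum(bidi_category.values()))
print(derived == bidi_category)
```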
The columns below should be interpreted as:
- The Unicode code for the character
- The character in question
- The Unicode name for the character
- The Unicode General Category for the character
- The Unicode Bidirectional Category for the character
If the characters below show up poorly, or not at all, see Unicode Support for possible solutions.
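The same five columns can be produced programmatically. The following is a small sketch using Python's standard unicodedata module; the helper name describe is made up for this example. Note that unicodedata reports the properties of whichever Unicode version ships with the interpreter (see unicodedata.unidata_version), which may differ from 5.0, and that it returns names in uppercase.

```python
import unicodedata


def describe(code_point: int) -> str:
    """Format one code point using the five columns of the listing."""
    ch = chr(code_point)
    return " ".join([
        f"U+{code_point:04X}",              # the Unicode code
        ch,                                 # the character itself
        unicodedata.name(ch, "<unnamed>"),  # the Unicode name
        unicodedata.category(ch),           # General Category, e.g. Ll
        unicodedata.bidirectional(ch),      # Bidirectional Category, e.g. L
    ])


# Three characters from the listing below, chosen as samples.
for cp in (0x0242, 0x05BA, 0x23DC):
    print(describe(cp))
```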
Latin Extended B
Miscellaneous additions
- U+0242 ɂ Latin small letter glottal stop Ll L
- * casing use in Chipewyan, Dogrib, Slavey (Canadian aboriginal orthographies)
- ref U+0294 ʔ Latin letter glottal stop (IPA Extensions)
- ref U+02C0 ˀ modifier letter glottal stop (Spacing Modifier Letters)
- U+0243 Ƀ Latin capital letter B with stroke Lu L
- * lowercase is 0180
- U+0244 Ʉ Latin capital letter U bar Lu L
- * lowercase is 0289
- U+0245 Ʌ Latin capital letter turned v Lu L
- * lowercase is 028C
- U+0246 Ɇ Latin capital letter E with stroke Lu L
- U+0247 ɇ Latin small letter E with stroke Ll L
- U+0248 Ɉ Latin capital letter J with stroke Lu L
- U+0249 ɉ Latin small letter J with stroke Ll L
- U+024A Ɋ Latin capital letter small q with hook tail Lu L
- U+024B ɋ Latin small letter Q with hook tail Ll L
- U+024C Ɍ Latin capital letter R with stroke Lu L
- U+024D ɍ Latin small letter R with stroke Ll L
- U+024E Ɏ Latin capital letter Y with stroke Lu L
- U+024F ɏ Latin small letter Y with stroke Ll L
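The "lowercase is" notes above describe case pairs between the new capital letters and small letters that already existed. As a hedged illustration, the Python sketch below checks those three pairs with str.lower() and str.upper(); the name CASE_PAIRS is chosen for this example, and the result depends on the Unicode data bundled with the interpreter being at least version 5.0.

```python
# Capital -> small pairs taken from the "lowercase is ..." notes above.
CASE_PAIRS = {
    0x0243: 0x0180,  # B with stroke
    0x0244: 0x0289,  # U bar
    0x0245: 0x028C,  # turned V
}

for upper_cp, lower_cp in CASE_PAIRS.items():
    upper, lower = chr(upper_cp), chr(lower_cp)
    # Check the mapping in both directions.
    round_trips = upper.lower() == lower and lower.upper() == upper
    print(f"U+{upper_cp:04X} <-> U+{lower_cp:04X}: {round_trips}")
```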
Greek and Coptic
Lowercase of editorial symbols
- U+037B ͻ Greek small reversed lunate sigma symbol Ll L
- U+037C ͼ Greek small dotted lunate sigma symbol Ll L
- U+037D ͽ Greek small reversed dotted lunate sigma symbol Ll L
Cyrillic
Extended Cyrillic
- U+04CF ӏ Cyrillic small letter palochka Ll L
Additions for Nivkh
- U+04FA Ӻ Cyrillic capital letter ghe with stroke and hook Lu L
- U+04FB ӻ Cyrillic small letter ghe with stroke and hook Ll L
- U+04FC Ӽ Cyrillic capital letter ha with hook Lu L
- U+04FD ӽ Cyrillic small letter ha with hook Ll L
- U+04FE Ӿ Cyrillic capital letter ha with stroke Lu L
- U+04FF ӿ Cyrillic small letter ha with stroke Ll L
Cyrillic Supplement
Cyrillic extensions
- U+0510 Ԑ Cyrillic capital letter reversed ze Lu L
- U+0511 ԑ Cyrillic small letter reversed ze Ll L
- U+0512 Ԓ Cyrillic capital letter el with hook Lu L
- U+0513 ԓ Cyrillic small letter el with hook Ll L
Hebrew
Points and punctuation
- U+05BA ֺ Hebrew point holam haser for vav Mn NSM
Devanagari
Sindhi implosives
These are added from Amendment 3 to 10646:2003.
- U+097B ॻ Devanagari letter gga Lo L
- U+097C ॼ Devanagari letter jja Lo L
- U+097E ॾ Devanagari letter ddda Lo L
- U+097F ॿ Devanagari letter bba Lo L
Kannada
Additional vowels for Sanskrit
- U+0CE2 ೢ Kannada vowel sign vocalic l Mn NSM
- U+0CE3 ೣ Kannada vowel sign vocalic ll Mn NSM
Various signs
- U+0CF1 ೱ Kannada sign jihvamuliya So ON
- U+0CF2 ೲ Kannada sign upadhmaniya So ON
Combining Diacritical Marks Supplement
Contour tone marks
- U+1DC4 ᷄ combining macron acute Mn NSM
- U+1DC5 ᷅ combining grave macron Mn NSM
- U+1DC6 ᷆ combining macron grave Mn NSM
- U+1DC7 ᷇ combining acute macron Mn NSM
- U+1DC8 ᷈ combining grave acute grave Mn NSM
- U+1DC9 ᷉ combining acute grave acute Mn NSM
Miscellaneous mark
- U+1DCA ᷊ combining Latin small letter R below Mn NSM
Additional marks for UPA
- U+1DFE ᷾ combining left arrowhead above Mn NSM
- U+1DFF ᷿ combining right arrowhead and down arrowhead below Mn NSM
Combining Diacritical Marks for Symbols
Additional diacritical marks for symbols
- U+20EC ⃬ combining rightwards harpoon with barb downwards Mn NSM
- U+20ED ⃭ combining leftwards harpoon with barb downwards Mn NSM
- U+20EE ⃮ combining left arrow below Mn NSM
- U+20EF ⃯ combining right arrow below Mn NSM
Letterlike Symbols
Additional letterlike symbols
- U+214D ⅍ aktieselskab So ON
- ref U+2101 ℁ addressed to the subject (Letterlike Symbols)
Lowercase Claudian letter
Claudian letters in inscriptions are uppercase, but may be transcribed by scholars in lowercase.
- U+214E ⅎ turned small f Ll L
- * uppercase is 2132
- ref U+03DD ϝ Greek small letter digamma (Greek and Coptic)
Number Forms
Lowercase Claudian letter
Claudian letters in inscriptions are uppercase, but may be transcribed by scholars in lowercase.
- U+2184 ↄ Latin small letter reversed c Ll L
- ref U+037B ͻ Greek small reversed lunate sigma symbol (Greek and Coptic)
Miscellaneous Technical
Horizontal brackets
These are intended for bracketing terms of mathematical expressions, where the glyph extends to accommodate the width of the bracketed expression.
- U+23DC ⏜ top parenthesis Sm ON
- ref U+FE35 ︵ presentation form for vertical left parenthesis (CJK Compatibility Forms)
- U+23DD ⏝ bottom parenthesis Sm ON
- ref U+FE36 ︶ presentation form for vertical right parenthesis (CJK Compatibility Forms)
- U+23DE ⏞ top curly bracket Sm ON
- ref U+FE37 ︷ presentation form for vertical left curly bracket (CJK Compatibility Forms)
- U+23DF ⏟ bottom curly bracket Sm ON
- ref U+FE38 ︸ presentation form for vertical right curly bracket (CJK Compatibility Forms)
- U+23E0 ⏠ top tortoise shell bracket Sm ON
- ref U+FE39 ︹ presentation form for vertical left tortoise shell bracket (CJK Compatibility Forms)
- U+23E1 ⏡ bottom tortoise shell bracket Sm ON
- ref U+FE3A ︺ presentation form for vertical right tortoise shell bracket (CJK Compatibility Forms)
Miscellaneous technical
- U+23E2 ⏢ white trapezium So ON
Chemistry symbol
- U+23E3 ⏣ benzene ring with circle So ON
Miscellaneous technical
- U+23E4 ⏤ straightness So ON
- U+23E5 ⏥ flatness So ON
- U+23E6 ⏦ ac current So ON
- U+23E7 ⏧ electrical intersection So ON
Miscellaneous Symbols
Gender symbol
- U+26B2 ⚲ neuter So ON
Miscellaneous Mathematical Symbols A
Miscellaneous symbols
- U+27C7 ⟇ or with dot inside Sm ON
- U+27C8 ⟈ reverse solidus preceding subset Sm ON
- U+27C9 ⟉ superset preceding solidus Sm ON
Vertical line operator
- U+27CA ⟊ vertical bar with horizontal stroke Sm ON
- ref U+2AF2 ⫲ parallel with horizontal stroke (Supplemental Mathematical Operators)
- ref U+2AF5 ⫵ triple vertical bar with horizontal stroke (Supplemental Mathematical Operators)
Miscellaneous Symbols and Arrows
Squares
- U+2B14 ⬔ square with upper right diagonal half black So ON
- U+2B15 ⬕ square with lower left diagonal half black So ON
Diamonds
- U+2B16 ⬖ diamond with left half black So ON
- U+2B17 ⬗ diamond with right half black So ON
- U+2B18 ⬘ diamond with top half black So ON
- U+2B19 ⬙ diamond with bottom half black So ON
Square
- U+2B1A ⬚ dotted square So ON
Pentagon
- U+2B20 ⬠ white pentagon So ON
Hexagons
- U+2B21 ⬡ white hexagon So ON
- U+2B22 ⬢ black hexagon So ON
- U+2B23 ⬣ horizontal black hexagon So ON
Modifier Tone Letters
Chinantec tone marks
- U+A717 ꜗ modifier letter dot vertical bar Lm ON
- U+A718 ꜘ modifier letter dot slash Lm ON
- U+A719 ꜙ modifier letter dot horizontal bar Lm ON
- U+A71A ꜚ modifier letter lower right corner angle Lm ON
Mathematical Alphanumeric Symbols
Additional bold Greek symbols
- U+1D7CA 𝟊 mathematical bold capital digamma Lu L
- U+1D7CB 𝟋 mathematical bold small digamma Ll L
Altered Characters
In addition, 13 characters were altered in 5.0. One of them, U+2132, changed both its General Category and its Bidirectional Category, so it is counted in both totals below.
A total of 8 characters changed their General Category:
1 character changed its General Category from Letter, Lowercase to Letter, Other
1 character changed its General Category from Letter, Other to Number, Letter
1 character changed its General Category from Number, Letter to Letter, Uppercase
1 character changed its General Category from Punctuation, Open to Symbol, Other
1 character changed its General Category from Punctuation, Close to Symbol, Other
1 character changed its General Category from Punctuation, Other to Symbol, Other
1 character changed its General Category from Symbol, Other to Letter, Uppercase
1 character changed its General Category from Symbol, Other to Punctuation, Other
A total of 6 characters changed their Bidirectional Category:
6 characters changed their Bidirectional Category from Other Neutral to Left To Right
IPA Extensions
U+0294 ʔ Latin letter glottal stop had its General Category changed from Letter, Lowercase to Letter, Other.
Letterlike Symbols
U+2132 Ⅎ turned capital f had its General Category changed from Symbol, Other to Letter, Uppercase.
U+2132 Ⅎ turned capital f had its Bidirectional Category changed from Other Neutral to Left To Right.
Number Forms
U+2183 Ↄ Roman numeral reversed one hundred had its General Category changed from Number, Letter to Letter, Uppercase.
Miscellaneous Technical
U+23B4 ⎴ top square bracket had its General Category changed from Punctuation, Open to Symbol, Other.
U+23B5 ⎵ bottom square bracket had its General Category changed from Punctuation, Close to Symbol, Other.
U+23B6 ⎶ bottom square bracket over top square bracket had its General Category changed from Punctuation, Other to Symbol, Other.
Gothic
U+10341 𐍁 Gothic letter ninety had its General Category changed from Letter, Other to Number, Letter.
Old Persian
U+103D0 𐏐 Old Persian word divider had its General Category changed from Symbol, Other to Punctuation, Other.
U+103D1 𐏑 Old Persian number one had its Bidirectional Category changed from Other Neutral to Left To Right.
U+103D2 𐏒 Old Persian number two had its Bidirectional Category changed from Other Neutral to Left To Right.
U+103D3 𐏓 Old Persian number ten had its Bidirectional Category changed from Other Neutral to Left To Right.
U+103D4 𐏔 Old Persian number twenty had its Bidirectional Category changed from Other Neutral to Left To Right.
U+103D5 𐏕 Old Persian number hundred had its Bidirectional Category changed from Other Neutral to Left To Right.
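The headline counts for this section (13 altered characters, 8 General Category changes, 6 Bidirectional Category changes) can be reconciled with a little set arithmetic. The Python sketch below is illustrative only: the two sets are transcribed by hand from the entries above, and the variable names are not taken from any source.

```python
# Code points whose General Category changed, from the entries above.
general_category_changed = {
    0x0294, 0x2132, 0x2183, 0x23B4, 0x23B5, 0x23B6, 0x10341, 0x103D0,
}

# Code points whose Bidirectional Category changed, from the entries above.
bidi_changed = {0x2132, 0x103D1, 0x103D2, 0x103D3, 0x103D4, 0x103D5}

# Characters that appear in both lists are counted once in the overall total.
both = general_category_changed & bidi_changed
altered = general_category_changed | bidi_changed

print(len(general_category_changed), len(bidi_changed), len(altered))
print([f"U+{cp:04X}" for cp in sorted(both)])
```

Only U+2132 appears in both sets, which is why 8 + 6 changes correspond to 13 distinct characters rather than 14.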
http://unicode.org
Some prose may have been lifted verbatim from unicode.org,
as is permitted by their terms of use at http://www.unicode.org/copyright.html