Version 5.2 of the Unicode standard was released on October 1, 2009. The previous version was Unicode 5.1 and the next version is Unicode 6.0. All the gory details can be found at http://www.unicode.org/versions/Unicode5.2.0/

Unicode 5.2 adds 6,648 new characters and 26 new code blocks, and supports 7 new contemporary scripts and 8 new historic scripts.

Seven new contemporary scripts have been added: Bamum, Javanese, Lisu, Meetei Mayek, Samaritan, Tai Tham, and Tai Viet. New character additions to existing scripts now provide greater support for Abkhaz, Canadian Aboriginal Syllabics, Coptic, Devanagari, Khamti Shan, Malayalam, and Myanmar.

Unicode 5.2 now supports the Gardiner set of Egyptian Hieroglyphs as well as other important historic scripts: Imperial Aramaic, Avestan, Inscriptional Pahlavi, Inscriptional Parthian, Kaithi, Old South Arabian, and Old Turkic.

Unicode 5.2 has exactly the same character assignments as ISO/IEC 10646:2003 plus Amendments 1 through 6.


New Code Blocks

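26 new code blocks were added in Unicode 5.2. Each line below gives the code point range, the block name, and the number of assigned characters out of the total size of the block.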


U+0800 to U+083F   Samaritan 61/64
U+18B0 to U+18FF   Unified Canadian Aboriginal Syllabics Extended 70/80
U+1A20 to U+1AAF   Tai Tham 127/144
U+1CD0 to U+1CFF   Vedic Extensions 35/48
U+A4D0 to U+A4FF   Lisu 48/48
U+A6A0 to U+A6FF   Bamum 88/96
U+A830 to U+A83F   Common Indic Number Forms 10/16
U+A8E0 to U+A8FF   Devanagari Extended 28/32
U+A960 to U+A97F   Hangul Jamo Extended A 29/32
U+A980 to U+A9DF   Javanese 91/96
U+AA60 to U+AA7F   Myanmar Extended A 28/32
U+AA80 to U+AADF   Tai Viet 72/96
U+ABC0 to U+ABFF   Meetei Mayek 56/64
U+D7B0 to U+D7FF   Hangul Jamo Extended B 72/80
U+10840 to U+1085F   Imperial Aramaic 31/32
U+10A60 to U+10A7F   Old South Arabian 32/32
U+10B00 to U+10B3F   Avestan 61/64
U+10B40 to U+10B5F   Inscriptional Parthian 30/32
U+10B60 to U+10B7F   Inscriptional Pahlavi 27/32
U+10C00 to U+10C4F   Old Turkic 73/80
U+10E60 to U+10E7F   Rumi Numeral Symbols 31/32
U+11080 to U+110CF   Kaithi 66/80
U+13000 to U+1342F   Egyptian Hieroglyphs 1071/1072
U+1F100 to U+1F1FF   Enclosed Alphanumeric Supplement 63/256
U+1F200 to U+1F2FF   Enclosed Ideographic Supplement 44/256
U+2A700 to U+2B73F   CJK Unified Ideographs Extension C 4149/4160
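
The second figure on each line is simply the size of the inclusive range. As a rough sanity check, that arithmetic can be sketched in Python; the helper name `block_size` and the choice of sample rows are illustrative only and not part of the standard.

```python
def block_size(first: int, last: int) -> int:
    """Number of code points in an inclusive range first..last."""
    return last - first + 1

# (first, last, block name, assigned, total) copied by hand from three rows above
rows = [
    (0x0800, 0x083F, "Samaritan", 61, 64),
    (0x13000, 0x1342F, "Egyptian Hieroglyphs", 1071, 1072),
    (0x2A700, 0x2B73F, "CJK Unified Ideographs Extension C", 4149, 4160),
]

for first, last, name, assigned, total in rows:
    size = block_size(first, last)
    # The difference is the number of code points left unassigned in 5.2
    print(f"{name}: {size} code points, {size - assigned} unassigned")
```

The same subtraction applies to any other row of the list.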

New Characters

Excluding those in the new code blocks, 155 new characters were added in Unicode 5.2.

Number of characters in each General Category:

Letter, Uppercase        Lu :  6
Letter, Lowercase        Ll :  3
Letter, Other            Lo : 40
Mark, Non-Spacing        Mn :  7
Mark, Spacing Combining  Mc :  4
Number, Decimal Digit    Nd :  1
Number, Other            No :  6
Punctuation, Dash        Pd :  1
Punctuation, Other       Po :  1
Symbol, Currency         Sc :  4
Symbol, Other            So : 82

Number of characters in each Bidirectional Category:

Left To Right                 L : 70
Right To Left                 R :  2
European Number Terminator   ET :  4
Non Spacing Mark            NSM :  7
Other Neutral                ON : 72
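
Both tallies cover the same set of characters, so each should add up to the 155 characters quoted above. A quick check in Python, with the counts typed in by hand from the two lists (the variable names are made up for this sketch):

```python
# Counts copied from the General Category list
general_category = {
    "Lu": 6, "Ll": 3, "Lo": 40, "Mn": 7, "Mc": 4, "Nd": 1,
    "No": 6, "Pd": 1, "Po": 1, "Sc": 4, "So": 82,
}

# Counts copied from the Bidirectional Category list
bidi_category = {"L": 70, "R": 2, "ET": 4, "NSM": 7, "ON": 72}

gc_total = sum(general_category.values())
bidi_total = sum(bidi_category.values())

# Whatever is left of the 6,648 new characters must sit in the new code blocks
in_new_blocks = 6648 - gc_total

print(gc_total, bidi_total, in_new_blocks)
```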

The columns below should be interpreted as:

  1. The Unicode code for the character
  2. The character in question
  3. The Unicode name for the character
  4. The Unicode General Category for the character
  5. The Unicode Bidirectional Category for the character
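
For readers who would rather query these columns than read them, Python's standard `unicodedata` module exposes the same properties. This is only a minimal sketch: the module reflects whichever Unicode version your Python build ships with, not necessarily 5.2, so a value could differ from the listing for any character that was revised later. The helper name `describe` is made up for this example.

```python
import unicodedata

def describe(ch: str) -> str:
    """Format one character as: code, character, name, General Category, Bidirectional Category."""
    code = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch, "<no name in this Unicode version>")
    category = unicodedata.category(ch)
    bidi = unicodedata.bidirectional(ch)
    return f"{code}   {ch}   {name} {category} {bidi}"

# First entry of the listing: Cyrillic capital letter pe with descender
print(describe("\u0524"))
```

Note that `unicodedata.name` returns the name in upper case, whereas the listing below uses mixed case.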

If the characters below show up poorly, or not at all, see Unicode Support for possible solutions.

 

Cyrillic Supplement

     Abkhaz letters

U+0524   Ԥ   Cyrillic capital letter pe with descender Lu L
U+0525   ԥ   Cyrillic small letter pe with descender Ll L
* used in modern Abkhaz orthography
ref U+04A7   ҧ   Cyrillic small letter pe with middle hook (Cyrillic)

 

Devanagari

     Various signs

U+0900   ऀ   Devanagari sign inverted candrabindu Mn NSM
aka vaidika adhomukha candrabindu

     Archaic dependent vowel sign

U+094E   ॎ   Devanagari vowel sign prishthamatra e Mc L
* character has historic use only
* combines with E to form AI, with AA to form O, and with O to form AU

     Accent marks

U+0955   ॕ   Devanagari vowel sign candra long e Mn NSM
* used in transliteration of Avestan

     Additional consonants

U+0979   ॹ   Devanagari letter zha Lo L
* used in transliteration of Avestan
U+097A   ॺ   Devanagari letter heavy ya Lo L
* used for an affricated glide JJYA

 

Bengali

     Bengali-specific additions

U+09FB   ৻   Bengali ganda mark Sc ET

 

Tibetan

     Religious symbols

U+0FD5   ࿕   right-facing svasti sign So L
aka gyung drung nang -khor
* symbol of good luck and well-being in India
ref U+5350   卐   CJK Ideograph 5350 (CJK Unified Ideographs)
U+0FD6   ࿖   left-facing svasti sign So L
aka gyung drung phyi -khor
ref U+534D   卍   CJK Ideograph 534D (CJK Unified Ideographs)
U+0FD7   ࿗   right-facing svasti sign with dots So L
aka gyung drung nang -khor bzhi mig can
U+0FD8   ࿘   left-facing svasti sign with dots So L
aka gyung drung phyi -khor bzhi mig can

 

Myanmar

     Extensions for Khamti Shan

U+109A   ႚ   Myanmar sign khamti tone 1 Mc L
U+109B   ႛ   Myanmar sign khamti tone 3 Mc L

     Extensions for Aiton and Phake

U+109C   ႜ   Myanmar vowel sign aiton a Mc L
U+109D   ႝ   Myanmar vowel sign aiton ai Mn NSM

 

Hangul Jamo

     Initial consonants

U+115A   ᅚ   Hangul choseong kiyeok tikeut Lo L
U+115B   ᅛ   Hangul choseong nieun sios Lo L
U+115C   ᅜ   Hangul choseong nieun cieuc Lo L
U+115D   ᅝ   Hangul choseong nieun hieuh Lo L
U+115E   ᅞ   Hangul choseong tikeut rieul Lo L

     Medial vowels

U+11A3   ᆣ   Hangul jungseong a eu Lo L
U+11A4   ᆤ   Hangul jungseong ya u Lo L
U+11A5   ᆥ   Hangul jungseong yeo ya Lo L
U+11A6   ᆦ   Hangul jungseong o ya Lo L
U+11A7   ᆧ   Hangul jungseong o yae Lo L

     Final consonants

U+11FA   ᇺ   Hangul jongseong kiyeok nieun Lo L
U+11FB   ᇻ   Hangul jongseong kiyeok pieup Lo L
U+11FC   ᇼ   Hangul jongseong kiyeok chieuch Lo L
U+11FD   ᇽ   Hangul jongseong kiyeok khieukh Lo L
U+11FE   ᇾ   Hangul jongseong kiyeok hieuh Lo L
U+11FF   ᇿ   Hangul jongseong ssangnieun Lo L

 

Unified Canadian Aboriginal Syllabics

     Punctuation

U+1400   ᐀   Canadian syllabics hyphen Pd ON

     Syllables

U+1677   ᙷ   Canadian syllabics woods cree thwee Lo L
U+1678   ᙸ   Canadian syllabics woods cree thwi Lo L
U+1679   ᙹ   Canadian syllabics woods cree thwii Lo L
U+167A   ᙺ   Canadian syllabics woods cree thwo Lo L
U+167B   ᙻ   Canadian syllabics woods cree thwoo Lo L
U+167C   ᙼ   Canadian syllabics woods cree thwa Lo L
U+167D   ᙽ   Canadian syllabics woods cree thwaa Lo L
U+167E   ᙾ   Canadian syllabics woods cree final th Lo L
U+167F   ᙿ   Canadian syllabics blackfoot w Lo L

 

New Tai Lue

     Consonants

U+19AA   ᦪ   New Tai Lue letter high sua Lo L
U+19AB   ᦫ   New Tai Lue letter low sua Lo L

     Digits

U+19DA   ᧚   New Tai Lue tham digit one Nd L

 

Combining Diacritical Marks Supplement

     Miscellaneous mark

U+1DFD   ᷽   combining almost equal to below Mn NSM

 

Currency Symbols

     Currency symbols
A number of currency symbols are found in other blocks. Fullwidth versions of some currency symbols are found in the Halfwidth and Fullwidth Forms block.
see also U+0024   $   dollar sign (Basic Latin)
see also U+00A2   ¢   cent sign (Latin-1 Supplement)
see also U+00A3   £   pound sign (Latin-1 Supplement)
see also U+00A4   ¤   currency sign (Latin-1 Supplement)
see also U+00A5   ¥   yen sign (Latin-1 Supplement)
see also U+0192   ƒ   Latin small letter F with hook (Latin Extended B)
see also U+060B   ؋   afghani sign (Arabic)
see also U+09F2   ৲   Bengali rupee mark (Bengali)
see also U+09F3   ৳   Bengali rupee sign (Bengali)
see also U+0AF1   ૱   Gujarati rupee sign (Gujarati)
see also U+0BF9   ௹   Tamil rupee sign (Tamil)
see also U+0E3F   ฿   Thai currency symbol baht (Thai)
see also U+17DB   ៛   Khmer currency symbol riel (Khmer)
see also U+2133   ℳ   script capital m (Letterlike Symbols)
see also U+5143   元   CJK Ideograph 5143 (CJK Unified Ideographs)
see also U+5186   円   CJK Ideograph 5186 (CJK Unified Ideographs)
see also U+5706   圆   CJK Ideograph 5706 (CJK Unified Ideographs)
see also U+5713   圓   CJK Ideograph 5713 (CJK Unified Ideographs)
see also U+FDFC   ﷼   rial sign (Arabic Presentation Forms A)

U+20B6   ₶   livre tournois sign Sc ET
* used in France from 13th-18th centuries
U+20B7   ₷   spesmilo sign Sc ET
* historical international currency associated with Esperanto
U+20B8   ₸   tenge sign Sc ET
* Kazakhstan
ref U+2351   ⍑   APL functional symbol up tack overbar (Miscellaneous Technical)
ref U+2564   ╤   box drawings down single and horizontal double (Box Drawing)
ref U+3012   〒   postal mark (CJK Symbols and Punctuation)

 

Number Forms

     Fractions
Other fraction number forms are found in the Latin-1 Supplement block.
see also U+00BC   ¼   vulgar fraction one quarter (Latin-1 Supplement)
see also U+00BD   ½   vulgar fraction one half (Latin-1 Supplement)
see also U+00BE   ¾   vulgar fraction three quarters (Latin-1 Supplement)

U+2150   ⅐   vulgar fraction one seventh No ON
U+2151   ⅑   vulgar fraction one ninth No ON
U+2152   ⅒   vulgar fraction one tenth No ON

     Fraction

U+2189   ↉   vulgar fraction zero thirds No ON
* used in baseball scoring, from ARIB STD B24
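
The character database also records a numeric value for each of these fractions. A small sketch using Python's `unicodedata.numeric`, which returns a float; it assumes the Unicode version bundled with your Python is 5.2 or later, and the expected values in the dictionary are read off the character names rather than taken from the listing.

```python
import math
import unicodedata

# Expected values inferred from the character names (an assumption of this sketch)
expected = {
    "\u2150": 1 / 7,   # vulgar fraction one seventh
    "\u2151": 1 / 9,   # vulgar fraction one ninth
    "\u2152": 1 / 10,  # vulgar fraction one tenth
    "\u2189": 0 / 3,   # vulgar fraction zero thirds
}

def fraction_value(ch: str) -> float:
    """Numeric value of a character as recorded in the Unicode database."""
    return unicodedata.numeric(ch)

for ch, want in expected.items():
    got = fraction_value(ch)
    matches = math.isclose(got, want, abs_tol=1e-12)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {got} (matches name: {matches})")
```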

 

Miscellaneous Technical

     Miscellaneous technical

U+23E8   ⏨   decimal exponent symbol So ON
* Algol-60 token for scientific notation literals

 

Miscellaneous Symbols

     Symbols for closed captioning from ARIB STD B24

U+269E   ⚞   three lines converging right So ON
aka someone speaking
U+269F   ⚟   three lines converging left So ON
aka background speaking

     Sports symbols

U+26BD   ⚽   soccer ball So ON
U+26BE   ⚾   baseball So ON

     Miscellaneous symbol from ARIB STD B24

U+26BF   ⚿   squared key So ON
aka parental lock

     Weather symbols from ARIB STD B24

U+26C4   ⛄   snowman without snow So ON
aka light snow
U+26C5   ⛅   sun behind cloud So ON
aka partly cloudy
U+26C6   ⛆   rain So ON
aka rainy weather
U+26C7   ⛇   black snowman So ON
aka heavy snow
U+26C8   ⛈   thunder cloud and rain So ON
aka thunderstorm

     Game symbols from ARIB STD B24

U+26C9   ⛉   turned white shogi piece So ON
U+26CA   ⛊   turned black shogi piece So ON
U+26CB   ⛋   white diamond in square So ON
ref U+233A   ⌺   APL functional symbol quad diamond (Miscellaneous Technical)

     Traffic signs from ARIB STD B24

U+26CC   ⛌   crossing lanes So ON
aka accident
ref U+292C   ⤬   falling diagonal crossing rising diagonal (Supplemental Arrows B)
U+26CD   ⛍   disabled car So ON
U+26CF   ⛏   pick So ON
aka under construction
U+26D0   ⛐   car sliding So ON
aka icy road
U+26D1   ⛑   helmet with white cross So ON
aka maintenance
U+26D2   ⛒   circled crossing lanes So ON
aka road closed
U+26D3   ⛓   chains So ON
aka tyre chains required
U+26D4   ⛔   no entry So ON
U+26D5   ⛕   alternate one-way left way traffic So ON
* left side traffic
U+26D6   ⛖   black two-way left way traffic So ON
* left side traffic
U+26D7   ⛗   white two-way left way traffic So ON
* left side traffic
U+26D8   ⛘   black left lane merge So ON
* left side traffic
U+26D9   ⛙   white left lane merge So ON
* left side traffic
U+26DA   ⛚   drive slow sign So ON
U+26DB   ⛛   heavy white down-pointing triangle So ON
aka drive slow
ref U+25BD   ▽   white down-pointing triangle (Geometric Shapes)
U+26DC   ⛜   left closed entry So ON
U+26DD   ⛝   squared saltire So ON
aka closed entry
ref U+22A0   ⊠   squared times (Mathematical Operators)
U+26DE   ⛞   falling diagonal in white circle in black square So ON
aka closed to large vehicles
U+26DF   ⛟   black truck So ON
aka black lorry
aka closed to large vehicles, alternate
U+26E0   ⛠   restricted left entry 1 So ON
U+26E1   ⛡   restricted left entry 2 So ON

     Dictionary and map symbols from ARIB STD B24

U+26E3   ⛣   heavy circle with stroke and two dots above So ON
aka public office
U+26E8   ⛨   black cross on shield So ON
aka hospital
U+26E9   ⛩   shinto shrine So ON
aka torii
U+26EA   ⛪   church So ON
U+26EB   ⛫   castle So ON
U+26EC   ⛬   historic site So ON
U+26ED   ⛭   gear without hub So ON
aka factory
ref U+2699   ⚙   gear (Miscellaneous Symbols)
U+26EE   ⛮   gear with handles So ON
aka power plant, power substation
U+26EF   ⛯   map symbol for lighthouse So ON
U+26F0   ⛰   mountain So ON
U+26F1   ⛱   umbrella on ground So ON
aka bathing beach
U+26F2   ⛲   fountain So ON
aka park
U+26F3   ⛳   flag in hole So ON
aka golf course
U+26F4   ⛴   ferry So ON
aka ferry boat terminal
U+26F5   ⛵   sailboat So ON
aka marina or yacht harbour
U+26F6   ⛶   square four corners So ON
aka intersection
U+26F7   ⛷   skier So ON
aka ski resort
U+26F8   ⛸   ice skate So ON
aka ice skating rink
U+26F9   ⛹   person with ball So ON
aka track and field, gymnasium
U+26FA   ⛺   tent So ON
aka camping site
U+26FB   ⛻   Japanese bank symbol So ON
U+26FC   ⛼   headstone graveyard symbol So ON
aka graveyard, memorial park, cemetery
U+26FD   ⛽   fuel pump So ON
aka petrol station, gas station
U+26FE   ⛾   cup on black square So ON
aka drive-in restaurant
U+26FF   ⛿   white flag with horizontal middle black stripe So ON
aka Japanese self-defence force site

 

Dingbats

     Miscellaneous

U+2757   ❗   heavy exclamation mark symbol So ON
aka obstacles on the road, ARIB STD B24

 

Miscellaneous Symbols and Arrows

     Traffic sign from ARIB STD B24

U+2B55   ⭕   heavy large circle So ON
aka basic symbol for speed limit
ref U+25EF   ◯   large circle (Geometric Shapes)

     Dictionary and map symbols from ARIB STD B24

U+2B56   ⭖   heavy oval with oval inside So ON
aka prefectural office
U+2B57   ⭗   heavy circle with circle inside So ON
aka municipal office
ref U+25CE   ◎   bullseye (Geometric Shapes)
U+2B58   ⭘   heavy circle So ON
aka town or village office
ref U+25CB   ○   white circle (Geometric Shapes)
U+2B59   ⭙   heavy circled saltire So ON
aka police station
ref U+2A02   ⨂   n-ary circled times operator (Supplemental Mathematical Operators)

 

Latin Extended C

     Miscellaneous additions

U+2C70   Ɒ   Latin capital letter turned alpha Lu L
* lowercase is 0252

     Additions for Shona

U+2C7E   Ȿ   Latin capital letter S with swash tail Lu L
* lower case is 023F
U+2C7F   Ɀ   Latin capital letter Z with swash tail Lu L
* lower case is 0240
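
The case pairings noted for these three letters can be checked with ordinary string methods. The sketch below assumes a Python build whose Unicode data is version 5.2 or later; the dictionary simply restates the "lowercase is" notes from the listing.

```python
# Capital letter -> lowercase partner, as given in the notes above
case_pairs = {
    "\u2C70": "\u0252",  # Latin capital letter turned alpha
    "\u2C7E": "\u023F",  # Latin capital letter S with swash tail
    "\u2C7F": "\u0240",  # Latin capital letter Z with swash tail
}

for upper, lower in case_pairs.items():
    # Check the mapping in both directions
    round_trip = upper.lower() == lower and lower.upper() == upper
    print(f"U+{ord(upper):04X} <-> U+{ord(lower):04X}: {round_trip}")
```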

 

Coptic

     Cryptogrammic letters

U+2CEB   Ⳬ   Coptic capital letter cryptogrammic shei Lu L
U+2CEC   ⳬ   Coptic small letter cryptogrammic shei Ll L
U+2CED   Ⳮ   Coptic capital letter cryptogrammic gangia Lu L
U+2CEE   ⳮ   Coptic small letter cryptogrammic gangia Ll L

     Combining marks

U+2CEF   ⳯   Coptic combining ni above Mn NSM
* this mark is used in final position and extends above the following character (usually a space)
U+2CF0   ⳰   Coptic combining spiritus asper Mn NSM
ref U+0314   ̔   combining reversed comma above (Combining Diacritical Marks)
ref U+0485   ҅   combining Cyrillic dasia pneumata (Cyrillic)
U+2CF1   ⳱   Coptic combining spiritus lenis Mn NSM
ref U+0313   ̓   combining comma above (Combining Diacritical Marks)
ref U+0486   ҆   combining Cyrillic psili pneumata (Cyrillic)

 

Supplemental Punctuation

     Historic punctuation

U+2E31   ⸱   word separator middle dot Po ON
* used in Avestan, Samaritan, ...
ref U+00B7   ·   middle dot (Latin-1 Supplement)

 

Enclosed CJK Letters and Months

     Circled ideographs from ARIB STD B24

U+3244   ㉄   circled ideograph question So L
U+3245   ㉅   circled ideograph kindergarten So L
U+3246   ㉆   circled ideograph school So L
U+3247   ㉇   circled ideograph koto So L

     Circled numbers on black squares from ARIB STD B24

U+3248   ㉈   circled number ten on black square So L
aka speed limit 10 km/h
U+3249   ㉉   circled number twenty on black square So L
aka speed limit 20 km/h
U+324A   ㉊   circled number thirty on black square So L
aka speed limit 30 km/h
U+324B   ㉋   circled number forty on black square So L
aka speed limit 40 km/h
U+324C   ㉌   circled number fifty on black square So L
aka speed limit 50 km/h
U+324D   ㉍   circled number sixty on black square So L
aka speed limit 60 km/h
U+324E   ㉎   circled number seventy on black square So L
aka speed limit 70 km/h
U+324F   ㉏   circled number eighty on black square So L
aka speed limit 80 km/h

 

CJK Unified Ideographs

     

U+9FC4   鿄   CJK Ideograph 9FC4 Lo L
U+9FC5   鿅   CJK Ideograph 9FC5 Lo L
U+9FC6   鿆   CJK Ideograph 9FC6 Lo L
U+9FC7   鿇   CJK Ideograph 9FC7 Lo L
U+9FC8   鿈   CJK Ideograph 9FC8 Lo L
U+9FC9   鿉   CJK Ideograph 9FC9 Lo L
U+9FCA   鿊   CJK Ideograph 9FCA Lo L
U+9FCB   鿋   CJK Ideograph 9FCB Lo L

 

CJK Compatibility Ideographs

     ARIB compatibility ideographs

U+FA6B   恵   CJK compatibility ideograph FA6B Lo L
U+FA6C   𤋮   CJK compatibility ideograph FA6C Lo L
U+FA6D   舘   CJK compatibility ideograph FA6D Lo L

 

Phoenician

     Numbers

U+1091A   𐤚   Phoenician number two No R
U+1091B   𐤛   Phoenician number three No R

Altered Characters


In addition, 7 characters were altered in Unicode 5.2

A total of 2 characters changed their General Category
2 characters changed their General Category from Letter, Lowercase to Letter, Modifier
 
A total of 5 characters changed their Bidirectional Category
5 characters changed their Bidirectional Category from Left To Right to Other Neutral

 

Superscripts and Subscripts


U+2071   ⁱ   superscript Latin small letter I had its General Category changed from Letter, Lowercase to Letter, Modifier
U+207F   ⁿ   superscript Latin small letter N had its General Category changed from Letter, Lowercase to Letter, Modifier

 

Mathematical Alphanumeric Symbols


U+1D6DB   𝛛   mathematical bold partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
U+1D715   𝜕   mathematical italic partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
U+1D74F   𝝏   mathematical bold italic partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
U+1D789   𝞉   mathematical sans serif bold partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
U+1D7C3   𝟃   mathematical sans serif bold italic partial differential had its Bidirectional Category changed from Left To Right to Other Neutral
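
As a hedged illustration, the post-change values can be read back from Python's `unicodedata` module. It reports the properties for the Unicode version bundled with your Python build (assumed here to be 5.2 or later, and assuming the properties have not been revised again since), so it shows the new values only, not the old ones.

```python
import unicodedata

# The two superscript letters whose General Category changed
superscripts = ["\u2071", "\u207F"]

# The five partial differential symbols whose Bidirectional Category changed
partials = ["\U0001D6DB", "\U0001D715", "\U0001D74F", "\U0001D789", "\U0001D7C3"]

for ch in superscripts:
    # "Lm" is the abbreviation for Letter, Modifier
    print(f"U+{ord(ch):04X} General Category: {unicodedata.category(ch)}")

for ch in partials:
    # "ON" is the abbreviation for Other Neutral
    print(f"U+{ord(ch):05X} Bidirectional Category: {unicodedata.bidirectional(ch)}")
```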

Source: http://unicode.org
Some prose may have been lifted verbatim from unicode.org,
as is permitted by their terms of use at http://www.unicode.org/copyright.html
