The Unicode pipeline is always full of new characters, scripts, character properties, and algorithms. See also Unicode Technical Report for drafts and proposed drafts of upcoming changes.
Many scripts have had a range of character codes tentatively pre-allocated for them, but remember that anything might change if it is not yet part of a released version of the standard.
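One way to see whether a tentative code point has since become part of a released version is to ask a Unicode-aware runtime. The following is a minimal sketch in Python, assuming the standard `unicodedata` module; the helper name `is_assigned` is made up for this illustration, and the answer depends entirely on the Unicode version bundled with the interpreter that runs it.

```python
import unicodedata

def is_assigned(code_point):
    """Return True if the code point is assigned in the bundled Unicode data.

    General category "Cn" covers unassigned code points (and noncharacters),
    so anything else counts as assigned here.
    """
    return unicodedata.category(chr(code_point)) != "Cn"

if __name__ == "__main__":
    # The bundled version differs between Python releases.
    print("Unicode data version:", unicodedata.unidata_version)
    # U+1C00 is the first code point of the Lepcha range listed in this article.
    print("U+1C00 assigned:", is_assigned(0x1C00))
```

No expected output is shown because it varies with the interpreter's Unicode data.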
Here is the tentative future, as of December 2007.
Accepted
Scripts that have been formally accepted by the UTC (Unicode Technical Committee) or WG2 (the ISO/IEC 10646 working group) for processing toward inclusion in the standard. Many of these will be part of Unicode 5.1.
U+1A20 to U+1AAF Lanna (144)
U+1C00 to U+1C4F Lepcha (80)
U+1C50 to U+1C7F Ol Chiki (48)
U+1C80 to U+1CCF Meitei Mayek (80)
U+2DE0 to U+2DFF Cyrillic Extended-A (32)
U+A500 to U+A63F Vai (320)
U+A640 to U+A69F Cyrillic Extended-B (96)
U+A700 to U+A75F Modifier Tone Letters (96)
U+A830 to U+A83F Indic Number Forms (16)
U+A880 to U+A8DF Saurashtra (96)
U+A900 to U+A91F Hangul Jamo Extended-A (32)
U+AA00 to U+AA5F Cham (96)
U+AA80 to U+AADF Tai Viet (96)
U+D7B0 to U+D7FF Hangul Jamo Extended-B (80)
U+10190 to U+101CF Ancient Symbols (64)
U+101D0 to U+101FF Phaistos Disc (48)
U+10280 to U+1029F Lycian (32)
U+102A0 to U+102DF Carian (64)
U+10840 to U+1085F Imperial Aramaic (32)
U+10920 to U+1093F Lydian (32)
U+10B00 to U+10B3F Avestan (64)
U+10B40 to U+10B5F Parthian (32)
U+10B60 to U+10B7F Inscriptional Pahlavi (32)
U+13000 to U+133FF Egyptian Hieroglyphs (1024)
U+13400 to U+1342F Egyptian Hier. (48)
U+17000 to U+180FF Tangut Ideographs (4352)
U+18100 to U+1810F Tangut (16)
U+1F000 to U+1F02F Mahjong Tiles (48)
U+1F030 to U+1F09F Domino Tiles (112)
U+2A700 to U+2B6FF CJK Unified Ideographs Extension C (4096)
U+2B700 to U+2B77F CJK Unified Ideographs Extension C (128)
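Each entry in these lists has the form "U+START to U+END Name (size)", where the size in parentheses is the number of code points in the range, that is, END - START + 1 in hexadecimal arithmetic. The following is a minimal sketch in Python of parsing one entry and checking its size; the names `ROADMAP_LINE`, `parse_roadmap_line`, and `size_matches` are made up for this illustration.

```python
import re

# Matches entries such as "U+1C00 to U+1C4F Lepcha (80)".
# The name may itself contain parentheses; the final "(digits)" is the size.
ROADMAP_LINE = re.compile(
    r"^U\+([0-9A-F]{4,6}) to U\+([0-9A-F]{4,6}) (.+) \((\d+)\)$"
)

def parse_roadmap_line(line):
    """Return (start, end, name, listed_size) for one list entry."""
    match = ROADMAP_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a roadmap entry: {line!r}")
    start = int(match.group(1), 16)
    end = int(match.group(2), 16)
    return start, end, match.group(3), int(match.group(4))

def size_matches(line):
    """Check that the listed size equals the inclusive length of the range."""
    start, end, _name, listed_size = parse_roadmap_line(line)
    return end - start + 1 == listed_size
```

For example, `size_matches("U+1C00 to U+1C4F Lepcha (80)")` is true, because 0x1C4F - 0x1C00 + 1 = 80.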
Formally Proposed and Pre-allocated
Scripts for which proposals have been formally submitted to the UTC or to WG2.
U+0800 to U+083F Samaritan (64)
U+1B00 to U+1B3F Batak (64)
U+1CD0 to U+1CFF Vedic Extensions (48)
U+A8E0 to U+A8FF Devanagari Extended (32)
U+A920 to U+A99F Javanese (128)
U+AB00 to U+AB3F Varang Kshiti (64)
U+AB40 to U+AB6F Sorang Sompeng (48)
U+10350 to U+1037F Old Permic (48)
U+10980 to U+109DF Meroitic (96)
U+10A60 to U+10A7F Balti (32)
U+10A80 to U+10A9F South Arabian (32)
U+10AC0 to U+10AFF Manichaean (64)
U+10B80 to U+10B9F Psalter Pahlavi (32)
U+10BA0 to U+10BDF Book Pahlavi (64)
U+10C00 to U+10C3F Old Turkic (64)
U+10C40 to U+10C6F Old Hungarian (48)
U+10E60 to U+10E7F Rumi Symbols (32)
U+11000 to U+1104F Brahmi (80)
U+11080 to U+110CF Kaithi (80)
U+11100 to U+1113F Soyombo (64)
U+11180 to U+111DF Sharada (96)
U+12800 to U+12A7F Anatolian Hieroglyphs (640)
U+12B00 to U+12EFF Indus (1024)
U+13500 to U+146FF Egyptian Hieroglyphs Extended (4608)
U+16000 to U+1607F Tengwar (128)
U+16080 to U+160FF Cirth (128)
U+16200 to U+165FF Blissymbols (1024)
U+1B000 to U+1B2FF Nüshu (768)
Pre-allocated
These code blocks are being considered by the Unicode and ISO 10646 committees seriously enough to pre-allocate a range of character codes for each block, but no formal proposal has been submitted.
U+0840 to U+085F Mandaic (32)
U+0860 to U+089F Arabic Extended-A (64)
U+AA60 to U+AA7F Myanmar Extended-A (32)
U+AB70 to U+ABCF Chakma (96)
U+10500 to U+1053F Old Udi (64)
U+10540 to U+1057F Elbasan (64)
U+10580 to U+1059F Büthakukye (32)
U+105A0 to U+105BF Iberian (32)
U+10600 to U+1077F Linear A (384)
U+10780 to U+107BF Cypro-Minoan (64)
U+10860 to U+1087F Palmyrene (32)
U+10880 to U+1089F Nabataean (32)
U+108A0 to U+108BF Numidian (32)
U+108C0 to U+108DF Hatran (32)
U+108E0 to U+108FF North Arabic (32)
U+109E0 to U+109FF Elymaic (32)
U+10C80 to U+10CDF Uighur (96)
U+10D00 to U+10D6F Byblos (112)
U+10E00 to U+10E2F Yezidi (48)
U+10E80 to U+10EFF Persian Siyaq Numerals (128)
U+10F00 to U+10FFF Arabic Mathematical Alphabetic Symbols (256)
U+11140 to U+1117F Ahom (64)
U+11200 to U+1124F Tulu (80)
U+11280 to U+112DF Turkestani (96)
U+11300 to U+1137F Grantha (128)
U+11380 to U+113DF Siddham (96)
U+11400 to U+1143F Pyu (64)
U+11480 to U+114DF Maithili (96)
U+11500 to U+1155F Chalukya (Box-Headed) (96)
U+11580 to U+115DF Chola (96)
U+11600 to U+1165F Satavahana (96)
U+11680 to U+116DF Takri (96)
U+11700 to U+1175F Landa (96)
U+11780 to U+117DF Modi (96)
U+11800 to U+1185F Newari (96)
U+11880 to U+118BF Leke (64)
U+12480 to U+127FF Archaic Cuneiform Extensions (896)
U+12A80 to U+12DCF Rongorongo (848)
U+14700 to U+153FF Egyptian Hieroglyphs Extended-A (3328)
U+15400 to U+158FF Maya Hieroglyphs (1280)
U+15C00 to U+15FFF Aztec Pictograms (1024)
U+16600 to U+166FF Blissymbol Extensions (256)
U+16800 to U+169EF Old Bamum (496)
U+16A00 to U+16ABF Mende (192)
U+16B00 to U+16B2F Bassa (48)
U+16B80 to U+16BEF Woleai (112)
U+16C00 to U+16C2F Chinook (48)
U+16D00 to U+16DFF Shorthands (256)
U+16F00 to U+16FFF Pollard Phonetic (256)
U+18200 to U+187FF Jurchen Ideographs (1536)
U+19000 to U+1917F Khitan Small Script (384)
U+19180 to U+1A3FF Khitan Ideographs (4736)
U+1A800 to U+1A9FF Naxi Geba (512)
U+1AA10 to U+1AFFF Naxi Tomba (1520)
U+1C000 to U+1CA7F Micmac Hieroglyphs (2688)
U+1CE00 to U+1CFFF Proto-Elamite (512)
U+1D800 to U+1DBFF Sutton SignWriting (1024)
U+2B800 to U+2F7FF CJK Unified Ideographs Extension D (16384)
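Because every entry is an inclusive range, finding the tentative block for a given code point is a simple interval lookup. The following is a minimal sketch in Python using binary search over a handful of entries copied from the list above; the names `TENTATIVE_BLOCKS`, `STARTS`, and `tentative_block` are made up for this illustration, and the table is deliberately incomplete.

```python
from bisect import bisect_right

# A few (start, end, name) entries copied from the pre-allocated list,
# sorted by start. They reflect the tentative state as of December 2007.
TENTATIVE_BLOCKS = [
    (0x0840, 0x085F, "Mandaic"),
    (0x10600, 0x1077F, "Linear A"),
    (0x11300, 0x1137F, "Grantha"),
    (0x15400, 0x158FF, "Maya Hieroglyphs"),
    (0x2B800, 0x2F7FF, "CJK Unified Ideographs Extension D"),
]
STARTS = [start for start, _end, _name in TENTATIVE_BLOCKS]

def tentative_block(code_point):
    """Return the name of the sample block containing code_point, or None."""
    # Find the last block whose start is <= code_point.
    index = bisect_right(STARTS, code_point) - 1
    if index < 0:
        return None
    _start, end, name = TENTATIVE_BLOCKS[index]
    # The code point may fall in the gap after that block.
    return name if code_point <= end else None
```

With this sample table, `tentative_block(0x10600)` gives "Linear A", while `tentative_block(0x10780)` gives `None` because Cypro-Minoan was not copied into the table.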
Not Pre-allocated
These scripts, for one reason or another, have not been given tentative pre-allocations.
The following categories indicate why a script might not be suitable for pre-allocation.
Known scripts which have been investigated, but which are unified with existing encoded scripts.
Scripts which have been investigated and rejected as unsuitable for encoding.
Known scripts, with enough information, but insufficient reason to provide pre-allocation.
Known scripts, but with insufficient information to do a decent job of rough pre-allocation, and/or insufficient information to know whether a pre-allocation is warranted.
Things rumored to be scripts, but not clearly enough attested for us to even determine whether they are "known scripts".
http://unicode.org
Some prose may have been lifted verbatim from unicode.org,
as is permitted by their terms of use at http://www.unicode.org/copyright.html