The Unicode pipeline is always full of new characters, scripts, character properties and algorithms. See also the Unicode Technical Reports for drafts and proposed drafts of upcoming changes.

Many scripts have had a range of character codes tentatively pre-allocated for them, but remember that anything not yet part of a released version of the standard might still change.
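One way to see what a released version actually contains is to ask a Unicode character database whether a code point is assigned. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `is_assigned` is an illustrative assumption, and the answers depend on the Unicode version bundled with the interpreter, not on any roadmap.

```python
import unicodedata

def is_assigned(code_point: int) -> bool:
    """Return True if the code point has a general category other than Cn.

    Cn covers reserved (unassigned) code points and noncharacters.
    Private-use (Co) and surrogate (Cs) code points count as assigned here.
    """
    return unicodedata.category(chr(code_point)) != "Cn"

# The database version shipped with this interpreter.
print("Unicode data version:", unicodedata.unidata_version)

# U+0041 is LATIN CAPITAL LETTER A, U+1C00 is the start of the tentative
# Lepcha range, and U+FFFF is a permanent noncharacter.
for cp in (0x0041, 0x1C00, 0xFFFF):
    print(f"U+{cp:04X} assigned: {is_assigned(cp)}")
```

A range that is only pre-allocated will report as unassigned until the release that encodes it, and a pre-allocation that is later moved or withdrawn never shows up at all.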

Here is the tentative future, as of December 2007.


Scripts which have been formally accepted by the UTC or WG2 for processing toward inclusion in the standard. Many of these will be part of Unicode 5.1.

U+1A20 to U+1AAF   Lanna   (144)
U+1C00 to U+1C4F   Lepcha   (80)
U+1C50 to U+1C7F   Ol Chiki   (48)
U+1C80 to U+1CCF   Meitei Mayek   (80)
U+2DE0 to U+2DFF   Cyrillic Extended-A   (32)
U+A500 to U+A63F   Vai   (320)
U+A640 to U+A69F   Cyrillic Extended-B   (96)
U+A700 to U+A75F   Modifier Tone Letters   (96)
U+A830 to U+A83F   Indic Number Forms   (16)
U+A880 to U+A8DF   Saurashtra   (96)
U+A900 to U+A91F   Hangul Jamo Extended-A   (32)
U+AA00 to U+AA5F   Cham   (96)
U+AA80 to U+AADF   Tai Viet   (96)
U+D7B0 to U+D7FF   Hangul Jamo Extended-B   (80)
U+10190 to U+101CF   Ancient Symbols   (64)
U+101D0 to U+101FF   Phaistos Disc   (48)
U+10280 to U+1029F   Lycian   (32)
U+102A0 to U+102DF   Carian   (64)
U+10840 to U+1085F   Imperial Aramaic   (32)
U+10920 to U+1093F   Lydian   (32)
U+10B00 to U+10B3F   Avestan   (64)
U+10B40 to U+10B5F   Parthian   (32)
U+10B60 to U+10B7F   Inscriptional Pahlavi   (32)
U+13000 to U+133FF   Egyptian Hieroglyphs   (1024)
U+13400 to U+1342F   Egyptian Hier.   (48)
U+17000 to U+180FF   Tangut Ideographs   (4352)
U+18100 to U+1810F   Tangut   (16)
U+1F000 to U+1F02F   Mahjong Tiles   (48)
U+1F030 to U+1F09F   Domino Tiles   (112)
U+2A700 to U+2B6FF   CJK Unified Ideographs Extension C   (4096)
U+2B700 to U+2B77F   CJK Unified Ideographs Ext. C   (128)
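The number in parentheses on each line is the size of the block: the last code point minus the first, plus one. A small sketch of how a line in this format could be parsed and its size checked follows; the regular expression and the function names `parse_block` and `size_matches` are illustrative assumptions, not part of any Unicode tooling.

```python
import re

# Matches lines such as "U+A500 to U+A63F   Vai   (320)".
LINE = re.compile(r"U\+([0-9A-F]{4,6}) to U\+([0-9A-F]{4,6})\s+(.+?)\s+\((\d+)\)$")

def parse_block(line: str):
    """Split one listing line into (start, end, name, listed_size)."""
    m = LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised line: {line!r}")
    start = int(m.group(1), 16)
    end = int(m.group(2), 16)
    return start, end, m.group(3), int(m.group(4))

def size_matches(line: str) -> bool:
    """Check that the listed size equals end - start + 1 (inclusive range)."""
    start, end, _name, listed = parse_block(line)
    return end - start + 1 == listed

print(parse_block("U+A500 to U+A63F   Vai   (320)"))
print(size_matches("U+17000 to U+180FF   Tangut Ideographs   (4352)"))
```

Because the range is inclusive, U+A500 to U+A63F spans 0x140 code points, which is the 320 shown for Vai.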

Formally Proposed and Pre-allocated

Scripts for which proposals have been formally submitted to the UTC or to WG2.

U+0800 to U+083F   Samaritan   (64)
U+1B00 to U+1B3F   Batak   (64)
U+1CD0 to U+1CFF   Vedic Extensions   (48)
U+A8E0 to U+A8FF   Devanagari Extended   (32)
U+A920 to U+A99F   Javanese   (128)
U+AB00 to U+AB3F   Varang Kshiti   (64)
U+AB40 to U+AB6F   Sorang Sompeng   (48)
U+10350 to U+1037F   Old Permic   (48)
U+10980 to U+109DF   Meroitic   (96)
U+10A60 to U+10A7F   Balti   (32)
U+10A80 to U+10A9F   South Arabian   (32)
U+10AC0 to U+10AFF   Manichaean   (64)
U+10B80 to U+10B9F   Psalter Pahlavi   (32)
U+10BA0 to U+10BDF   Book Pahlavi   (64)
U+10C00 to U+10C3F   Old Turkic   (64)
U+10C40 to U+10C6F   Old Hungarian   (48)
U+10E60 to U+10E7F   Rumi Numeral Symbols   (32)
U+11000 to U+1104F   Brahmi   (80)
U+11080 to U+110CF   Kaithi   (80)
U+11100 to U+1113F   Soyombo   (64)
U+11180 to U+111DF   Sharada   (96)
U+12800 to U+12A7F   Anatolian Hieroglyphs   (640)
U+12B00 to U+12EFF   Indus   (1024)
U+13500 to U+146FF   Egyptian Hieroglyphs Extended   (4608)
U+16000 to U+1607F   Tengwar   (128)
U+16080 to U+160FF   Cirth   (128)
U+16200 to U+165FF   Blissymbols   (1024)
U+1B000 to U+1B2FF   Nüshu   (768)


These code blocks are being considered by the Unicode and ISO 10646 committees seriously enough to pre-allocate a range of character codes for each, but no formal proposal has been submitted.

U+0840 to U+085F   Mandaic   (32)
U+0860 to U+089F   Arabic Extended-A   (64)
U+AA60 to U+AA7F   Myanmar Extended-A   (32)
U+AB70 to U+ABCF   Chakma   (96)
U+10500 to U+1053F   Old Udi   (64)
U+10540 to U+1057F   Elbasan   (64)
U+10580 to U+1059F   Büthakukye   (32)
U+105A0 to U+105BF   Iberian   (32)
U+10600 to U+1077F   Linear A   (384)
U+10780 to U+107BF   Cypro-Minoan   (64)
U+10860 to U+1087F   Palmyrene   (32)
U+10880 to U+1089F   Nabataean   (32)
U+108A0 to U+108BF   Numidian   (32)
U+108C0 to U+108DF   Hatran   (32)
U+108E0 to U+108FF   North Arabic   (32)
U+109E0 to U+109FF   Elymaic   (32)
U+10C80 to U+10CDF   Uighur   (96)
U+10D00 to U+10D6F   Byblos   (112)
U+10E00 to U+10E2F   Yezidi   (48)
U+10E80 to U+10EFF   Persian Siyaq Numerals   (128)
U+10F00 to U+10FFF   Arabic Mathematical Alphabetic Symbols   (256)
U+11140 to U+1117F   Ahom   (64)
U+11200 to U+1124F   Tulu   (80)
U+11280 to U+112DF   Turkestani   (96)
U+11300 to U+1137F   Grantha   (128)
U+11380 to U+113DF   Siddham   (96)
U+11400 to U+1143F   Pyu   (64)
U+11480 to U+114DF   Maithili   (96)
U+11500 to U+1155F   Chalukya (Box-Headed)   (96)
U+11580 to U+115DF   Chola   (96)
U+11600 to U+1165F   Satavahana   (96)
U+11680 to U+116DF   Takri   (96)
U+11700 to U+1175F   Landa   (96)
U+11780 to U+117DF   Modi   (96)
U+11800 to U+1185F   Newari   (96)
U+11880 to U+118BF   Leke   (64)
U+12480 to U+127FF   Archaic Cuneiform Extensions   (896)
U+12A80 to U+12DCF   Rongorongo   (848)
U+14700 to U+153FF   Egyptian Hieroglyphs Extended-A   (3328)
U+15400 to U+158FF   Maya Hieroglyphs   (1280)
U+15C00 to U+15FFF   Aztec Pictograms   (1024)
U+16600 to U+166FF   Blissymbol Extensions   (256)
U+16800 to U+169EF   Old Bamum   (496)
U+16A00 to U+16ABF   Mende   (192)
U+16B00 to U+16B2F   Bassa   (48)
U+16B80 to U+16BEF   Woleai   (112)
U+16C00 to U+16C2F   Chinook   (48)
U+16D00 to U+16DFF   Shorthands   (256)
U+16F00 to U+16FFF   Pollard Phonetic   (256)
U+18200 to U+187FF   Jurchen Ideographs   (1536)
U+19000 to U+1917F   Khitan Small Script   (384)
U+19180 to U+1A3FF   Khitan Ideographs   (4736)
U+1A800 to U+1A9FF   Naxi Geba   (512)
U+1AA10 to U+1AFFF   Naxi Tomba   (1520)
U+1C000 to U+1CA7F   Micmac Hieroglyphs   (2688)
U+1CE00 to U+1CFFF   Proto-Elamite   (512)
U+1D800 to U+1DBFF   Sutton SignWriting   (1024)
U+2B800 to U+2F7FF   CJK Unified Ideographs Extension D   (16384)
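Since these ranges are only tentative, entries in different lists can collide. As listed here, the Indus range (U+12B00 to U+12EFF) and the Rongorongo range (U+12A80 to U+12DCF) share code points. A minimal sketch for detecting such collisions between inclusive ranges follows; the function name `overlap` is an illustrative assumption.

```python
def overlap(a, b):
    """Return the shared inclusive (start, end) of two ranges, or None."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None

# Ranges copied from the lists in this document.
indus = (0x12B00, 0x12EFF)
rongorongo = (0x12A80, 0x12DCF)

shared = overlap(indus, rongorongo)
if shared is not None:
    print(f"U+{shared[0]:04X} to U+{shared[1]:04X} is claimed twice")

# Adjacent blocks such as Lepcha and Ol Chiki touch but do not overlap.
print(overlap((0x1C00, 0x1C4F), (0x1C50, 0x1C7F)))
```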

Not Pre-allocated

These scripts, for one reason or another, are not given tentative pre-allocations. Several categories are provided, to indicate the reasons why a script might not be suitable for pre-allocation.

Known scripts which have been investigated, but which are unified with existing encoded scripts.

Scripts which have been investigated and rejected as unsuitable for encoding.

Known scripts, with enough information, but insufficient reason to provide pre-allocation.

Known scripts, but insufficient information to do a decent job of rough pre-allocation, and/or insufficient to know whether a pre-allocation is warranted.

Things rumored to be scripts, but not clearly enough attested for us to even determine whether they are "known scripts".
Some prose may have been lifted verbatim from,
as is permitted by their terms of use at
