The Thai script is used to write Thai and other Southeast Asian languages such as Kuy, Lavna and Pali. It is a member of the Indic family of scripts descended from Brahmi. Thai extensions to the Brahmi character set include tone marks derived from superscript digits. The Thai script lacks the conjunct consonants and independent vowels found in most Brahmi-derived scripts. Thai is written left to right.
The Thai layout in Unicode is based on the Thai Industrial Standard 620-2529 and its updated version 620-2533.
In common with Indic scripts, each Thai letter is a consonant possessing an inherent vowel sound. Thai letters further feature inherent tones. The inherent vowel and tone can be modified with vowel signs and tone marks. Most Thai vowel signs are rendered by full-letter-sized inline glyphs placed before, after or around the glyph for the base consonant. When the vowel's glyph is before the consonant, it is encoded as a separate character before the consonant. This differs from most other Indic scripts, which store the vowel sign after the consonant regardless of where it is displayed, but is necessary to comply with the Thai Industrial Standard.
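The visual-order encoding described above can be checked with a short sketch. It uses only Python's standard `unicodedata` module. The sample syllables are chosen here for illustration and do not come from the standard.

```python
import unicodedata

# Thai "เก": SARA E is displayed to the left of KO KAI and is also stored
# first, i.e. the text is encoded in visual order.
thai_syllable = "\u0e40\u0e01"

# Devanagari "कि" for contrast: VOWEL SIGN I is displayed to the left of KA
# but is stored after it, i.e. the text is encoded in logical order.
devanagari_syllable = "\u0915\u093f"

def names(text):
    """Return the Unicode character names of a string, in storage order."""
    return [unicodedata.name(ch) for ch in text]

thai_names = names(thai_syllable)
devanagari_names = names(devanagari_syllable)

for label, result in (("Thai", thai_names), ("Devanagari", devanagari_names)):
    print(label, result)
```

In the Thai string the vowel comes first in memory, while in the Devanagari string the consonant does, even though both vowels appear to the left of their consonant on screen.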
There are several punctuation marks particular to Thai :
U+0E4F ๏ Thai character fongman is the Thai bullet, used to mark items in lists or appearing at the beginning of a verse, sentence, paragraph or other textual segment.
U+0E46 ๆ Thai character maiyamok is used to mark repetition of preceding letters.
U+0E2F ฯ Thai character paiyannoi is used to indicate elision or abbreviation of letters. It is also used as a regular letter, such as in the Thai name for Bangkok. Paiyannoi is also used in combination (U+0E2F U+0E25 U+0E2F) to create a construct called paiyanyai, which means et cetera and is comparable to U+17D8 ៘ Khmer sign beyyal.
U+0E5A ๚ Thai character angkhankhu is used to mark the end of a long segment of text. It can be followed by U+0E30 ะ Thai character sara a to mark even longer segments of text, such as at the end of a verse in poetry.
U+0E5B ๛ Thai character khomut marks the end of a chapter or document, where it always follows the angkhankhu + sara a combination.
The angkhankhu + sara a combination is closely related to U+17D4 ។ Khmer sign khan and U+17D5 ៕ Khmer sign bariyoosan, which are themselves ultimately related to the Devanagari characters U+0964 । Devanagari danda and U+0965 ॥ Devanagari double danda.
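The multi-character constructs mentioned above (paiyanyai, angkhankhu + sara a, and the closing khomut) are plain sequences of existing code points, not separate characters. A minimal sketch that builds them follows. The constant and variable names are illustrative only.

```python
import unicodedata

PAIYANNOI = "\u0e2f"
LO_LING = "\u0e25"
ANGKHANKHU = "\u0e5a"
SARA_A = "\u0e30"
KHOMUT = "\u0e5b"

# paiyanyai ("et cetera") is spelled paiyannoi + lo ling + paiyannoi.
paiyanyai = PAIYANNOI + LO_LING + PAIYANNOI

# angkhankhu followed by sara a closes a longer segment such as a verse.
end_of_verse = ANGKHANKHU + SARA_A

# khomut follows that combination at the end of a chapter or document.
end_of_document = end_of_verse + KHOMUT

def describe(text):
    """Return the Unicode character names of a string, in storage order."""
    return [unicodedata.name(ch) for ch in text]

print(paiyanyai, describe(paiyanyai))
print(end_of_document, describe(end_of_document))
```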
Thai words are not separated by spaces, but spaces are introduced where Western typography might use a comma or period. To mark a word boundary (e.g. for line breaking), use U+200B zero width space.
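As a rough illustration of marking word boundaries with U+200B, the sketch below joins a list of words that has already been segmented. The segmentation is supplied by hand and the helper names are hypothetical. Finding word boundaries in running Thai text generally needs a dictionary-based segmenter, which is out of scope here.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def mark_boundaries(words):
    """Join pre-segmented words with invisible word-boundary markers."""
    return ZWSP.join(words)

def split_words(text):
    """Recover the words from text marked by mark_boundaries()."""
    return text.split(ZWSP)

# Hand-segmented example: "ภาษา" (language) + "ไทย" (Thai).
words = ["ภาษา", "ไทย"]
marked = mark_boundaries(words)

print(len("".join(words)), len(marked))  # the marked text is one code point longer
print(split_words(marked))
```

Python does not treat U+200B as whitespace, so `str.split()` with no argument leaves the marked text in one piece. The separator has to be passed explicitly, as `split_words` does.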
Unicode's Thai code block reserves the 128 code points from U+0E00 to U+0E7F, of which 87 are currently assigned.
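The block size and the number of assigned code points can be verified with the standard `unicodedata` module. This is a minimal sketch that treats a code point as unassigned when its General Category is `Cn`. The result depends on the Unicode version bundled with the Python interpreter, but the figures quoted above should hold for current versions.

```python
import unicodedata

THAI_BLOCK = range(0x0E00, 0x0E80)  # U+0E00..U+0E7F inclusive

# Unassigned code points report the General Category "Cn".
assigned = [cp for cp in THAI_BLOCK if unicodedata.category(chr(cp)) != "Cn"]

print(f"reserved: {len(THAI_BLOCK)}")
print(f"assigned: {len(assigned)}")
print(f"first assigned: U+{assigned[0]:04X}, last assigned: U+{assigned[-1]:04X}")
```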
All the characters in this code block were added in Unicode 1.1.
Number of characters in each General Category :
Letter, Modifier Lm : 1
Letter, Other Lo : 56
Mark, Non-Spacing Mn : 16
Number, Decimal Digit Nd : 10
Punctuation, Other Po : 3
Symbol, Currency Sc : 1
Number of characters in each Bidirectional Category :
Left To Right L : 70
European Number Terminator ET : 1
Non Spacing Mark NSM : 16
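The General Category and Bidirectional Category tallies above can be reproduced with a short sketch using `collections.Counter` and `unicodedata`. It assumes the interpreter's Unicode database agrees with the figures on this page, which should be the case for current versions.

```python
from collections import Counter
import unicodedata

# Collect the assigned characters of the Thai block (U+0E00..U+0E7F).
assigned = [
    chr(cp)
    for cp in range(0x0E00, 0x0E80)
    if unicodedata.category(chr(cp)) != "Cn"  # "Cn" means unassigned
]

by_category = Counter(unicodedata.category(ch) for ch in assigned)
by_bidi = Counter(unicodedata.bidirectional(ch) for ch in assigned)

for code, count in sorted(by_category.items()):
    print(f"General Category {code}: {count}")
for code, count in sorted(by_bidi.items()):
    print(f"Bidirectional Category {code}: {count}")
```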
The columns below should be interpreted as :
- The Unicode code for the character
- The character in question
- The Unicode name for the character
- The Unicode General Category for the character
- The Unicode Bidirectional Category for the character
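For readers who want to regenerate rows in the five-column layout just described, here is a minimal sketch. The function name `listing_row` is made up for this example. Python reports character names in upper case, so the output differs in letter case from the listing on this page.

```python
import unicodedata

def listing_row(ch):
    """Format one character as: code, character, name, category, bidi class."""
    return " ".join([
        f"U+{ord(ch):04X}",
        ch,
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    ])

ko_kai_row = listing_row("\u0e01")
baht_row = listing_row("\u0e3f")

print(ko_kai_row)
print(baht_row)
```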
If the characters below show up poorly, or not at all, see Unicode Support for possible solutions.
Thai
Based on TIS 620-2533
- U+0E01 ก Thai character ko kai Lo L
- U+0E02 ข Thai character kho khai Lo L
- U+0E03 ฃ Thai character kho khuat Lo L
- U+0E04 ค Thai character kho khwai Lo L
- U+0E05 ฅ Thai character kho khon Lo L
- U+0E06 ฆ Thai character kho rakhang Lo L
- U+0E07 ง Thai character ngo ngu Lo L
- U+0E08 จ Thai character cho chan Lo L
- U+0E09 ฉ Thai character cho ching Lo L
- U+0E0A ช Thai character cho chang Lo L
- U+0E0B ซ Thai character so so Lo L
- U+0E0C ฌ Thai character cho choe Lo L
- U+0E0D ญ Thai character yo ying Lo L
- U+0E0E ฎ Thai character do chada Lo L
- U+0E0F ฏ Thai character to patak Lo L
- U+0E10 ฐ Thai character tho than Lo L
- U+0E11 ฑ Thai character tho nangmontho Lo L
- U+0E12 ฒ Thai character tho phuthao Lo L
- U+0E13 ณ Thai character no nen Lo L
- U+0E14 ด Thai character do dek Lo L
- U+0E15 ต Thai character to tao Lo L
- U+0E16 ถ Thai character tho thung Lo L
- U+0E17 ท Thai character tho thahan Lo L
- U+0E18 ธ Thai character tho thong Lo L
- U+0E19 น Thai character no nu Lo L
- U+0E1A บ Thai character bo baimai Lo L
- U+0E1B ป Thai character po pla Lo L
- U+0E1C ผ Thai character pho phung Lo L
- U+0E1D ฝ Thai character fo fa Lo L
- U+0E1E พ Thai character pho phan Lo L
- U+0E1F ฟ Thai character fo fan Lo L
- U+0E20 ภ Thai character pho samphao Lo L
- U+0E21 ม Thai character mo ma Lo L
- U+0E22 ย Thai character yo yak Lo L
- U+0E23 ร Thai character ro rua Lo L
- U+0E24 ฤ Thai character ru Lo L
- * independent vowel letter used to write Sanskrit
- U+0E25 ล Thai character lo ling Lo L
- U+0E26 ฦ Thai character lu Lo L
- * independent vowel letter used to write Sanskrit
- U+0E27 ว Thai character wo waen Lo L
- U+0E28 ศ Thai character so sala Lo L
- U+0E29 ษ Thai character so rusi Lo L
- U+0E2A ส Thai character so sua Lo L
- U+0E2B ห Thai character ho hip Lo L
- U+0E2C ฬ Thai character lo chula Lo L
- U+0E2D อ Thai character o ang Lo L
- U+0E2E ฮ Thai character ho nokhuk Lo L
- aka ho nok huk
Sign
- U+0E2F ฯ Thai character paiyannoi Lo L
- aka paiyan noi
- * ellipsis, abbreviation
Vowels
- U+0E30 ะ Thai character sara a Lo L
- U+0E31 ั Thai character mai han akat Mn NSM
- U+0E32 า Thai character sara aa Lo L
- ref U+0E45 ๅ Thai character lakkhangyao (Thai)
- U+0E33 ำ Thai character sara am Lo L
- U+0E34 ิ Thai character sara i Mn NSM
- U+0E35 ี Thai character sara ii Mn NSM
- U+0E36 ึ Thai character sara ue Mn NSM
- U+0E37 ื Thai character sara uee Mn NSM
- aka sara uue
- U+0E38 ุ Thai character sara u Mn NSM
- U+0E39 ู Thai character sara uu Mn NSM
- U+0E3A ฺ Thai character phinthu Mn NSM
- * Pali virama
Currency symbol
- U+0E3F ฿ Thai currency symbol baht Sc ET
Vowels
- U+0E40 เ Thai character sara e Lo L
- U+0E41 แ Thai character sara ae Lo L
- U+0E42 โ Thai character sara o Lo L
- U+0E43 ใ Thai character sara ai maimuan Lo L
- aka sara ai mai muan
- U+0E44 ไ Thai character sara ai maimalai Lo L
- aka sara ai mai malai
- U+0E45 ๅ Thai character lakkhangyao Lo L
- aka lakkhang yao
- * special vowel length indication used with 0E24 or 0E26
- ref U+0E32 า Thai character sara aa (Thai)
Sign
- U+0E46 ๆ Thai character maiyamok Lm L
- aka mai yamok
- * repetition
Vowel
- U+0E47 ็ Thai character maitaikhu Mn NSM
- aka mai taikhu
Tone marks
- U+0E48 ่ Thai character mai ek Mn NSM
- U+0E49 ้ Thai character mai tho Mn NSM
- U+0E4A ๊ Thai character mai tri Mn NSM
- U+0E4B ๋ Thai character mai chattawa Mn NSM
Signs
- U+0E4C ์ Thai character thanthakhat Mn NSM
- * cancellation mark
- U+0E4D ํ Thai character nikhahit Mn NSM
- aka nikkhahit
- * final nasal
- U+0E4E ๎ Thai character yamakkan Mn NSM
- U+0E4F ๏ Thai character fongman Po L
- * used as a bullet
- ref U+17D9 ៙ Khmer sign phnaek muan (Khmer)
Digits
- U+0E50 ๐ Thai digit zero Nd L
- U+0E51 ๑ Thai digit one Nd L
- U+0E52 ๒ Thai digit two Nd L
- U+0E53 ๓ Thai digit three Nd L
- U+0E54 ๔ Thai digit four Nd L
- U+0E55 ๕ Thai digit five Nd L
- U+0E56 ๖ Thai digit six Nd L
- U+0E57 ๗ Thai digit seven Nd L
- U+0E58 ๘ Thai digit eight Nd L
- U+0E59 ๙ Thai digit nine Nd L
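Because the Thai digits carry the General Category Nd, Python's built-in `int()` accepts them directly. The sketch below converts in both directions. The helper `to_thai_digits` is hypothetical and handles non-negative integers only.

```python
import unicodedata

THAI_ZERO = 0x0E50  # U+0E50 THAI DIGIT ZERO

def to_thai_digits(n):
    """Render a non-negative integer using Thai digits."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    return "".join(chr(THAI_ZERO + int(d)) for d in str(n))

thai_number = "\u0e52\u0e55\u0e53\u0e53"  # Thai digits two, five, three, three

# int() understands any Unicode decimal digits, including Thai ones.
value = int(thai_number)
round_trip = to_thai_digits(value)

print(value)
print(round_trip)
print(unicodedata.digit("\u0e59"))  # numeric value of THAI DIGIT NINE
```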
Signs
- U+0E5A ๚ Thai character angkhankhu Po L
- * used to mark end of long sections
- * used in combination with 0E30 to mark end of a verse
- U+0E5B ๛ Thai character khomut Po L
- * used to mark end of chapter or document
- ref U+17DA ៚ Khmer sign koomuut (Khmer)
http://unicode.org
Some prose may have been lifted verbatim from unicode.org,
as is permitted by their terms of use at http://www.unicode.org/copyright.html