The Unicode standard supplies 24 different variants on the hyphen or dash.
Because of its prevalence in legacy encodings, U+002D - hyphen-minus is the most common of the dash characters used to represent a hyphen. It has ambiguous semantic value and is rendered with an average width.
U+2010 ‐ hyphen represents the hyphen as found in words such as left-to-right. It is rendered with a narrow width.
U+2011 ‑ non-breaking hyphen is present for compatibility with existing standards. It has the same semantic value as U+2010 hyphen, but a line must not be broken at it.
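As a minimal sketch of the distinction above, assuming Python and a hypothetical helper named `keep_together` (not part of any standard), a compound word can be protected from line breaks by substituting U+2011 for each ordinary hyphen:

```python
NON_BREAKING_HYPHEN = "\u2011"  # U+2011 non-breaking hyphen

def keep_together(word: str) -> str:
    """Replace U+002D and U+2010 with U+2011 so the word is not split across lines.

    Hypothetical helper for illustration only.
    """
    return (
        word.replace("-", NON_BREAKING_HYPHEN)
            .replace("\u2010", NON_BREAKING_HYPHEN)
    )

print(keep_together("left-to-right"))
```

Whether the result actually stays on one line still depends on the rendering software honoring the line-breaking property of U+2011.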
U+2012 ‒ figure dash also exists for compatibility. It has the same ambiguous semantics as U+002D hyphen-minus, but has the same width as a digit (much as U+2007 figure space relates to the ordinary space).
U+2013 – en dash is used to indicate a range of values, such as 1973–1984. It should be distinguished from U+2212 − minus sign, which is an arithmetic operator; however, typographers have typically used en dash for typesetting the minus sign.
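The difference between the range separator and the arithmetic operator can be made explicit in code. The following is only a sketch, assuming Python; the function names `format_range` and `format_signed` are hypothetical:

```python
EN_DASH = "\u2013"     # U+2013 en dash, used for ranges of values
MINUS_SIGN = "\u2212"  # U+2212 minus sign, the arithmetic operator

def format_range(start: int, end: int) -> str:
    """Join two values with an en dash, as in a range of years."""
    return f"{start}{EN_DASH}{end}"

def format_signed(value: int) -> str:
    """Render an integer, using U+2212 rather than U+002D for negatives."""
    return f"{MINUS_SIGN}{abs(value)}" if value < 0 else str(value)

print(format_range(1973, 1984))
print(format_signed(-5))
```

Keeping the two characters separate in the data leaves the choice of glyph to the font, even where a typographer would draw both the same way.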
U+2014 — em dash is used to make a break—like this—in the flow of a sentence. It is commonly approximated with a double hyphen.
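Text that uses the double-hyphen approximation can be upgraded mechanically. A minimal sketch, assuming Python and a hypothetical function name; it deliberately leaves longer runs of hyphens (such as a `---` rule) and single hyphens untouched:

```python
import re

EM_DASH = "\u2014"  # U+2014 em dash

def double_hyphen_to_em_dash(text: str) -> str:
    """Replace each run of exactly two U+002D characters with one em dash."""
    # The lookbehind and lookahead reject pairs that are part of a longer run.
    return re.sub(r"(?<!-)--(?!-)", EM_DASH, text)

print(double_hyphen_to_em_dash("a break--like this--in the flow"))
```

This is a heuristic: a double hyphen can also be a command-line flag prefix or a comment marker, so it should only be applied to running prose.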
U+2015 ― horizontal bar is used to introduce quoted text in some typographic styles.
For a description of the line-breaking properties of dashes and hyphens, see Unicode Technical Report #14 Line Breaking Properties.
Note that tilde is sometimes called swing dash, and horizontal bar is sometimes called quotation dash.
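Because so many characters can stand in for a hyphen, searching or comparing text often requires folding them to a single form first. The sketch below, assuming Python, maps the dash characters described above to U+002D; the set of characters folded and the name `fold_dashes` are choices made for this illustration, not a rule from the standard:

```python
# Dash-like code points described above, all folded to U+002D hyphen-minus.
DASH_LIKE = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"
TO_HYPHEN_MINUS = str.maketrans({ch: "-" for ch in DASH_LIKE})

def fold_dashes(text: str) -> str:
    """Return text with the listed dash characters replaced by U+002D."""
    return text.translate(TO_HYPHEN_MINUS)

print(fold_dashes("1973\u20131984"))
print(fold_dashes("x \u2212 y"))
```

The folding is lossy: it discards exactly the semantic distinctions the separate characters exist to record, so it suits matching and indexing rather than storage.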
The 24 characters in this category were added between Unicode versions 1.1 and 3.2.
The columns below should be interpreted as:
- The Unicode code for the character
- The character in question
- The Unicode name for the character
- The Unicode General Category for the character
- The Unicode version when this character was added
- The HTML entity if any, instead of using &#xUUUU;
- (The SGML entities if any)
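Most of these columns can be looked up programmatically. The following sketch assumes Python, whose standard `unicodedata` and `html.entities` modules expose the name, the General Category and the HTML 4 entity names; `describe` is a hypothetical helper. The version in which a character was added and the SGML entities are not available from these modules, and the values reflect the Unicode version bundled with the interpreter, so they may differ from the table for characters reclassified in later versions.

```python
import unicodedata
from html.entities import codepoint2name  # HTML 4 entity names keyed by code point

def describe(code_point: int) -> tuple:
    """Return (code, character, name, General Category, HTML entity name or '')."""
    ch = chr(code_point)
    return (
        f"U+{code_point:04X}",
        ch,
        unicodedata.name(ch, "<unnamed>"),
        unicodedata.category(ch),
        codepoint2name.get(code_point, ""),
    )

for cp in (0x002D, 0x2013, 0x2014, 0x2212):
    print(*describe(cp))
```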
ASCII
U+002D - hyphen-minus Pd 1.1 (‐)
U+007E ~ tilde Sm 1.1
ISO 8859-1 aka Latin-1
U+00AD ­ soft hyphen Pd 1.1 &shy; (&shy;)
miscellaneous phonetic modifiers
U+02D7 ˗ modifier letter minus sign Sk 1.1
punctuation
U+058A ֊ Armenian hyphen Pd 3.0
punctuation
U+1806 ᠆ Mongolian todo soft hyphen Pd 3.0
formatting characters
U+2010 ‐ hyphen Pd 1.1 (‐)
U+2011 ‑ non-breaking hyphen Pd 1.1
U+2012 ‒ figure dash Pd 1.1
U+2013 – en dash Pd 1.1 &ndash; (&ndash;)
U+2014 — em dash Pd 1.1 &mdash; (&mdash;)
general punctuation
U+2027 ‧ hyphenation point Po 1.1
general punctuation
U+2043 ⁃ hyphen bullet Po 1.1 (&hybull;)
OCR
U+2448 ⑈ ocr dash So 1.1
general punctuation
U+2052 ⁒ commercial minus sign Sm 3.2
mathematical operators
U+2212 − minus sign Sm 1.1 &minus; (&minus;)
CJK symbols and punctuation
U+301C 〜 wave dash Pd 1.1
other CJK symbols
U+3030 〰 wavy dash Pd 1.1
Katakana punctuation
U+30A0 ゠ Katakana-Hiragana double hyphen Pd 3.2
glyphs for vertical variants
U+FE31 ︱ presentation form for vertical em dash Pd 1.1
U+FE32 ︲ presentation form for vertical en dash Pd 1.1
small form variants
U+FE58 ﹘ small em dash Pd 1.1
U+FE63 ﹣ small hyphen-minus Pd 1.1
fullwidth ASCII variants
U+FF0D － fullwidth hyphen-minus Pd 1.1
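The General Category column can be cross-checked against a current Unicode database. As a rough sketch, assuming Python and a hypothetical function name, the code below collects every code point that the interpreter's bundled Unicode data classifies as Pd (dash punctuation). The result depends on the Unicode version shipped with the interpreter and will include characters added after the versions listed above.

```python
import sys
import unicodedata

def dash_punctuation() -> list:
    """Return every code point whose General Category is Pd in this interpreter's data."""
    return [
        cp
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Pd"
    ]

pd_points = dash_punctuation()
print(len(pd_points), [f"U+{cp:04X}" for cp in pd_points[:5]])
```

Characters in the table with other categories, such as U+2212 minus sign (Sm) or U+007E tilde (Sm), do not appear in this list, which is why a category test alone does not find every dash-like character.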
http://unicode.org