International Enabling is the process of making a piece of software behave correctly under different locales, keyboard layouts and languages.

There are many, many aspects to making a piece of software internationally enabled, but I shall keep this list quite simple.

Aspects of International Enabling:
  • Extended Characters

    English can be written entirely in 7-bit ASCII. Most other languages, such as French, German and Spanish, use characters (for example é, ü and ñ) that 7-bit ASCII cannot represent, although they do fit in 8-bit extensions of ASCII such as ISO 8859-1. Therefore, you cannot assume that every character in a string has a value below 128. You must check that your program handles extended characters correctly.
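
    As a minimal sketch of the point above, the following checks whether a string fits in 7-bit ASCII and shows the 8-bit encoding of a French word. The helper names is_seven_bit and encode_latin1 are my own, and ISO 8859-1 (latin-1) is assumed as the 8-bit character set.

```python
def is_seven_bit(text: str) -> bool:
    """Return True if every character has a code below 128 (7-bit ASCII)."""
    return all(ord(ch) < 128 for ch in text)


def encode_latin1(text: str) -> bytes:
    """Encode text as ISO 8859-1, which uses exactly one byte per character."""
    return text.encode("latin-1")


print(is_seven_bit("cafe"))    # True
print(is_seven_bit("café"))    # False: é has the value 0xE9
print(encode_latin1("café"))   # b'caf\xe9'
```

    Code that masks characters with 0x7F, or stores them in a signed 8-bit type and compares them with 0, is the kind of thing this check is meant to flush out.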

  • Locale Settings

    For cultural reasons, different areas use different formats for representing data such as dates and numbers. For example, the number 1.9 in an English locale might be represented as 1,9 elsewhere. Hopefully, the Operating System you are developing under will provide functions for locale conversion of data (since the OS usually has a record of what locale settings to use). If it does not, you might have to write the conversion functions yourself. Note: The syntax of a programming language is unaffected by locale settings. A literal such as double d = 1.9 is written the same way under all locale and language settings.
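
    If you do end up writing the conversion yourself, it might look something like the sketch below. The function names format_decimal and parse_decimal are hypothetical, and the sketch handles only the decimal separator, not digit grouping, dates or currency. Where the OS locale data is available, a standard facility (in Python, the locale module) is the more usual route.

```python
def format_decimal(value: float, decimal_sep: str = ".", places: int = 1) -> str:
    """Format a number for display using the given decimal separator."""
    text = f"{value:.{places}f}"          # the language itself always uses "."
    return text.replace(".", decimal_sep)


def parse_decimal(text: str, decimal_sep: str = ".") -> float:
    """Parse user input that was typed with the given decimal separator."""
    return float(text.replace(decimal_sep, "."))


d = 1.9                                   # the literal is the same in every locale
print(format_decimal(d, decimal_sep=","))      # 1,9
print(parse_decimal("1,9", decimal_sep=","))   # 1.9
```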

  • Multibyte Data

    Multibyte characters are used to store characters from Asian languages such as Japanese or Chinese. The first (lead) byte of a multibyte character has a value above 127 and therefore does not represent a valid character by itself. Combined with a trail byte, it selects a character from the Asian font, which is then displayed. This can cause problems for edit controls that delete one byte at a time when the backspace key is pressed: if a control deletes only the previous byte instead of both bytes of the character, the text becomes corrupted.

    Multibyte characters are usually entered using an IME (Input Method Editor). There is one built into Windows, and XWindows can use XIME, which is very similar to its Windows cousin.

    Lead/Trail Byte Table for Chinese and Japanese

    Language         Code Page    Lead Byte Range    Trail Byte Range
    =============    =========    ===============    ================
    Japanese         CP 932       0x81 - 0x9F        0x40 - 0xFC
                                  0xE0 - 0xFC        (excl. 0x7F)
    Chinese          CP 950       0xA1 - 0xC6        0x40 - 0xFE
    (traditional)                 0xC9 - 0xF9        (excl. 0x7F)
    Chinese          CP 936       0xA1 - 0xA9        0xA1 - 0xFE
    (simplified)                  0xB0 - 0xF7
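
    The backspace problem described above can be sketched as follows, assuming a buffer of CP 932 bytes and the Japanese lead byte ranges from the table. The names is_cp932_lead_byte and backspace are hypothetical. The buffer is scanned from the start because a trail byte can itself fall inside the lead byte range, so looking only at the byte before the cursor is not reliable.

```python
def is_cp932_lead_byte(b: int) -> bool:
    """True if b is in one of the CP 932 lead byte ranges from the table."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def backspace(buf: bytes) -> bytes:
    """Remove the last whole character (one or two bytes) from a CP 932 buffer."""
    i = 0
    last_start = 0                      # offset where the final character begins
    while i < len(buf):
        last_start = i
        if is_cp932_lead_byte(buf[i]) and i + 1 < len(buf):
            i += 2                      # lead byte plus trail byte
        else:
            i += 1                      # single-byte character
    return buf[:last_start]


text = "日本".encode("cp932")           # two characters, four bytes
print(len(text))                        # 4
print(backspace(text).decode("cp932"))  # 日
# Deleting a single byte (text[:-1]) would leave a lead byte with no trail byte.
```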
    
  • Bi-Directional Languages

    Some languages, such as Hebrew and Arabic, mix both reading directions. Hebrew, for example, reads from right to left, but numbers within it are read from left to right. If you are designing a graphical application and want to support bi-directional languages, you will need to bear this in mind.
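
    To illustrate the mixed directions, the sketch below looks up the Unicode bidirectional class of each character using Python's standard unicodedata module. The function base_direction is my own simplified "first strong character" guess at a paragraph's direction. It is only an approximation, and real text layout should be left to the OS or a proper bi-directional text library.

```python
import unicodedata


def base_direction(text: str) -> str:
    """Guess paragraph direction from the first strongly directional character."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in ("R", "AL"):          # Hebrew letters are R, Arabic letters are AL
            return "rtl"
        if cls == "L":                  # Latin letters and other left-to-right scripts
            return "ltr"
    return "ltr"                        # no strong character found: assume left-to-right


hebrew = "\u05e9\u05dc\u05d5\u05dd 123"      # the Hebrew word "shalom", then a number
print(unicodedata.bidirectional(hebrew[0]))  # R  (right-to-left letter)
print(unicodedata.bidirectional("1"))        # EN (European number, laid out left-to-right)
print(base_direction(hebrew))                # rtl
print(base_direction("123 abc"))             # ltr
```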

  • Sort Order

    Different languages sort their characters differently. You will need to bear this in mind if you are implementing any sort functions on text.
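
    As a rough illustration, sorting by raw character value puts accented words after "z", which is wrong for French. The hypothetical key function accent_insensitive_key below ignores accents and case when comparing. This is only an assumption about what one language wants: Swedish, for instance, sorts å, ä and ö after z, so stripping accents would be wrong there. A real application should use the collation functions of the OS or a library.

```python
import unicodedata


def accent_insensitive_key(word: str) -> str:
    """Sort key that ignores accents and case (a crude stand-in for collation)."""
    decomposed = unicodedata.normalize("NFD", word)   # split é into e + combining accent
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


words = ["zebra", "éclair", "apple"]
print(sorted(words))                              # ['apple', 'zebra', 'éclair']
print(sorted(words, key=accent_insensitive_key))  # ['apple', 'éclair', 'zebra']
```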

As I stated above, this list is not exhaustive, but it contains most of the more important aspects of international enabling.
