Usability for Nerds/Software/Internationalization
Software packages often have different versions for different languages, or features for adapting them to regional differences. The following features are often adapted to local settings:
- User interface language: The language for menu items, form fields, feedback, error messages, help text, manual, etc.
- Text direction: Languages using Latin letters have the text from left to right, while Arabic has right-to-left text.
- Character set: Systems using 8-bit characters (e.g. DOS) have different character sets to fit different language-specific letters.
- Date format: This can be Day-Month-Year, Year-Month-Day or Month-Day-Year depending on local customs.
- Currency format.
- Measurement units, e.g. centimeters versus inches.
- Shortcut keys (e.g. Ctrl+P = Print).
- Decimal mark: Some countries use a point '.' to mark the decimals of a number, and some countries use a comma ','. Both are widely used. The thousands separator, if any, is often the opposite of the decimal mark.
- List separator: When a list of numbers is used, the numbers may be separated by a comma, semicolon, space, tab or any other character. The list separator is often a comma if the decimal mark is not a comma.
- Command language: Software systems may have a command line, embedded macros, scripts, user-defined functions, embedded codes for formatting, numbering, cross references, etc. Such systems often use English for these commands, functions, etc., but some systems translate them into the local language.
- Hardware may also be adapted to local standards, e.g. keyboard layout.
The advantage of internationalization and localization is obvious: the software becomes easier to use for people in different countries who speak different languages. However, there are also many problems and disadvantages with internationalization:
- It is quite expensive to develop, verify, maintain and support.
- Instructions and manuals are often poorly translated.
- Many technical terms are first defined in e.g. English, and may not have equivalents in other languages, or the translated terms are not commonly used or understood.
- The same person may use multiple computers and multiple software products with different language settings. This often causes confusion, frustration and errors.
- The world is globalized. People travel and exchange information, instructions, files and software across borders. Here, compatibility across borders is important.
- Many people use the web for information seeking, information sharing, and increasingly also for data processing. The information sites and services they use may not be internationalized, or people may prefer the English version which is typically better.
- When people have problems using a software program, they often prefer to search the web for solutions rather than to read a clumsy manual or use a mediocre help function. The solutions they find on the web are unlikely to cover national peculiarities.
- Many people study abroad. A teacher may have problems helping a student with different language settings on their computer.
- Data files are often exchanged across borders, or between systems or programs with different language settings. This causes compatibility problems e.g. when numbers in data files are written with different decimal marks and list separators.
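The decimal-mark problem in the last point can be illustrated with a minimal Python sketch. The function `parse_number` and its `decimal_mark` parameter are hypothetical names invented for this illustration, not part of any library. It assumes that only a point and a comma are in play, and that whichever of the two is not the decimal mark is a thousands separator.

```python
def parse_number(text: str, decimal_mark: str) -> float:
    """Parse text using the given decimal mark ('.' or ',').

    Assumption for this sketch: the other of the two characters
    is treated as a thousands separator and simply dropped.
    """
    thousands_sep = "," if decimal_mark == "." else "."
    cleaned = text.replace(thousands_sep, "").replace(decimal_mark, ".")
    return float(cleaned)


field = "1,234"  # the same text, as found in a data file

# Read by a program that expects a point as decimal mark:
as_point_convention = parse_number(field, decimal_mark=".")

# Read by a program that expects a comma as decimal mark:
as_comma_convention = parse_number(field, decimal_mark=",")

print(as_point_convention)  # 1234.0
print(as_comma_convention)  # 1.234
```

The file contents are identical in both cases. Only the setting of the reading program differs, and the two results differ by a factor of one thousand without any error message.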
As the new generation of IT-literate people grows up in a globalized world, they get more and more used to sharing information and software across borders. College and university students are more likely to read textbooks in English than in their national language. The Internet is the number one source of information for many people, and the information they find is quite often in a language different from their own. Technical differences are gradually being weeded out as a consequence of this globalization. For example, the word billion means 10^9 in some countries and 10^12 in other countries. This difference in meaning can no longer be maintained as international news reports often contain the word. In 1974, the British government changed the meaning of billion from 10^12 to 10^9 as a consequence, and we can expect other countries to follow the same path.
For similar reasons, we can expect technical differences in software between countries to be weeded out and the need for internationalization or localization to decrease. Companies with an international focus often have their own standards which avoid national peculiarities in order to facilitate communication across borders. Other users may want to turn off national peculiarities for the sake of compatibility between software packages that have internationalization and other software packages that haven't. Many users prefer to use the English version of a software package, even when a version in their own language is available, because of all the problems mentioned above. Any software that has internationalization and localized settings should therefore have options to turn off national peculiarities.
Based on observed problems with internationalization, we can make a list of suggestions for each type of feature:
- User interface language: This is the most common and most useful form of localization. The program may have an option for switching between different languages.
- Text direction: Obviously, this should always match the language.
- Character set: Use Unicode, possibly UTF-8 or UTF-16. Avoid national character sets.
- Date format: Use the ISO 8601 format: YYYY-MM-DD. This international standard has the advantage that text strings containing dates will be sorted correctly. Programs where users can enter a date should preferably have separate fields for year, month and day to avoid confusion. Programs that can output reports, e.g. spreadsheets, may have options for formatting dates according to user settings.
- Currency format. Programs may have options for user-defined settings.
- Measurement units. SI units should always be preferred. Options for using other units may be useful.
- Shortcut keys. These should be standardized and never localized. Shortcut keys are used mainly by heavy users for whom a particular key combination becomes second nature. Such users become frustrated when they meet a system with different shortcut keys than they are used to. Attempts to translate shortcut keys may clash with previously existing meanings of the same key combination. For example, the key combination Ctrl+S means save in most software programs in most countries, but Ctrl+S means underline in French, Italian and Spanish versions of Microsoft Office.
- Decimal mark. Users may prefer to use a point, even in countries where a comma is the standard, for the sake of compatibility with other software or other countries. The program should have an option for choosing between point and comma. Programs that can output reports for human reading, e.g. spreadsheets, may have options for choosing the decimal mark for each report. Data files intended mainly for machine reading should always use a point.
- List separator. These are used mainly in comma-separated values (CSV) files. These files are human-readable, but are used mostly for exchanging data between different software systems as well as technical instruments. Despite the name, CSV files may contain separators other than a comma. This causes severe compatibility problems, e.g. in Microsoft Office programs that use localized settings and poorly functioning conversion features. Standardized file formats should be preferred and proper conversion features should be implemented where multiple formats are in use.
- Command language. Commands that are used in scripts, embedded commands, functions, etc. should be in English only. It can be very difficult for users to guess what the translation of a command name is. Users are likely to seek help on the Internet and exchange code and instructions across borders. Therefore, translations of commands cause many problems and solve few. Files containing embedded commands or scripts must be compatible across borders.
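Two of the suggestions above, ISO 8601 dates and locale-independent data files, can be sketched in a few lines of Python using only the standard library. The helper names `write_rows` and `read_rows` and the sample data are assumptions made for this illustration. The sketch fixes the list separator to a comma and the decimal mark to a point, regardless of the user's regional settings.

```python
import csv
import io
from datetime import date

# ISO 8601 strings sort chronologically with a plain string sort.
iso_dates = ["2024-03-01", "2023-12-25", "2024-01-15"]
sorted_iso = sorted(iso_dates)

# The same dates written as Day-Month-Year do not.
dmy_dates = ["01-03-2024", "25-12-2023", "15-01-2024"]
sorted_dmy = sorted(dmy_dates)

print(sorted_iso)  # ['2023-12-25', '2024-01-15', '2024-03-01']
print(sorted_dmy)  # ['01-03-2024', '15-01-2024', '25-12-2023']


def write_rows(rows):
    """Write (date, float) rows as CSV text for machine reading.

    The separator is always a comma, dates are ISO 8601, and
    repr() of a float always uses a point as decimal mark.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
    writer.writerow(["date", "value"])
    for day, value in rows:
        writer.writerow([day.isoformat(), repr(value)])
    return buffer.getvalue()


def read_rows(text):
    """Read back CSV text produced by write_rows."""
    reader = csv.reader(io.StringIO(text), delimiter=",")
    next(reader)  # skip the header line
    return [(date.fromisoformat(d), float(v)) for d, v in reader]


rows = [(date(2024, 1, 15), 1234.5), (date(2023, 12, 25), 0.75)]
text = write_rows(rows)
print(text)
print(read_rows(text) == rows)  # True
```

Because neither function consults the regional settings of the computer, a file written in one country is read back unchanged in another. Formatting for human readers, with a local date format or decimal mark, can then be a separate step applied only to reports.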