uim/Introduction


Uim is an input method framework. It provides a library, a set of tools, and various input methods built on them. Each input method converts strings entered in one character subset into another subset according to the rules it defines.

Typing Hangul (Korean) in Gaim.
Typing Hiragana (Japanese) in Gaim. Uim opens up a menu for user selection of homophones.

A common use for uim is to convert keyboard input of Latin characters (such as those used in English) into Chinese, Japanese, Korean or Vietnamese characters. For Chinese and Japanese, an input sequence that can correspond to multiple character strings opens a list of candidates from which the user selects the desired string.
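
The conversion and candidate selection described above can be illustrated with a minimal Python sketch. It is not uim code: the rule table `ROMAJI_TO_KANA`, the homophone dictionary `CANDIDATES` and both function names are hypothetical, and real input methods use far larger tables and dictionaries.

```python
# Hypothetical rule table: Latin key sequences mapped to hiragana.
ROMAJI_TO_KANA = {"ha": "は", "shi": "し", "ka": "か", "mi": "み"}

# Hypothetical homophone dictionary: one reading, several written forms.
CANDIDATES = {
    "はし": ["橋", "箸", "端"],
    "かみ": ["紙", "神", "髪"],
}


def to_kana(keys):
    """Convert Latin input to hiragana using greedy longest-match rules."""
    longest = max(len(rule) for rule in ROMAJI_TO_KANA)
    out = []
    i = 0
    while i < len(keys):
        # Try the longest possible rule first, then shorter ones.
        for n in range(min(longest, len(keys) - i), 0, -1):
            chunk = keys[i:i + n]
            if chunk in ROMAJI_TO_KANA:
                out.append(ROMAJI_TO_KANA[chunk])
                i += n
                break
        else:
            # No rule matched: pass the character through unchanged.
            out.append(keys[i])
            i += 1
    return "".join(out)


def candidates(reading):
    """Return the candidate list for a reading; the reading itself is last."""
    return CANDIDATES.get(reading, []) + [reading]


reading = to_kana("hashi")
print(reading, candidates(reading))
```

Here the single key sequence `hashi` yields one reading with several written forms, which is the situation in which a candidate list is shown.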

Design

Generally, an input method accepts keyboard input and changes the input string (either automatically or through user interaction) into a different one, which it then passes on to the application. Because applications' input method implementations and the characteristics of character sets differ greatly, any input method framework faces scalability problems when extending its support. To minimize these, uim has a highly modularized design.

The core library interfaces with loadable software modules that implement a particular natural language (e.g. Chinese or Korean) or input method-specific functionalities (e.g. keyboard-map switching). These modules (one or more) form particular input methods (such as the Pinyin input method for Chinese). Uim comes bundled with a number of these, but others are developed as separate projects (e.g. Anthy, Mana and PRIME).

In contrast to the client/server model which other input method frameworks are primarily based on, uim implements a library-level interface with applications through application-specific modules called bridges. This layout was chosen mainly for simplicity, and may be changed in the future.
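
The layering described above (core library, loadable modules, and bridges that call the core in-process) can be sketched in Python as follows. Every class and method name here is hypothetical and chosen only for illustration; none of them corresponds to uim's actual C or Scheme API.

```python
class TableModule:
    """Hypothetical loadable module: converts input with a lookup table."""

    def __init__(self, name, table):
        self.name = name
        self.table = table

    def convert(self, keys):
        # Characters without a rule are passed through unchanged.
        return "".join(self.table.get(ch, ch) for ch in keys)


class Core:
    """Hypothetical core library: a registry of input method modules."""

    def __init__(self):
        self._modules = {}

    def register(self, module):
        self._modules[module.name] = module

    def available(self):
        return sorted(self._modules)

    def convert(self, im_name, keys):
        return self._modules[im_name].convert(keys)


class Bridge:
    """Hypothetical bridge: application-specific glue that calls the core
    directly as a library, with no separate server process in between."""

    def __init__(self, core, im_name):
        self.core = core
        self.im_name = im_name
        self.committed = []  # text handed over to the application

    def key_input(self, keys):
        text = self.core.convert(self.im_name, keys)
        self.committed.append(text)
        return text


core = Core()
core.register(TableModule("direct", {}))
core.register(TableModule("greek", {"a": "α", "b": "β"}))

bridge = Bridge(core, "greek")
print(core.available(), bridge.key_input("abc"))
```

Supporting a new language in this sketch means registering another module, and supporting a new application means writing another bridge; neither change touches the core.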

The core library of uim is primarily written in C to ensure a stable ABI (Application Binary Interface), since maintaining one in C++ is sometimes difficult. Because C is very primitive to work with directly, uim embeds a Scheme interpreter, which allows for more productive development. (Some internal portions of uim, namely the xim and scim bridges, are written in C++.)

Productivity for input method developers, as a means of achieving high-quality input methods, is a high priority for uim. Although uim is stable and fully usable, less effort currently goes into desktop environment usability, which other input method frameworks prioritize.

Supported platforms and applications

Uim is developed on Linux but is portable to other *NIX operating systems; it is also known to run on Linux Zaurus and Mac OS X.

Any application can use uim as long as a bridge exists to connect it to the uim library. X applications can use uim-xim, and console applications can use uim-fep. In addition, GTK+, Qt and Emacs have their own bridges: gtk+-immodule, qt-immodule and uim-el, respectively.

Comparison to other input method software

There are several other input methods available. Most cater to a specific language, but uim, SCIM, and IIIMF (written by the person who made XIM) offer a modular structure for easy extensibility; such software suites are usually called input method frameworks. The main difference between them is that SCIM and IIIMF use a client/server structure, while uim connects to applications directly at the library level via its bridges.

Language-specific input methods

Chinese
The only IM that still seems to be actively developed is Fcitx (Free Chinese Input Toy for X). Others are minichinput (no development since 2003-07-17) and xcin (apparently no development since 10 Feb 2005).
Japanese
Anthy, PRIME, canna and kimera seem to be the main players (all but the last one are available as conversion engines under uim). Lastly, sumibi is worth mentioning for its online implementation of an input method.
Korean
Nabi is widely used. There are also imhangul for GTK+ and qimhangul for Qt apps.
Vietnamese
X-unikey seems to be the only one.

Current conversion engine implementations

Uim comes with a few input methods, but others that are compatible with the Scheme interpreter may be installed separately. The latter have links to their official websites in the list below.

Chinese
  • New Pinyin (Simplified)
  • Pinyin (Unicode)
  • Pinyin (Traditional)
Japanese
  • Anthy: User-customizable dictionary. Optional かな and AZIK keyboard mapping modes.
  • Mana: Uses a hidden Markov model for kanji conversion.
  • PRIME: A predictive input system, offering suggestions for partially typed words based on the user's input history.
  • SKK: No syntax analysis.
  • T-Code
  • TUT-Code
Korean
  • Byeoru: Supports the full Korean character set and Hangul/Hanja conversion.
  • Hangul (2-beol)
  • Hangul (3-beol)
  • Hangul (Romaja)
Vietnamese
  • VIQR
Miscellaneous
  • uim-m17nlib: Supports various languages.
  • International Phonetic Alphabet
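
The idea of predictive input listed above (suggestions for partially typed words, based on input history) can be illustrated with a toy Python sketch. It is a generic frequency-based prefix lookup, not PRIME's actual algorithm, and the class and method names are hypothetical.

```python
from collections import Counter


class HistoryPredictor:
    """Toy predictor: suggests previously entered words for a prefix."""

    def __init__(self):
        self._counts = Counter()  # word -> number of times it was entered

    def learn(self, word):
        """Record one more use of a word in the input history."""
        self._counts[word] += 1

    def suggest(self, prefix, limit=3):
        """Return up to `limit` words starting with `prefix`,
        most frequently used first, ties broken alphabetically."""
        matches = [w for w in self._counts if w.startswith(prefix)]
        matches.sort(key=lambda w: (-self._counts[w], w))
        return matches[:limit]


predictor = HistoryPredictor()
for word in ["input", "index", "input", "inline"]:
    predictor.learn(word)

print(predictor.suggest("in"))
```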

These projects are not part of uim, but may be of interest to its users.

IM indicators

Language binding

Handwriting input pad

  • uim-tomoe-gtk. A GUI front end for the Tomoe handwriting recognition algorithm.

Bridges
