Internationalization (usually shortened to i18n) is a clear requirement for any successful game or application. Especially when players are allowed to interact (chat) with each other, it is essential that they can do so in their mother tongue. While this requirement is relatively easy to meet for most Western languages, others, such as the East Asian ones, require input method editors (IMEs) that translate multiple keystrokes into the final glyphs displayed on the screen.
In this post I’ll describe the basics of an IME integration in a game and some of the challenges that it poses. Recently we added full IME support to Coherent UI and most of the pain points were experienced first-hand.
Although most modern font rendering libraries have no issues showing Unicode content, entering it is another story. On most operating systems this relies on IME functionality. The input method detects the currently selected language and, as the user types keys (some input methods support mouse input too), it proposes valid characters for that combination. For instance, the Pinyin input method relies on the user entering the pinyin of a Chinese character to receive a list of matching characters that would otherwise be impossible to type on a standard keyboard.
Some of the components during an IME composition
While the user types characters an “IME composition” is started – that means that the combination might produce different outputs and it is not committed to the text field until the user accepts it (usually by pressing ‘space’). The composition can also be discarded and restarted. In the screenshot above the “Composition string” is not final but shows an intermediate representation (that’s why it’s underlined). Part of the input has already been translated to Chinese characters while some of it remains to be defined.
The possible Chinese characters in our example that match the current Latin characters are given in the “Candidate list”. The user can select one of them either by pressing its number or by clicking it with the mouse. PgUp/PgDn scroll the candidate list to show more options.
Most input methods are fairly complex pieces of software, so implementing an ad-hoc one in your application, although possible, is not something I’d recommend. Usually we rely on the OS to facilitate the use of the currently selected IME.
In this post I’ll assume that we use the Chinese Simplified Pinyin input method. Other languages like Japanese and Korean have some differences.
How to communicate and show the various IME-related data is platform-specific and more or less well documented in the API references. All relevant systems (Windows, Mac OS X, Linux) have default implementations for handling IME input – they can show the composition window, the candidate list and submit the text to the application. If you are creating a standard window with text fields you can safely use the system implementation. The default code works because the stock controls have already been written so that they interact properly with the input method.
What we want to do however is to implement IME in an application like a game that:
- usually renders fullscreen
- does not use any of the default OS input controls
Imagine an OpenGL game where you have to input the player name. You could still try to rely on the OS to draw and show the IME-related windows but the result will be very awkward. First, your IME composition string and candidate list will appear where your window has its origin (the OS has no way of knowing where your text field is on-screen). Second, fullscreen applications, on Windows at least, will struggle to show the IME windows, and an extremely noticeable and unpleasant z-fighting will begin between the application and the IME windows. Finally, while the user is playing the game we don’t want them to start typing IME characters, so the OS should somehow know which key events to ignore.
We need to accomplish a list of tasks to have a good, reliable IME implementation:
- Draw as much of the IME-related information (candidate list, composition window etc.) yourself as possible. This has the added benefit that you can style it in any way you want.
- Notify the OS when the game is in “typing” mode – the user is writing something in a text field.
- Notify the OS when the user has cancelled the current composition by un-focusing the text field without committing the text.
- Accept notifications by the OS when the composition has been committed or cancelled and update the UI accordingly.
- Read the OS hints about the composition string and display them. For instance you could underline the current on-going composition as a hint to the user.
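To make the list above a bit more concrete, here is a minimal, platform-neutral C++ sketch of the glue between the UI and the OS. Everything in it is hypothetical: `ImeSession`, its callbacks and its method names are illustrative and not part of any OS API or of Coherent UI. The two `std::function` members stand in for whatever platform calls you use to enable IME and to cancel a composition.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical glue between the game UI and the platform IME layer.
// None of these names come from a real API; they only illustrate the tasks.
class ImeSession {
public:
    // Supplied by the platform layer (assumed thin wrappers around OS calls).
    std::function<void(bool)> setOsImeEnabled;   // enter or leave "typing" mode
    std::function<void()> cancelOsComposition;   // ask the OS to drop the composition

    // Called by the UI when a text field gains or loses focus.
    void OnTextFieldFocus(bool focused) {
        if (!focused && composing_) {
            // Focus left the field without a commit: discard the composition.
            if (cancelOsComposition) cancelOsComposition();
            composition_.clear();
            composing_ = false;
        }
        typing_ = focused;
        if (setOsImeEnabled) setOsImeEnabled(focused);
        assert(typing_ || !composing_); // never composing outside typing mode
    }

    // Called by the platform layer when the OS reports IME events.
    void OnOsCompositionUpdate(std::string text) {
        if (!typing_) return; // ignore IME input while the user is playing
        composition_ = std::move(text);
        composing_ = true;
    }
    void OnOsCompositionCommit(const std::string& result) {
        if (!typing_) return;
        committed_ += result;
        composition_.clear();
        composing_ = false;
    }
    void OnOsCompositionCancel() {
        composition_.clear();
        composing_ = false;
    }

    const std::string& Committed() const { return committed_; }
    const std::string& Composition() const { return composition_; } // draw underlined
    bool IsComposing() const { return composing_; }

private:
    bool typing_ = false;
    bool composing_ = false;
    std::string committed_;
    std::string composition_;
};
```

The point of the design is that the UI only talks to this object, while the per-platform code only has to translate OS events into the three `OnOsComposition...` calls.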
Drawing the IME-related windows shouldn’t be a problem for any capable UI library. On Windows you must also pay attention when the candidate list changes – when the user pushes PgUp/PgDn during a composition, the list might change and a Windows message is received. Also when the numbers 1-9 are pushed, a candidate is selected and the text committed. On most systems you can also select the candidate by clicking it with the mouse. Users are accustomed to this and you should provide the functionality too.
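As a rough illustration of drawing the list yourself, the sketch below keeps a snapshot of the candidates and rebuilds the visible rows whenever the OS reports a change. The `CandidateSnapshot` type and both functions are hypothetical; the page start and page size fields are loosely modelled on what the Windows `CANDIDATELIST` structure reports, and the hit test assumes a simple vertical layout with fixed-height rows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical copy of the candidate data reported by the OS.
struct CandidateSnapshot {
    std::vector<std::string> candidates; // all candidates, UTF-8 encoded
    std::size_t pageStart = 0;           // index of the first visible candidate
    std::size_t pageSize = 9;            // candidates shown per page (keys 1-9)
};

// Rows to draw for the current page, labelled "1. ...", "2. ..." and so on.
std::vector<std::string> BuildVisibleRows(const CandidateSnapshot& s) {
    std::vector<std::string> rows;
    for (std::size_t i = 0;
         i < s.pageSize && s.pageStart + i < s.candidates.size(); ++i) {
        rows.push_back(std::to_string(i + 1) + ". " + s.candidates[s.pageStart + i]);
    }
    return rows;
}

// Maps a mouse click to a candidate index, or -1 if no row was hit.
// Assumes rows are stacked vertically from listTop, each rowHeight pixels tall,
// and that the caller has already checked the horizontal bounds.
long HitTestCandidate(const CandidateSnapshot& s, int listTop, int rowHeight,
                      int mouseY) {
    assert(rowHeight > 0);
    if (mouseY < listTop) return -1;
    const std::size_t row = static_cast<std::size_t>((mouseY - listTop) / rowHeight);
    const std::size_t index = s.pageStart + row;
    if (row >= s.pageSize || index >= s.candidates.size()) return -1;
    return static_cast<long>(index);
}
```

When the platform layer is told that the candidate list changed, it would refresh the snapshot and call `BuildVisibleRows` again; a click that hits a row would then be forwarded to the OS as a selection.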
Your UI library must support a way to tell you three things:
- Is a text input field in focus now – so that we can enable IME on the OS level and start listening to its events.
- Where is the caret – so that we can position the candidate list under it, as users are accustomed to.
- The user has changed the focus to something without text input capabilities – so that you can tell the OS to cancel the composition.
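The caret requirement, for example, boils down to a small placement function. The C++ sketch below is only one way to do it: `Point`, `Size` and `PlaceCandidateList` are hypothetical names, and flipping the list above the caret near the bottom of the screen is my own assumption about sensible behavior rather than something the OS mandates.

```cpp
#include <algorithm>
#include <cassert>

struct Point { int x; int y; };
struct Size { int width; int height; };

// Places the candidate list directly under the caret. If it would run off the
// bottom of the screen it is flipped above the caret instead (an assumption of
// this sketch), and it is clamped so that it stays fully on-screen.
// caret is the top-left corner of the caret in screen pixels.
Point PlaceCandidateList(Point caret, int caretHeight, Size list, Size screen) {
    assert(list.width <= screen.width && list.height <= screen.height);
    Point p{caret.x, caret.y + caretHeight};
    if (p.y + list.height > screen.height) {
        p.y = caret.y - list.height; // not enough room below: show above
    }
    p.x = std::max(0, std::min(p.x, screen.width - list.width));
    p.y = std::max(0, p.y);
    return p;
}
```

The UI library only needs to report the caret rectangle whenever it moves; the rest is independent of the platform.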
Additionally the library should have some notion of “composition string” – that is a temporary string marked visually in some way, that should be thrown away as soon as the user commits the composition and replaced with the final characters. Upon composition cancel, it should just be deleted.
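A minimal sketch of that notion in C++ could look like the following. `ComposingTextField` is a hypothetical class, not taken from any library, and for simplicity it measures the caret and the underline range in bytes of a UTF-8 string; a real text field would work in characters or glyph clusters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical text field that keeps the temporary composition string
// separate from the committed text.
class ComposingTextField {
public:
    // Replaces the temporary string with the latest one from the IME.
    void SetComposition(const std::string& s) { composition_ = s; }

    // Throws the composition away and inserts the final characters at the caret.
    void CommitComposition(const std::string& finalText) {
        assert(caret_ <= text_.size());
        text_.insert(caret_, finalText);
        caret_ += finalText.size();
        composition_.clear();
    }

    // On cancel the temporary string is simply deleted.
    void CancelComposition() { composition_.clear(); }

    // Text to render: committed text with the composition spliced in at the caret.
    std::string DisplayText() const {
        std::string out = text_;
        out.insert(caret_, composition_);
        return out;
    }

    // Half-open byte range of DisplayText() that should be drawn underlined.
    std::size_t UnderlineBegin() const { return caret_; }
    std::size_t UnderlineEnd() const { return caret_ + composition_.size(); }

    const std::string& Text() const { return text_; }

private:
    std::string text_;        // committed content
    std::string composition_; // temporary, never saved or submitted
    std::size_t caret_ = 0;   // byte offset into text_
};
```

Keeping the composition out of `text_` means that a cancel never has to undo anything, and code that reads the field's value never sees half-typed input.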
These requirements appear simple enough but might require substantial coding effort. For reference, see the implementation of the IME(CustomUI) sample in the DirectX SDK – just their CDXUTIMEEditBox class stands at 1000 lines of code, almost all of it IME-related.
Alas, the OS interaction can also be somewhat tricky, especially if you need to support all desktop platforms. The way things are done on Windows is radically different from Linux and Mac OS X. Although the verbosity of the Windows API in the IME-related stuff might be off-putting in the beginning, it is by far the most sane.
In conclusion good IME support is mostly a matter of how flexible your UI library is. The OS plumbing is tricky but manageable. The UI-related requirements however can become quite difficult for simple libraries.