[NGD Studios co-founder Nicolas Lamanna describes the challenges translators and editors face when managing all of their localization tool options, and presents a possible solution, in this #altdevblogaday-reprinted opinion piece.]
Localization: the old, familiar concept that many development teams have to face sooner or later.
Whether you are a small indie developer or a big studio, if you want to reach a bigger audience, your game needs to be translated and localized. Of course, there are games that forego text for a more audiovisual experience (Flower for instance) and that's perfectly fine, but here we'll focus on the ones that do have text, even when it's only a small amount.
For the last ten years, I've been involved in developing localization tools and talking to different translators and editors. When I think of that, a lot of options immediately come to mind: Excel sheets, Word documents, .txt files, several custom applications, and even mail clients.
After a while, and since software development is undergoing a constant evolution, I found myself thinking, "Okay, if those are the only options, surely they can be reduced or managed to some extent. Everybody knows how to edit documents in those familiar applications. It really is not that big of a problem, right?"
Turns out, the problem not only didn't get any better, it has gotten worse. Let's add to that list more custom applications from localization companies, WordPress plugins, Facebook translations (I'm not sure about Google+ but they probably have something, too), resx files, a giant list of custom XML schemas, you name it.
This madness has to stop.
Problematic
First of all, when one talks about localization, the common subject is text files. But anyone who has been involved in that process knows that it actually implies a whole umbrella of elements. Text with Unicode support is a given (even though there is a plethora of issues there), then you have images with text on them, videos, audio files, local legal requirements, and maybe content tailored for a local market -- one culture may think one symbol is cool while others may think it resembles death incarnate.
Sure, you could say, don't complicate things and just stamp an Excel file or a document there, put an incremental number scheme on the file name, and you are set. Whilst that could work for text, when you have images or audio involved, it just doesn't cut it. Every format has its quirks and ways of testing it, and it's not like your QA team speaks five major languages and can corroborate whether the content is correct or not.
So you have all these digital assets, and you send them to translators and localizers, whether it's a big company or contractors. You get them back, you approve them for updating, editing takes place. Voila, somebody forgot about a word here or proofreading there. You have to send the assets back to the translators, but now the deadline has to be pushed back a little. Rinse, repeat.
Of course, there are already several localization initiatives (for instance, i18n, l10n, etc.). However, it's important to have something tailored to our industry.
But let's be a bit more precise and summarize the problems:
- Inconsistency: Most tools don't talk well to each other. They could reside on different platforms which you don't have access to. Conversion must be made at several levels -- lost in translation gets a whole new meaning here.
- Inefficiency: Great overhead in time and resources. In many cases, files need to be synchronized across multiple departments and teams.
- Follow-up nightmare: Text and localization need to be corrected all the time, be it proof-reading, typos, even cultural corrections. Not many tools are friendly enough for translators to be comfortable with them.
- Error prone: All these moving parts lead to a loss of quality in terms of localization. It's usually a part that gets underestimated and ends up with a lesser priority. The impact of a bad translation is not immediately obvious.
- Expensive: A complex problem that needs a complex solution, sounds about right. That sounds expensive, too. Either you hire a third-party company, or you have a full-grown localization team in-house. And if you are really small, well, you probably do it yourself but then the amount of content you can pour is limited since you have to worry about other things too.
- Unit of localization: A Symloc is a single atomic element that can be localized, meaning it can be translated or modified and corresponds to a given language. It may contain the data directly or just a reference (in the case of a video that weighs a gigabyte).
- Metadata: Any Symloc can have important information related to the asset. It can be used as an aid for translators and content editors or it can be used by another program. Examples of these are preview information, notes, the size of an image or text, the length of an audio file, etc.
- Grouping: Symlocs can be grouped together to combine different languages into one asset or to merge related assets.
- Support for every language out there: This one is a no-brainer: Unicode support from the ground up. Since we don't have any size restrictions, no one is left behind.
- Play nice with other standards: There is a lot of research poured into localization and internationalization standards. Those could be used for specific things like currency, date and time, telephone numbers, and any other term that may be culturally different.
- XML as a foundation: XML is flexible enough to be used for any format. Official schemas can then be used to validate a proper localized document.
- A true standard: created as an open initiative, no company should control this standard to avoid conflicting interests.
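
To make the Symloc idea a bit more concrete, here is a minimal Python sketch of how a unit, its metadata, and a group might be modeled and written out as XML. No official schema is given here, so every name in it (the `Symloc` class and its fields, the `symloc`, `data`, `meta`, and `group` element names, the sample keys and file path) is an assumption made up for illustration, not part of any published format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import xml.etree.ElementTree as ET


@dataclass
class Symloc:
    """One atomic localizable element (hypothetical model, not an official schema)."""
    key: str                     # identifier shared by every translation of the asset
    lang: str                    # language tag, e.g. "en" or "ja"
    kind: str = "text"           # "text", "image", "audio", "video", ...
    data: Optional[str] = None   # inline payload, such as the translated string
    ref: Optional[str] = None    # pointer to a large external asset instead of inline data
    metadata: Dict[str, str] = field(default_factory=dict)  # notes, sizes, lengths, ...

    def to_xml(self) -> ET.Element:
        elem = ET.Element("symloc", {"key": self.key, "lang": self.lang, "kind": self.kind})
        if self.ref is not None:
            elem.set("ref", self.ref)
        if self.data is not None:
            ET.SubElement(elem, "data").text = self.data
        for name, value in self.metadata.items():
            ET.SubElement(elem, "meta", {"name": name}).text = value
        return elem


def group(name: str, symlocs: List[Symloc]) -> ET.Element:
    """Bundle several Symlocs (different languages or related assets) into one element."""
    root = ET.Element("group", {"name": name})
    for symloc in symlocs:
        root.append(symloc.to_xml())
    return root


# Two translations of the same string, plus a video stored by reference only.
start_en = Symloc("menu.start", "en", data="Start game",
                  metadata={"note": "Main menu button", "max_chars": "16"})
start_ja = Symloc("menu.start", "ja", data="ゲームスタート")
intro = Symloc("intro.video", "en", kind="video", ref="videos/intro_en.mp4",
               metadata={"length_seconds": "94"})

xml_text = ET.tostring(group("main_menu", [start_en, start_ja, intro]), encoding="unicode")
print(xml_text)
```

In this sketch, metadata is kept as plain name/value pairs so that both a translator's tool and another program could read it, and the video carries only a reference rather than its payload. A real schema would also need validation rules, which this example leaves out.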