Developing Translation Services for a Global Audience


Building a Content Management System (CMS) with full translation support is challenging but also rewarding, because you know your development work will reach the largest possible audience: the entire world. At Level 11, we’ve had the opportunity to build full end-to-end solutions for challenging translation service applications.

Many CMSs offer Internationalization and Localization systems with their related file standards, but we have found that few do everything well. In this article, we will show you what to look for so your system can integrate well with a Translation Service Provider.

We chose Django as the web framework for our CMS because it provides many powerful features, has a long history of open source support, and has excellent community resources available if you need help.

(Please note: we did not use Django CMS; we used vanilla Django 2.x.)

This diagram represents a fully deployed CMS instance developed by Level 11 and outlines the level of complexity of an end-to-end integrated CMS and translation management service system. For this discussion, we will focus on the translation management system.

[Figure: fully deployed CMS instance with integrated translation management service]

What do i18n, l10n and g11n stand for? 

These are the abbreviations for Internationalization (i18n), Localization (l10n), and Globalization (g11n), respectively. The letters are the first and last letters of the word, and the number is the count of characters between them; for example, internationalization has 18 characters between its first and last letters.
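The abbreviation rule is simple enough to check in a few lines of Python. This is only an illustrative sketch; the `numeronym` helper is a name made up for this example and is not part of any library.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + count of inner letters + last letter."""
    if len(word) < 4:
        # Too short to abbreviate meaningfully; return it unchanged.
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}".lower()


for term in ("Internationalization", "Localization", "Globalization"):
    print(term, "->", numeronym(term))
```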

“Internationalization is the design and development of a product, application or document content that enables easy localization for target audiences that vary in culture, region, or language. Localization refers to the adaptation of a product, application or document content to meet the language, cultural and other requirements of a specific target market.” -W3C

…and Globalization is the combined process of both internationalization and localization.

We found g11n to be more than just the technical specifics of making the system work in different languages; it is a broad-based development of solutions for meeting the needs of the whole world. Another important step when working with companies that provide translation services is to perform a rigorous review of the technical steps needed for integration. Each provider has different processes and requirements, and there are nuances you will want to uncover before deploying to a production environment.

Let’s focus on the translation system

[Figure: the translation system]

A translation system expands your system in multiple ways:

  • Translates the administrative user interface (UI) of the CMS into all selected languages (if needed);

  • Translates the content of blogs, webpages, app pages, and text in images into all of the chosen languages;

  • Helps handle left-to-right text, right-to-left text, and other reading-direction concerns.
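As a rough illustration of the reading-direction concern, the sketch below maps a language code to the value you might put in an HTML `dir` attribute. The `RTL_LANGUAGES` set and the `text_direction` helper are assumptions made for this example, and the language list is deliberately not exhaustive.

```python
# Assumption: a small, non-exhaustive set of right-to-left language codes.
RTL_LANGUAGES = {"ar", "fa", "he", "ur"}


def text_direction(language_code: str) -> str:
    """Return 'rtl' or 'ltr' for a language code such as 'ar', 'he_IL', or 'en-US'."""
    # Keep only the primary language subtag ('he' from 'he_IL').
    primary = language_code.replace("_", "-").split("-")[0].lower()
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


for code in ("en-US", "ar", "he_IL"):
    print(code, "->", text_direction(code))
```

In a Django project you would normally not maintain such a list yourself; `django.utils.translation.get_language_bidi()` reports whether the active language is right-to-left.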

What does our Globalization pipeline look like?

Consider what the content-authoring UI looks like in the system you use most. It is important to consider which parts of the system are static and which are dynamic. Static parts include descriptive or helper text that is hard-coded in the UI to help the users of the system, such as the label on the Title field for adding a title to a blog post. The dynamic parts are what the blog post author enters into the input fields. Both the static and dynamic parts are potential targets for translation services, but for different reasons.

The static parts, like the labels in the UI for Title, Description, Content, etc., need to be translated if the users who enter content work in multiple languages.

The dynamic parts, such as the title and all other content input, need to be translated if the audience for this content includes consumers who speak different languages.

For both of these globalization goals, you can use i18n techniques. In Django, we used two different applications to form our i18n/l10n pipeline: Rosetta and Vinaigrette. Rosetta offers the framework and UI for Globalization of the admin side of the system. Vinaigrette expands that to allow the content to be translated as well.
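To make the setup more concrete, here is a minimal sketch of a Django settings fragment that enables both applications. It is an illustration under assumptions, not our production configuration: the language list, the `locale` directory, and the way `BASE_DIR` is derived are placeholders.

```python
# settings.py (fragment) -- an illustrative sketch, not a complete settings file.
from pathlib import Path

# Assumption: real projects usually derive BASE_DIR from __file__ instead.
BASE_DIR = Path(".").resolve()

INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "rosetta",      # django-rosetta: web UI for editing translation catalogs
    "vinaigrette",  # django-vinaigrette: translation of model field content
]

USE_I18N = True
LANGUAGE_CODE = "en"

# Assumption: example target languages only.
LANGUAGES = [("en", "English"), ("es", "Spanish"), ("ja", "Japanese")]

# Directory where the *.po and *.mo catalogs are stored.
LOCALE_PATHS = [str(BASE_DIR / "locale")]
```

Beyond settings, Rosetta also needs its `rosetta.urls` included in the project URL configuration, and Vinaigrette expects each translatable model field to be registered, typically with a call such as `vinaigrette.register(Model, ["field_name"])`. Check each package's documentation for the version you install.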

The i18n process transforms each hardcoded static label into a variable that is marked to be translated. Then, l10n provides the utilities to view what is to be translated, and localize it by finishing the translation in the destination language.
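The two halves of that process can be sketched with Python's standard `gettext` module. The `DictTranslations` class below is a hypothetical in-memory stand-in for a compiled catalog, used only so the example runs without any files; in Django, the marking step is usually done with `gettext_lazy` from `django.utils.translation`.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog standing in for a compiled *.mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = dict(messages)

    def gettext(self, message):
        # Fall back to the source string when no translation exists yet.
        return self._messages.get(message, message)


# l10n: a translator has supplied Spanish strings for two of the labels.
spanish = DictTranslations({"Title": "Título", "Description": "Descripción"})

# i18n: each label is wrapped in _() instead of being used as a bare literal.
_ = spanish.gettext
form_labels = [_("Title"), _("Description"), _("Content")]
print(form_labels)
```

The untranslated "Content" label passes through unchanged, which is the usual gettext behavior: anything not yet localized falls back to the source language.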

Each web framework or CMS has its own libraries, but one universal takeaway is that most translation systems use standard file formats to facilitate the translation process.

Choosing a standard file format

One of the leading formats is the XML-based *.xliff. We chose the *.po and *.mo file formats instead because of their compatibility with our systems and their ubiquitous support in the translation service industry; together with *.xliff, they are among the formats that translation service providers support most widely.
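For readers who have not seen one, a *.po file is plain text made of `msgid` (source string) and `msgstr` (translation) pairs, while a *.mo file is its compiled binary counterpart. The sketch below writes a minimal catalog by hand to show the shape of the format. The `po_escape` and `render_po` helpers are hypothetical names for this example; in practice, tools such as Django's `makemessages` and `compilemessages` commands generate and compile these files for you.

```python
from typing import Dict


def po_escape(text: str) -> str:
    """Escape backslashes, double quotes, and newlines for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_po(entries: Dict[str, str], language: str) -> str:
    """Render source -> translation pairs as the text of a minimal *.po file."""
    lines = [
        # The header is an entry with an empty msgid and metadata in msgstr.
        'msgid ""',
        'msgstr ""',
        f'"Language: {language}\\n"',
        '"Content-Type: text/plain; charset=UTF-8\\n"',
        "",
    ]
    for source, translation in entries.items():
        lines.append(f'msgid "{po_escape(source)}"')
        lines.append(f'msgstr "{po_escape(translation)}"')
        lines.append("")
    return "\n".join(lines)


po_text = render_po({"Title": "Título", "Description": ""}, language="es")
print(po_text)
```

An empty `msgstr`, as for "Description" here, marks a string that still needs translating.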

Some companies will translate any file type that you throw at them, so long as you coordinate how you’d like the files handled. For example, HTML files could be translated between certain tags, or image files could be translated through their text layers, although this approach is often offered at a premium cost. If your content can use *.po, *.mo, *.xliff, or another standard translation format, a more fluid digital workflow can be established.

With a blog that requires only one language, you can perhaps publish something and be done with it, with no need for translation services. But when you work with a translation service provider, you need to be intentional and precise about your publishing and moderation workflows because each translation iteration comes at a cost.

Automated translation systems are available, but when the content is very sensitive or public, you don’t want to make avoidable mistakes, such as a mistranslation, that can have adverse impacts on your brand and your customers.

In one case study, the company slogan “Come alive with Pepsi” was showcased on a huge billboard, but unfortunately it did not translate as intended. In Japanese, the automated translation inadvertently rendered the slogan as “Pepsi brings your ancestors back from the dead.” Whoops! This is a classic, real-world example of what can happen without careful moderation (https://www.ve.com/blog/11-brand-slogan-translation-fails).

Working with a team of qualified translation service specialists is highly recommended in order to avoid these high-visibility translation mishaps; you may end up paying more in the long run if you don’t. At Level 11, our workflow (below) makes this process manageable:

[Figure: Level 11 translation workflow]

Most editorial workflows offered by CMSs include role assignments with specified levels of permissions, such as the ability to edit and author content, or to officially publish content. Once an article is “marked for approval,” it enters the to-be-translated queue. Based on the workflow established with the translation service provider, the queue is often given a threshold that triggers a batch upload to the provider.
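The threshold idea can be sketched in plain Python. Everything here is hypothetical (the `TranslationQueue` class, the threshold of three, and the article IDs); it only illustrates how a queue might hand off a batch once enough articles are waiting.

```python
from typing import Callable, List


class TranslationQueue:
    """Collect approved article IDs and upload them in batches."""

    def __init__(self, threshold: int, upload: Callable[[List[str]], None]):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.upload = upload
        self.pending: List[str] = []

    def mark_for_approval(self, article_id: str) -> None:
        self.pending.append(article_id)
        if len(self.pending) >= self.threshold:
            # Swap in an empty list first so the uploaded batch is not mutated later.
            batch, self.pending = self.pending, []
            self.upload(batch)


# Stand-in for the real upload: just record each batch that would be sent.
uploaded_batches: List[List[str]] = []
queue = TranslationQueue(threshold=3, upload=uploaded_batches.append)

for article_id in ["post-1", "post-2", "post-3", "post-4"]:
    queue.mark_for_approval(article_id)

print(uploaded_batches)  # one batch of three articles
print(queue.pending)     # the fourth article waits for the next batch
```

Batching this way keeps the number of paid translation iterations predictable, at the cost of a short delay for articles that arrive just after a batch is sent.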

We use Django Signals to automate the handoff between these steps, and REST APIs on both our end and the translation service provider’s end carry the data. Real people on both ends are available to address any questions or issues that arise; in the absence of issues, however, the process is turnkey.
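As a rough sketch of the REST half of this handoff, the function below builds (but does not send) a batch upload request using only the standard library. The endpoint URL, payload fields, and bearer-token scheme are all assumptions for illustration; a real provider documents its own API. In a Django project, a function like this could be called from a receiver connected to a signal such as `post_save`.

```python
import json
import urllib.request

# Hypothetical endpoint; a real provider documents its own URL and auth scheme.
PROVIDER_URL = "https://translation-provider.example/api/batches"


def build_upload_request(article_ids, target_languages, api_token):
    """Build a POST request describing one batch of articles to translate."""
    payload = {
        "articles": list(article_ids),
        "target_languages": list(target_languages),
    }
    return urllib.request.Request(
        PROVIDER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


request = build_upload_request(["post-1", "post-2"], ["es", "ja"], api_token="example-token")
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```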

At Level 11, we understand the importance and impact of delivering complete Globalization capabilities for new systems and of retrofitting legacy systems to meet this need. We have the technical expertise to build, deploy, and service full-stack translation management systems in Java, Python, and Node-based systems.