Translation in Progress: State of Internationalization
The point of this post is to do a round-up for the community, get on the same page with some of the developers, and find a path forward for Drupal 6 core and beyond, as well as for contributed modules. Many thanks to Jose Reyero and Gabor Hojtsy for helping me make this piece an accurate summary of internationalization’s main points and its direct contributions. Some words are paraphrased, others are directly quoted. You'll also see below how the Drupal group has really helped surface a lot of issues, viewpoints, and case studies.
Basically, the way I see it, things won't all work right until at least 6.0, but interim contrib solutions could probably be built for 5.0 to do what will become much smoother after Drupal 6. That said, Drupal 6 does not necessarily promise to be the end of the internationalization challenge, as the development schedule is relatively short. The four main issues being discussed in the community are language detection and negotiation, simple string/data translation, structured data translation, and localizable variables. These are discussed in more detail in the second half of this post.
First off, the internationalization discussion at groups.drupal.org/i18n is active, largely moderated by Gabor Hojtsy, and involves many different participants and 125 actual group members. This does not include some activity on drupal.org itself by way of issue threads with patches and proposed solutions to various internationalization challenges. Two developers actively participating and maintaining up-to-date contributed modules for internationalization handling are Jose Reyero (i18n module) and Roberto Gerola (localizer module, based somewhat on the i18n module itself). You can read more about Jose, a member of the Development Seed team, over here on the blog.
The Drupal group consists of only 16 posts, but attached to these posts are a total of nearly 200 comments. This group is a great example of the success of groups.drupal.org and all the Drupal modules that go into forming this functionality. It’s also bringing many important internationalization and Drupal issues to the forefront just as Drupal 5.0 is about to reach a final release and people start looking for solutions, and as development for 6.0 gains steam.
Some of the posts are about very specific challenges in internationalization with Drupal, but the comments cover a wide range of issues and perspectives. Gabor Hojtsy is writing his thesis at the Budapest University of Technology and Economics (Hungary) on internationalization and its web application, and as a longtime community member and developer he’s focusing a lot on Drupal, though he’s also looking at other platforms as the challenge naturally requires. The discussion, partly led by Gabor and with invaluable input from many different users and developers, is helping surface the challenges users and developers face when handling internationalization on a Drupal site.
As you can imagine, with nearly 200 comments, following all of the discussion can be pretty difficult. To help myself and my colleagues, I've started gardening my del.icio.us links in order to sort the issues based on the main concerns. This is a work in progress and is not complete, but the start of it can be found at http://del.icio.us/DevelopmentSeed/ian%2Binternationalization.
As I see it, there are four main issues:
Language detection and negotiation
Simple string/data translation
Structured data translation (nodes, menus, taxonomy)
Localizable variables
Gabor has made a chart that visualizes three of these four issues (all but variables).
Language Detection and Negotiation
The language negotiation issue can be divided into two parts: interface language detection/decision, and content language detection/decision. This is Gabor's language, and you can see the breakdown in his chart at http://www.flickr.com/photos/gaborhojtsy/348865193/. Related to language negotiation is path handling, which deals with how to put the language in the URI (like domain.com/en/node/2, domain.com/?q=node/2&locale=en, and en.domain.com/node/2) to record which language a person is viewing the site in. The major contributed modules all allow for some or all of these (and in one case nothing is stored in the URI, which does not sound like the way to go), but there is an argument that this should be handled in Drupal core, along with how interface language and content language relate.
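As a rough illustration of the three URI schemes mentioned above, here is a minimal Python sketch (not Drupal code, and not how any of the contributed modules actually do it). The function name `detect_language`, the set of enabled languages, the order in which the schemes are tried, and the fallback to English are all assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qs

ENABLED_LANGUAGES = {"en", "es", "hu"}  # hypothetical site configuration
DEFAULT_LANGUAGE = "en"                 # hypothetical fallback


def detect_language(url):
    """Return (language, scheme) for a URL, trying each URI scheme in turn."""
    parts = urlsplit(url)

    # 1. Path prefix, as in domain.com/en/node/2
    segments = [s for s in parts.path.split("/") if s]
    if segments and segments[0] in ENABLED_LANGUAGES:
        return segments[0], "path"

    # 2. Query parameter, as in domain.com/?q=node/2&locale=en
    locale = parse_qs(parts.query).get("locale", [None])[0]
    if locale in ENABLED_LANGUAGES:
        return locale, "query"

    # 3. Subdomain, as in en.domain.com/node/2
    subdomain = (parts.hostname or "").split(".")[0]
    if subdomain in ENABLED_LANGUAGES:
        return subdomain, "domain"

    # Nothing in the URI: fall back to the site default
    return DEFAULT_LANGUAGE, "default"
```

The last branch is the "nothing stored in the URI" case: the site has to guess or use a default, and the language choice is lost whenever a link is shared.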
Simple String/Data Translation
By this I mean string handling for user-created ("dynamic") strings. When discussing internationalization and Drupal, there are two types of strings: user-created (often referred to as "dynamic") strings, and static strings. Static strings are those that are created in module code, like the "create content" link in the default navigation menu. Examples of user-created strings are taxonomy terms created by users, or field names created by users, like "sub title", when making custom CCK types. Some contributed modules currently store user-created strings in their own way, while others use the core locale module. It sounds like the conversation is heading toward a general solution, either separate from locale or built by modifying locale to better support user-created strings. There are existing proposals and good analysis on how to solve the problem and why using the locale module itself may not be a good idea.
To quote Gabor directly: "This is a hot topic, since users cannot provide menu items or taxonomy term names in all languages of their site, because Drupal does not have any support for this (even Drupal 5). Additionally the new content type system emphasizes this problem, since it creates the Page and Story content types on install time, so names and descriptions of these types will be English (without further manual editing), regardless of the language used on the site. This results in many users thinking about this problem as a bug (unlike the other three i18n aspects), although this is "just" a missing feature of Drupal."
This is the case with Drupal core. Contributed modules, such as Jose's i18n module set, do allow for taxonomy term translation and some variable translation like the site name. Both Gabor and Jose point out there are weaknesses with handling some structured data in a simple string/data solution. Jose specifically states that although a simple string/data solution could be used for handling things like taxonomy, doing it the way it is done in i18n currently is much more powerful for taxonomy at least. Gabor's chart here illustrates the relation between simple data and structured data. One question now is whether the simple string/data handling and structured data issue both have their answer in a single piece of functionality/module.
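To make the idea of a simple string/data solution for user-created strings a little more concrete, here is a minimal Python sketch. It is purely illustrative: the class `StringTranslations`, its key layout (object type, object id, field, language), and the fallback to the source string are assumptions for this example, not the design of locale, i18n, or any proposal in the group.

```python
class StringTranslations:
    """Translations for user-created strings, keyed by where the string lives."""

    def __init__(self):
        # (object_type, object_id, field, language) -> translated text
        self._store = {}

    def set(self, object_type, object_id, field, language, text):
        self._store[(object_type, object_id, field, language)] = text

    def get(self, object_type, object_id, field, language, source):
        # Fall back to the original, untranslated string when nothing is stored
        key = (object_type, object_id, field, language)
        return self._store.get(key, source)


translations = StringTranslations()
translations.set("taxonomy_term", 12, "name", "es", "Noticias")

spanish = translations.get("taxonomy_term", 12, "name", "es", "News")
hungarian = translations.get("taxonomy_term", 12, "name", "hu", "News")
```

Note what this sketch leaves out: it knows nothing about a term's parent or its vocabulary, which is the weakness Gabor and Jose point to when structured data is squeezed into a plain string lookup.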
Structured Data Translation
Structured data storage involves translating nodes, taxonomies, menus, and perhaps more. One major topic is whether or not to use one node to translate another node. The thread with the most comments on it (32 comments, to date) discusses whether content translations should be stored as nodes or not. The arguments boil down to three points: using nodes is more closely tied to the Drupal way; not using nodes is not really an option for contributed module developers without core patching; and Drupal itself is becoming less reliant on the node system.
Jose explains there is nothing yet that resembles a workable object model for the non-node translation approach, and talks a little more about that at http://groups.drupal.org/node/1790#comment-5134. As for taxonomies and menus, although you could translate terms and menu items with a simple string/data solution, taxonomies and menus have structures like hierarchy and relationships that need to be considered. There are many simple strings in Drupal core and contributed modules that may or may not be better stored as structured data, such as custom CCK field labels, labels on views, profile field names, and more. This is illustrated in Gabor's chart as well as expressed by Jose.
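The "translations stored as nodes" side of that debate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` fields, the shared `translation_set` id, and the helper `find_translation` are hypothetical names for this example and do not describe the actual schema of i18n, localizer, or Drupal core.

```python
from dataclasses import dataclass


@dataclass
class Node:
    nid: int
    language: str
    title: str
    translation_set: int  # hypothetical id shared by all translations of one piece of content


def find_translation(nodes, nid, language):
    """Return the node that translates `nid` into `language`, or None."""
    by_id = {node.nid: node for node in nodes}
    source = by_id.get(nid)
    if source is None:
        return None
    for node in nodes:
        same_set = node.translation_set == source.translation_set
        if same_set and node.language == language:
            return node
    return None


nodes = [
    Node(nid=1, language="en", title="About us", translation_set=100),
    Node(nid=2, language="es", title="Sobre nosotros", translation_set=100),
    Node(nid=3, language="en", title="Contact", translation_set=101),
]
```

Because each translation is a full node here, it gets its own path, comments, and permissions for free, which is part of why this approach feels closer to the Drupal way. The cost is that every translation is a separate object that has to be kept in sync with its siblings.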
Localizable Variables
Localizable variables, though seemingly similar to string translation, differ from strings: they are module independent and carry associated performance hits. Jose says, "While I think in the long term these mechanisms may eventually be implemented in some homogeneous way (cascading variables), I can't see it coming in the near future so we better provide some multilingual variable handling in the meantime." Jose also explains that multilingual variables are not always strings (for example, a site's logo with text in the logo art) and these may need to be translated as well.
Here’s a photoset on Flickr that illustrates some of the main issues in internationalization: http://flickr.com/photos/developmentseed/sets/72157594469340793/.
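One way to picture multilingual variables is a per-language override layered on top of the site-wide value. The Python sketch below is an assumption-laden illustration only: `get_variable`, the two dictionaries, and the sample values are made up for this example and are not Drupal's variable API or the i18n module's implementation.

```python
# Site-wide values (hypothetical sample data)
GLOBAL_VARIABLES = {
    "site_name": "My Site",
    "logo_path": "logo.png",
    "posts_per_page": 10,
}

# Per-language overrides; only the variables that differ need an entry
LANGUAGE_VARIABLES = {
    "es": {"site_name": "Mi Sitio", "logo_path": "logo-es.png"},
}


def get_variable(name, language, default=None):
    """Look up a variable for a language, falling back to the site-wide value."""
    per_language = LANGUAGE_VARIABLES.get(language, {})
    if name in per_language:
        return per_language[name]
    return GLOBAL_VARIABLES.get(name, default)
```

The logo path shows why this is not just string translation: the value is a file reference, not text to be run through a translation lookup. The extra lookup on every variable read also hints at where the performance concern comes from.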
While there are many smaller issues in internationalization, most of them are related to the four issues above. Translating menus, taxonomy, and blocks is directly related to simple and structured data and to how a contributed module, a core module, or a combination of both will assist with this. The lack of core support for string translation has caused contributed modules either to use their own custom approach (which could be part of a wider solution) or to use functionality in core not originally designed for such purposes. The goal for Drupal 6 is to figure out what core can do to make contributed approaches work better while learning from the issues in 4.7 and the challenges in 5.0.
There are many opportunities, purposely not discussed in this article, for improving workflow and usability, which Jose Reyero is working on, particularly with the i18n contributed module package. Core support for the difficult and sprawling issues that currently arise from its absence will allow user interface and workflow niceties to at least get closer to an internationalization that is more "fluent" as we move toward Drupal 6.
This is a cross-post from (http://www.developmentseed.org/blog/)

Thanks for keeping us up to date!
A possible solution for strings translation
Hi.
I've implemented a solution for simple string/data translation, structured data translation, and also for localized variables.
You can read the details here: http://groups.drupal.org/node/1827#comment-5328
It is already implemented in my module, Localizer.
Right now I'm using it for menu and taxonomy translation, and in the near future I will also use it for variable translation.
There are also some screenshots of this system in action for menu and taxonomy here:
http://www.speedtech.it/drupal/localizermodule