Last Wednesday I had a Skype conversation with Angie Byron (webchick), Francesco Placella (plach), Gábor Hojtsy and Jose Reyero to discuss how we are going to implement multilingual functionality in the Drupal 8 configuration system. For a little background, the configuration management system will use XML files to store configuration. These files will be loaded into an 'active store' (the database by default, but pluggable), which will act as the source for configuration at all times. More extensive documentation of this is forthcoming, but for the purposes of this summary this should be enough information.
My original thought had been that we would store all the languages in one file. For instance:
<site_name lang="en">I am awesome!</site_name>
<site_name lang="se">Jag är grymt!</site_name>
After some discussion it became apparent that this probably isn't going to work. Configuration usually has plenty of language-independent pieces, and the idea is that if you mess up one of these files, you can restore it from the default set of configuration provided by the module. Since that default will typically be an English-only file (when shipped with a module or theme), restoring it would wipe out the translations stored alongside it, and you would have to re-merge everything all over again. Instead it seems to make sense that every file is language-specific and contains only the configuration information that has actually been translated. So for instance you could have:
site_information.en.xml (full set of configuration, shipped with Drupal)
site_information.se.xml (partially translated configuration, created at install time)
Now say that your language is Swedish (.se). When files are loaded into the active store, the Swedish file will be read first, and the English version will be used for any missing information. My understanding is that this is similar to how t() and the entity/field translation system work today. The resulting information will then be the canonical set of configuration the site runs with. For configuration created by users on the fly (not shipped with modules or Drupal), the original version of the configuration will be the canonical version, and it can be in any language. Fallback will always go to the original version of the configuration, as with entity/field translation.
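To make the fallback concrete, here is a minimal Python sketch of how the two files could be merged into one canonical set. It is only an illustration of the idea, not the planned Drupal implementation: the `<config>` wrapper element, the `site_slogan` and `cache_lifetime` keys, and the function names are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of site_information.en.xml (the full original set).
# Only site_name comes from the example above; the other keys are invented.
ENGLISH_XML = """<config>
  <site_name>I am awesome!</site_name>
  <site_slogan>Hello world</site_slogan>
  <cache_lifetime>300</cache_lifetime>
</config>"""

# Hypothetical contents of site_information.se.xml (partial translation).
SWEDISH_XML = """<config>
  <site_name>Jag är grymt!</site_name>
</config>"""


def parse_config(xml_text):
    """Return a flat {element name: text} dict for the root's children."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


def build_active_config(original_xml, translation_xml=None):
    """Start from the original configuration and overlay translated values.

    Any key missing from the translation keeps its original value, which
    is the fallback behaviour described above.
    """
    merged = parse_config(original_xml)
    if translation_xml is not None:
        merged.update(parse_config(translation_xml))
    return merged


active = build_active_config(ENGLISH_XML, SWEDISH_XML)
```

In this sketch `active` ends up with the Swedish site name, while the untranslated slogan and the language-independent cache lifetime come from the English file.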
Another thing that will need to be implemented is the ability to get strings in an arbitrary language. This will be implemented as an additional parameter in the configuration API when you get data from the config. By default it will use the site's current language (as extracted from the context system); however, you will also have the option to pass in a specific language. In this case the system will read the configuration out of the appropriate file, or, if it doesn't exist there, out of the active store.
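As a rough sketch of what such a lookup could look like, again in Python and purely for illustration: the `ConfigStore` class, its `get()` signature, and the in-memory dictionaries standing in for the files and the active store are all hypothetical, since the real API has not been written yet.

```python
class ConfigStore:
    """Toy stand-in for the configuration API (all names are hypothetical)."""

    def __init__(self, active_store, language_files):
        # Canonical configuration for the site's current language.
        self.active_store = active_store
        # Parsed per-language files, as {language code: {key: value}}.
        self.language_files = language_files

    def get(self, key, language=None):
        """Return a configuration value, optionally in a specific language."""
        if language is None:
            # Default case: serve the site's current language from the
            # active store.
            return self.active_store[key]
        translated = self.language_files.get(language, {})
        if key in translated:
            # The requested language's file has this value.
            return translated[key]
        # Not available in that file: fall back to the active store.
        return self.active_store[key]


store = ConfigStore(
    active_store={"site_name": "Jag är grymt!", "cache_lifetime": "300"},
    language_files={"en": {"site_name": "I am awesome!"}},
)
```

Here `store.get("site_name")` returns the Swedish value from the active store, `store.get("site_name", language="en")` reads the English file, and a request for a language with no file at all falls through to the active store.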
This covers, if not everything we need, at least a good portion and enough to start implementing it into the system.
Over the weekend I also had a discussion with David Strauss about internationalization, and he pointed to
as an example of how Java uses a naming convention to get ever-more specific internationalization options from files. At the same time, Michael Favia found
which describes how Android uses the same system plus potentially additional context information. These could be useful references for building our system.
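To illustrate the general idea of an ever-more-specific naming convention, here is a small Python sketch that lists candidate file names from most specific to least specific. The dot-separated file name pattern, the region suffix, and the `candidate_files()` helper are assumptions for this example; it does not claim to match the exact lookup rules of either Java or Android.

```python
def candidate_files(base_name, language=None, region=None, extension="xml"):
    """List candidate file names, most specific first.

    A loader would try each name in order and use the first file that
    exists, so a regional variant wins over the plain language file,
    which in turn wins over the untranslated default.
    """
    candidates = []
    if language and region:
        candidates.append(f"{base_name}.{language}_{region}.{extension}")
    if language:
        candidates.append(f"{base_name}.{language}.{extension}")
    candidates.append(f"{base_name}.{extension}")
    return candidates


names = candidate_files("site_information", "pt", "BR")
```

For Brazilian Portuguese this would try `site_information.pt_BR.xml`, then `site_information.pt.xml`, then `site_information.xml`.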
While this does cover websites in a single arbitrary language, it does not deal with the issues surrounding websites in multiple languages that can be switched arbitrarily. This still needs to be sorted out, and I am sure we will be having follow-up discussions in the near future.