Translations structure in Drupal.org CVS and/or a web interface

Gábor Hojtsy

So one part of my Google Summer of Code project is coming along nicely. Although I posted a summary to my mentors recently, and we started to discuss some issues, they seem to be overwhelmed with work, so opening up and widening the discussion here seems appropriate. I had a hard time deciding whether to post here or on the relevant issue (http://drupal.org/node/105986 is connected to the topic at hand), but I figured I would post a note there pointing to this post, as this is too long and goes in too many directions to fit into that issue (and I don't want to turn that issue into a monster discussion).

  1. What's in Drupal 6 already?

I worked a lot to get automatic interface translation imports into Drupal 6. Thankfully Yched contributed the batch API, which made it a lot easier to import multiple interface translations in successive HTTP requests. Then Jakub Suchy helped a lot with testing, so we have automatic interface translation import in Drupal core for modules, install profiles and themes. The development version searches for these in the 'po' subdirectory of the module (or install profile or theme) at hand, but Konstantin Käfer suggested that we should call it 'translations', so that will be the final name.

TODO:
- implement an update path (I have good, not yet discussed ideas for this, but they would be off-topic here)
- define a place to put a site-specific PO file, so we can import it when doing a cleanup
- also working on enabling in-place translation in Drupal! see: http://drupal.hu/english/node/18

  2. What needs to be changed with Drupal.org?

A lot. If you read up on the issue (http://drupal.org/node/105986), you will see that having translations separated completely from projects (modules, themes, install profiles) is a good thing, as we can control permissions more granularly, maintain releases with more control, get instant updates on CVS commits, make it possible to file issues against individual translations and so on. As it stands now, we will keep the "language rooted CVS tree", so that the language code will serve as a key for all the translations.

translations/
..de/
..hu/
..it/
..templates/
..tr/

'drupal-pot' should definitely be changed to 'templates', otherwise no one would understand it. So we have a root with templates and translations. What we have there now is a list of files for the Drupal core translation. To move all project translations here, we would need a more granular structure:

translations/
..hu/
....drupal/
....modules/
....themes/
....profiles/
..templates/
....drupal/
....modules/
....themes/
....profiles/

And then you think it is going to be easy, so we can just copy over the core templates to 'templates/drupal' and copy over the core translations to 'hu/drupal'. Well, I don't think so.

'Unfortunately' Drupal 6 has this cool new feature, so it can import smaller files for only the modules you actually use. That is a cool performance improvement as well as a usability one. If you have untranslated strings, you know those are from the modules you use, not some unrelated stuff which was just imported because all the translations were merged together. We can try to flatten that structure, and it is even possible for core files to be sorted out into the complex folder structure of modules and themes with their individual translations, but we would not do that for contributed modules for sure. But those can also offer a module set with subfolders and a hierarchical structure within. So what we need:

translations/
..hu/
....drupal/
......modules/
........aggregator/
..........aggregator-module.po
........system/
..........general.po
....modules/
......ecommerce/
........address/
..........address-module.po
..templates/
....drupal/
......modules/
........aggregator/
..........aggregator-module.pot
........system/
..........general.pot
....modules/
......ecommerce/
........address/
..........address-module.pot

You get the idea. This gets complex! We need to mirror the whole hierarchy of the projects we support translations for, so that we can effectively package the translations to the right places in the output (and also avoid clashing file names). Note that I did not have hu.po files in all the folders there. It would not make sense! I could have multiple files in the same directory, so I need to distinguish by the name of what is being translated not with the language code.
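
The mirroring rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `translation_file`, its parameters, and the "replace dots with dashes" naming rule are inferred from the example tree (aggregator.module becoming aggregator-module.po), not taken from any existing Drupal tool.

```python
from pathlib import PurePosixPath

# Assumed repository root, following the tree sketched above.
ROOT = PurePosixPath("translations")


def translation_file(project_root, source_file, lang=None):
    """Map a source file inside a project to its translation file.

    project_root: "drupal" for core, or e.g. "modules/ecommerce".
    source_file:  path of the translated file relative to the project.
    lang:         language code, or None for the .pot template.
    """
    source = PurePosixPath(source_file)
    # Name the file after what is translated, not after the language:
    # "aggregator.module" -> "aggregator-module".
    stem = source.name.replace(".", "-")
    top, ext = (lang, ".po") if lang else ("templates", ".pot")
    return ROOT / top / project_root / source.parent / (stem + ext)


print(translation_file("drupal", "modules/aggregator/aggregator.module", "hu"))
print(translation_file("modules/ecommerce", "address/address.module"))
```

Because the project's own folder structure is kept, two files with the same name in different submodules cannot clash.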

This is all possible to juggle around in CVS, so we can move the existing files around and still keep their history. The bigger question is how to support the translation workflow later on. There are two fundamental problems: (a) creating translation templates is considered hard, so project developers don't do it; (b) maintaining translations in CVS / with gettext is considered hard.

A: automated translation template generation

We really need to generate translation templates automatically. As far as I can see, it could even be possible to do it in an on-commit hook, but if that is deemed too resource intensive, we can do it with cron very often, maintaining a FIFO list of project branches/tags to generate new templates for. The tricky part here is that we would need to create branches and tags automatically as we discover the need for them, as well as the templates themselves. So we would need a dedicated CVS user for this script, which is allowed to modify files and add/remove branches/tags in the repository. This does not sound easy. (The other alternative is to let the developers do all this themselves, but you can bet they would not; the translation files are very far away from their work in this setting.)
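
A minimal sketch of the cron-driven FIFO list, in Python for brevity. Everything here is hypothetical: the class name `TemplateQueue`, the `generate` callback (which would stand in for the actual template extraction and CVS commit), and the tag names used in the example.

```python
from collections import deque


class TemplateQueue:
    """FIFO list of (project, tag) pairs waiting for template generation."""

    def __init__(self):
        self._pending = deque()
        self._queued = set()  # avoids queueing the same pair twice

    def enqueue(self, project, tag):
        key = (project, tag)
        if key not in self._queued:
            self._queued.add(key)
            self._pending.append(key)

    def run_cron(self, generate, limit=5):
        """Process at most `limit` entries, oldest first, in one cron run."""
        done = []
        while self._pending and len(done) < limit:
            key = self._pending.popleft()
            self._queued.discard(key)
            generate(*key)  # hypothetical: extract strings, commit the .pot
            done.append(key)
        return done


queue = TemplateQueue()
queue.enqueue("views", "DRUPAL-5--1-5")   # tag names are made up
queue.enqueue("cck", "DRUPAL-5")
queue.enqueue("views", "DRUPAL-5--1-5")   # duplicate commit, ignored
queue.run_cron(lambda project, tag: None, limit=1)
```

The `limit` argument is what keeps a single cron run cheap; a commit hook would only need to call `enqueue`.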

B: maintaining a translation is hard

Such is life. It seems that Bruno Massa took the helm and started on what was part of my SoC project (the last point which said 'if time permits'). Look: http://drupal.org/project/lt_server and http://drupal.org/project/live_translation He did not even consider supporting branches and tags yet! There is no support for those! He is also not trying to work with existing stuff, copied and modified the potx module for himself, admitted he has no idea how plural formulas work and so on.

Anyway, if a completely web based approach could work, we would not need any of the above file based trickery, and would still be able to generate files for distribution in the packages (rather than importing them live, which is IMHO not suitable for any serious site, does not work when you are offline and so on). BUT! And there is of course a but! We would need to reimplement branching and tagging as well as permission controls for editing translations in Drupal itself. It would not be possible to check out a copy of the translations from CVS, but templates would still need to be generated, see point A.

Bruno is moving fast because he copies existing code instead of playing nice with it and adapting it to his needs (he even came up with a custom XML distribution format for translations instead of reusing the existing PO parser code), AND he does not deal with template generation, branching, permissions and packaging.

Which road to go?

So where are we really? I had three main objectives for the summer: (a) make Drupal 6 handle translations better, which is shaping up nicely, (b) reorganize the CVS repository and projects to support project management better, (c) look into building a web based translation editor if (a) and (b) are already done. Unfortunately (b) and (c) have conflicting requirements. Right now translators don't deal with project nodes, release nodes and branching/tagging. Point (b) tosses that job on them in the hope that it gives them more features and control. But it also gives them a lot more project management overhead, an even steeper learning curve, and more tools to master. Point (c) on the other hand requires rewriting in Drupal most of the features we wish to get from CVS. So two theoretical options:

Option one: hide CVS tasks from translators

We would need a trusted CVS user again, or need to be able to log the translators in from a web interface through to the CVS repository, so that simple web based options can be provided to branch and tag stuff when needed. Branches and tags should be coordinated in the web app with the core or contributed project in question. We need to code around a lot of CVS options and commands here, so we can revert branches if need be and so on.

Option two: go all web based

Instead of matching CVS branches and tags, we have the project and release nodes to match translations to. If we don't have translations in CVS, we don't need the CVS mangling, but need to relate the actual translations to the nodes on drupal.org. We would throw lots of CVS power out the window, but gain more translators.

Tough questions! Help me by providing tips, ideas, criticism!

Comments

Web vs. CVS

Boris Mann

Instead of matching CVS branches and tags, we have the project and release nodes to match translations to. If we don't have translations in CVS, we don't need the CVS mangling, but need to relate the actual translations to the nodes on drupal.org. We would throw lots of CVS power out the window, but gain more translators.

The last section is the magic part -- power of CVS vs. lots of translators. Although, theoretically, you could periodically "build" translations and check them into CVS / make them available as source docs for "serious" sites.

I think centralized web translation (my idea was to have it at t.drupal.org) is VERY interesting. What if communities of interest could even store phrases, etc.?

Web Translation

brmassa

Gábor,

Some thoughts:

  1. Translations don't need to be totally linked to a tag/branch: mymodule-5-x-1-1 should be an update to mymodule-5-x-1-0. No duplicating, just updating. But it should not be the same as mymodule-4-7-x-1-1; that one gets duplicated.
  2. Live Translation Server only tracks Drupal 5 modules. It's a matter of creating a new column on the original-strings table (drupalversion) to start tracking 4-7 and 6. But it's capable of tracking 5.0 tags. If a module is updated, the server knows it. (It uses the Update Status module structure.)
  3. It already does translation permissions. There are 3 levels: able to define the final translation, able to suggest translations and able only to look. I used OG to structure this.
  4. I believe in web translations (allied to PO file importing) because it's easier for users to collaborate (even if they don't know a thing about Gettext) and update (since the server points out which strings are still untranslated).
  5. I really don't like the CVS approach because it lacks transparency for users and because of its big files/drupal-permissions mess.
  6. I created a custom XML format because:
    • Drupal 5/4.7 are already out there. I'm not thinking exclusively about D6; I need a solution for everyone. If there's a solution that uses the existing structure, it's welcome!
    • I needed a feature in the communication file that PO doesn't have: creation time. OK, it HAS creation time, so see below...
  7. I don't much like a distribution file by default for one reason: updated translations. What if the user downloads a totally incomplete or mistranslated file? How will the system alert him? The live approach guarantees that an updated string on the server is an updated string for every user. This is going to be important when we make the process easier and thousands of translations suddenly appear. (That's why I needed a creation time.)
  8. My Live Translation beta server, hosted on http://drupal.titanatlas.com, tracked 20611 translatable strings in more than 200 modules and themes for D5!!! 20k strings!
  9. The module intentionally considers "Submit" from the Ecommerce module different from "Submit" in CCK, but we could think about treating them all the same (since Drupal itself does): it's a waste of time to translate 100 "Submit"s.
  10. The module aims at the 3 great problems pointed out:
    1. String extraction/template generation
    2. The translation process itself
    3. Distribution/reaching users
  11. The original intention for the LT module is to give all locale developers CVS access to contribute freely, since I don't have much time to dedicate to this.
  12. One last thing: I didn't know you were actually doing this for SoC. I thought you only proposed to mentor/do this.

regards,

massa

branches, updates

Gábor Hojtsy

mymodule-5-x-1-1 should be an update to mymodule-5-x-1-0

Maybe so. CVS allows you (and project maintainers do use this) to tag a revision as Drupal 5 beta without having a Drupal 5 branch. Project module does this, for example. There could also be a Drupal 5.x.2 version for modules as well as a Drupal 5.x.3 and so on. Those can be significantly different. Many modules use this to introduce major architectural changes while keeping the same Drupal core compatibility. Unfortunately it is not possible to do this cleanly without thinking about CVS branches and tags, so it is not a matter of adding more columns, or you could end up adding an infinite number of columns...

I believe in web translations (allied to PO file importing) because it's easier for users to collaborate (even if they don't know a thing about Gettext) and update (since the server points out which strings are still untranslated).

It is not a matter of the format, but a matter of the tool. Millions of people use Microsoft Word, although they don't know how the format works. But they have the tool to use it. You could just as well have copied the PO handling code, while playing nice and helping to refactor the code for Drupal 6 to make better use of it. Now you said you are not interested in refactoring PO handling in Drupal 6, but invented a custom, proprietary format which is used by no one else on the planet.

What if the user downloads a totally incomplete or mistranslated file?

What if the automated update mistranslates my site? You know, if I have a community site, members will not tolerate it if some button text changes overnight and the site becomes unusable. Drupal 5 has an update_status module which lets users know when a module has an update available, so users get the updated translations and/or code. This is being pushed for inclusion in Drupal 6. That would be one possibility to use for packaged translation updates, while maintaining control. Anyway, the bigger problems revolve around creating a translation system which can play nice with the existing projects and takes the project branches and tags into account, even if not using CVS. Then whether we have snapshots of the translations in files or live updates or both is really not the complex part of the problem.

My Live Translation beta server, hosted on http://drupal.titanatlas.com, tracked 20611 translatable strings!!! 20k!

That is not much yet. Drupal core translation templates contain around two thousand strings, so considering how big contrib is, 20k strings is not much.

it's a waste of time to translate 100 "Submit"s.

Well, that would be one of the selling points of a web based translation system. Unfortunately it also complicates things a lot. The Hungarian translation team changed their minds between releases on how "story" is translated, for example. Now if we store translations for different releases separately, that is not a big deal, but if we store strings across releases, we would need to go in and fix all legacy translations which contained "story" but are not present in the new version, so that our older version's translation is updated too. Then a string update on the older site would of course shock users with the terminology changes (automated updates are not that cool here). By separating translations into "branches" (be it in the DB, or otherwise) this can be sidestepped, but new translations need to copy the old values when starting a new branch.
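
The "copy the old values when starting a new branch" idea can be sketched as follows. This is an illustration only: the in-memory `store` dictionary stands in for whatever database schema a real server would use, and the branch names and translation values are placeholders.

```python
def start_branch(store, old_branch, new_branch, new_template):
    """Seed a new branch from an old one.

    Only strings present in the new template are carried over; strings
    new to the template start out untranslated (empty).
    """
    old = store.get(old_branch, {})
    store[new_branch] = {source: old.get(source, "") for source in new_template}
    return store[new_branch]


# Placeholder data: one branch with a translation of "story".
store = {"5.x": {"story": "old term", "Submit": "old submit"}}

start_branch(store, "5.x", "6.x", ["story", "page"])

# A terminology change on the new branch leaves the old branch alone.
store["6.x"]["story"] = "new term"
print(store["5.x"]["story"])
print(store["6.x"])
```

The cost is the duplication of strings per branch; the benefit is that sites running the older release never see the new terminology by surprise.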

11. The original intention for the LT module is to give all locale developers CVS access to contribute freely, since I don't have time.

You don't have time at all, as you accomplished what the company set out for you and you consider the project done, or you have limited time? As I have said, this module suite needs serious rewrites and rethinking. Whether I can pick it up depends on how we define my Summer of Code project direction in light of the above and whether you accept me stepping in with drastic changes.

Live Translation module, currently

brmassa

Gábor,

About tags/branches and updating: I planned this module to be database- and speed-efficient. Cloning the strings is not database-efficient, but if you are OK with this on the real database (hosted by drupal.org), it's up to you.

About XML: I wrote the XML format and forked the POTX and LOCALE modules to reach my objective faster: a functional translation module. PO files, from my point of view, are only the means, not the objective itself. If it works, fine. But it will need exports and imports anyway. Making the module communicate with LT in Gettext format is a plus. If someone is familiar enough to write a patch, I would commit it right away.

About the 20k strings: well, the module automatically scans ALL D5 modules and themes, and the logs report that it scanned all of them. Drupal has about 2100 strings; Ecommerce, CCK and Views have hundreds each, but the rest have only a few strings. 20k is all the strings (assuming no cloning for each tag).

About submits: I'm really afraid of creating a confusing tool when you isolate things too much. Since Drupal doesn't see a difference between core's "Submit" and ecommerce's "Submit", why maintain them as different things to translate? If someone in generic_module translates their "Submit" as "Sbumit" (imagine it's a translation), it will affect all Submits on the Drupal site. It's much harder to create a consistent translation across all modules, and much harder to maintain, when things are separated.

About opening the CVS: consider it done. Although I don't have time to maintain this on a daily basis, I want to discuss some things. I will also answer your email...

Generally speaking: I really don't mind much about the details (PO files, tags/branches). I'm really seeking a tool that is easy to maintain (from the translation team's point of view) and a breeze to use (from the international site maintainer's point of view). The current manual PO importing and inclusion in each module just sucks!

regards,

massa

PS: I'm not sure if you tried Live Translation and LT Server. Go to drupal.titanatlas.com and download LT. It's not perfect and has a lot of flaws, but generally it works fine: translators can help the community easily and users don't need to worry much about anything. Like the Ubuntu Linux spirit: "it just works".

(re)positioning my summer of code project mail

brmassa

Gábor,

You said you don't like to be told to "reload cron.php a dozen times to get translations for your enabled modules". Live Translation updates ALL modules on each cron run, and has a manual update option. So asking users to download a PO file for each enabled module and import them, and running LT for the very first time, have absolutely the same effect. The difference is that the user has a lot more comfort. Users can also avoid the "hourly change" by not selecting the "update strings on each cron" checkbox.

An additional feature I'm planning is to update strings when a new module is installed, so it basically replaces the "import local PO files when installing a new module" feature completely. You don't need to pack translations with modules anymore! If the module was created/updated in November but the Hungarian translation was only done by December, Hungarian users will get the latest translation! Smaller module packages, better user experience.

While I believe the main feature is incremental updates (on demand) and PO exporting is the secondary one (useful in offline cases, which are much rarer), you are more likely to provide a PO file for each module and, as a bonus, an automatic update. If, IF, the user knows that he/she will install an "international" Drupal site offline, he/she can then download the Gettext files. I think it's more logical. Once we adopt the Gettext format as the main communication between LT server and client, creating links to translations from a given module is trivial.

You said in the email that Drupal core is probably the only "module" that will not come with translations. Again, I disagree. Installation is probably the most confusing and delicate moment for new Drupal users. Providing as many translations in the box as possible is vital for a good understanding/experience. drupal-br.org receives a dozen issues every month about installation just because people have no idea what is going on. Other open source projects I have experience with, like vTiger CRM and Ubuntu Linux, adopt this technique. A solution to this is to create an earlier step in the installation that scans a directory in Drupal for PO files, so users can (1) download Drupal and unpack it, (2) download a translation and unpack it, and then (3) start the installation.

regards,

massa

big updates

Gábor Hojtsy

Live Translation updates ALL modules on each cron. ... So asking users to download a PO file of each enabled module and import them and running the LT for the very first time has absolutely the same effect.

Well, if I add five languages to my site and have ten modules enabled, my next cron run will import 50 "translation files" if I understand this right. The cron will simply break in the process, leaving the update half done. Drupal 6 includes batch importing, so you don't hit the PHP timeout, and import smaller PO files in different HTTP requests. It is not even possible to import just one language translation of just the default enabled modules in one go, in one HTTP request without PHP timing out.
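
To make the arithmetic concrete, here is a small Python sketch of splitting the 5 × 10 = 50 imports over successive requests. It is not Drupal's batch API, only the general idea: `build_batch`, `process_request`, the `import_file` callback and the chunk size of 5 are all made up for illustration.

```python
from itertools import product


def build_batch(languages, modules):
    """One import operation per (language, module) pair."""
    return list(product(languages, modules))


def process_request(queue, import_file, per_request=5):
    """Do one HTTP request's worth of imports; return what is left."""
    chunk, rest = queue[:per_request], queue[per_request:]
    for lang, module in chunk:
        import_file(lang, module)  # hypothetical single-file PO import
    return rest


languages = ["de", "hu", "it", "tr", "fr"]
modules = ["module%d" % n for n in range(10)]

queue = build_batch(languages, modules)
imported, requests = [], 0
while queue:
    queue = process_request(queue, lambda l, m: imported.append((l, m)))
    requests += 1

print(len(imported), "files imported in", requests, "requests")
```

Each request stays short, and if one of them fails the remaining queue is still known, so the import is never left half done without a trace.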

We are not asking people to download PO files for each enabled module, we ship PO files with modules. Whenever a user downloads a module, he finds the latest PO file in the package.

Once we adopt Gettext format as the main communication between LT server-client, creating links to translations from a given module is trivial.

I am glad you think so. I did look through your code this weekend, and noticed you manage everything on the project level. Projects like ecommerce or Drupal core have various submodules, which are either enabled or not. Drupal 6 includes good performance improvements to only import translations for the enabled modules. Now your solution did not distinguish between the submodules in the Drupal or ecommerce projects, but imports all strings for all submodules if I understand that right. The internal structure of projects should really be reflected on the translation server, if we need to support module suites like Drupal core and ecommerce (and we do need to).

Installation is probably the most confusing and delicate moment for new Drupal users. Providing as many translations in the box as possible is vital for a good understanding/experience. ...... Other open source projects I have experience with, like vTiger CRM and Ubuntu Linux, adopt this technique.

Yes, this is exactly what I am saying. We should have as much in the box, as possible. Translations packaged into modules! Even if there is a network problem, or drupal.org is down for whatever reason, I should be able to install what I downloaded, not depending on some external server. Yes, Ubuntu linux ships with translation packages too, even including them on the CD.

A solution to this is to create an earlier step in the installation that scans a directory in Drupal for PO files, so users can (1) download Drupal and unpack it, (2) download a translation and unpack it, and then (3) start the installation.

Seems like it is really time now to familiarize yourself with how Drupal 6 works. Grab a tarball (http://drupal.org/node/97368) and see for yourself! It immediately offers to do the installation in some other language, helping you to get the translation package. This is so much "in your face" that I fear native English speakers will not like it... Then the full installation can continue in that language and all the enabled modules get their translations imported automatically in the language used to install. Then if you add a language or enable a module or theme, the corresponding translations get imported automatically. Here is a (now very outdated) video, which illustrates where it started: http://hojtsy.hu/drop/DrupalInstaller.avi But you should really look at Drupal 6-dev, it works a lot nicer.

We worked really hard to make the user experience a lot better. Waiting for external services when you enable a module or add a language, which easily leads to a timeout, or importing huge chunks of strings at once, which breaks in the middle of the process, were exactly the things we worked to avoid.

initial comments

dww

First, sorry I haven't been able to give your posts enough thought and replies yet, it's not due to lack of interest or concern, just lack of time. :(

Next, a disclaimer: I'm an expert in revision control and project management, but not translations. At some level, those of you who directly deal with these problems regularly should decide what's best. I'm just here to offer my advice on the parts of the thing I know something about. ;)

And now, my comments...

1) On the CVS vs. web question, that's really up to those of you who belong to and/or lead the translation teams. Personally, I'd probably try to make the CVS stuff easier for people and keep using that, instead of having to re-implement lots of CVS's functionality and power. But, it sounds like you're leaning against that, and would rather do everything via the web, and figure out how to integrate it with the contrib code in CVS and d.o project/release nodes. If you think that's best, you have my blessings, and I'll do whatever I can to help you solve the web/CVS/release node integration issues. No matter what, I think we need the notion of releases, branches, and tags for translations for each contrib and for core. We don't necessarily have to use the exact same terminology as CVS, but we need that basic functionality. I don't know, but it seems like diff between releases would also be really handy, which you get for free with CVS, but would have to reimplement if you wanted it for the web interface, but that's probably not important enough to spend time on initially -- you can always add it later (or perhaps even use the existing diff.module and hook_diff() for this?).

2) To me, automatic importing of updated translations from a web site seems like a bad idea. I'm with Gábor on this. Once I get a given translation of the interface on the site, I want to consciously decide when/if to change it. Integration with the update_status functionality would be great (see below), but I'd really hate it if the interface might change via cron.php without an admin's intervention.

3) I really do not like the idea of packaging translations with contrib module code. One of the major motivations for my proposal to split contrib translations into their own projects is the following scenario:

  • Contrib author makes changes to their source, adds some new features, introduces new t() strings into their code
  • Contrib author wants to make a new release, so they tag the files in CVS, and create the release node.
  • Translator(s) have no way to update the translations between the time the author changed the t() strings and their translation files get included in the packaged release.

There's no good solution to this problem, except decoupling the translation packages from the code packages. Tagging the module code for a release should mark a fixed set of t() strings that correspond with a (by all means, autogenerated) .pot file representing the interface to translate, but it must not continue to include the translations themselves.

4) I believe that install profiles could be the solution to this packaging dilemma. In this case, module maintainers are free to write their code, fix or extend the t() strings for the English default interface, and tag/release their code at will. No coordination is required with the translators. Then, language teams can translate at their leisure, and when they're ready, they can make official releases of the various contrib translations they care about. Then, each language team can maintain an install profile for core, whatever contribs they've translated, and all the translations for their language, in a single profile. My medium/short-term goal for the packaging scripts (and there's already $$ being raised for this) is so that install profile maintainers will specify exact versions of everything their profile includes, and it will all be packaged into the tarball for a specific release of that profile. So, if the Hungarian team deals with core, views, CCK, eCommerce, panels, update_status, diff, and a few other modules, they could provide a "Hungarian Drupal" install profile that includes the latest stable versions of all of these modules that have up-to-date translations. As a d.o project, this install profile could also have branches in CVS, so it could even track the new feature branches (e.g. Views/Panels 5.x-2.*, etc) on a separate branch.

5) For multi-lingual sites, and/or sites that want to add new languages to existing sites (not re-starting with the installer) or sites that want to use another installer for the basic functionality, then add as many of the languages as possible, we could perhaps introduce the notion of a downloadable "Language Pack" or something for a given language, where it just includes all the translations provided by a given language, not a full install profile.

Or, once a translation team decides it's got a good translation of the fixed interface for a given contrib module/version, it tags that translation (either in CVS or via the web, doesn't matter). Then, the d.o packaging scripts will package up this exact version of the translation as a separate tarball on d.o. Then comes the interesting release node integration. The project_release.module could add a new tab on release nodes, called something like "Translations", or a new download table on the "body" of the release node. In various places in the d.o project UI where you see a "Download" link, there could also be a "Translations" link if there are any translations of that particular release (or even any translations of any releases from the same branch or something), which brings you to this table of translations. So, when you're looking at the release node for views 5.x-1.5, you can see a table of all the languages that have translated that particular version. Maybe there could be links to older versions from the same module/branch if a given language exists for an older release, but hasn't specifically been tagged for 5.x-1.5 yet (e.g. there's a views 5.x-1.4 release for Hungarian, but not yet 5.x-1.5, so in the translations table, there's a section for "Older translations" that points to 5.x-1.4 for Hungarian).
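
The "Older translations" fallback described above can be sketched like this. It is a rough illustration, assuming version strings of the form "5.x-1.4" with purely numeric release parts (no dev or beta suffixes); the function names are hypothetical and not part of project_release.module.

```python
def parse_version(version):
    """Split "5.x-1.4" into a branch key ("5.x", 1) and a minor number 4."""
    core, _, release = version.partition("-")
    major, minor = (int(part) for part in release.split("."))
    return (core, major), minor


def best_translation(installed, translated_releases):
    """Pick the newest translated release on the same branch that is not
    newer than the installed module, or None if there is none."""
    branch, minor = parse_version(installed)
    candidates = [
        version for version in translated_releases
        if parse_version(version)[0] == branch
        and parse_version(version)[1] <= minor
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda version: parse_version(version)[1])


# Hungarian has been tagged for views 5.x-1.4 but not yet for 5.x-1.5.
available = ["5.x-1.3", "5.x-1.4", "5.x-2.0", "4.7.x-1.4"]
print(best_translation("5.x-1.5", available))
```

An exact match would be shown in the main table; anything older that this lookup returns would go into the "Older translations" section.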

6) Using the same technology as the update_status module (the .xml files of history of available releases, etc), we could build some kind of tool you could run on a site (either as a CLI tool, or perhaps part of the web interface itself) that will go out and tell you what languages are available for the particular versions of the contribs you actually have installed. If it was CLI, it could even automatically download them for you. If it was the web, it might make sense as a "Translations" or "Languages" tab on the update status report page, with subtabs for each language or something? This could also handle the case of a new release of a translation of a specific version of a contrib interface (e.g. a bad translation that got released early, then people decide to correct words in the translation, and re-release for the same version of the interface of the contrib). And, of course, it'd nicely handle the case where you're running views 5.x-1.5, the last official Hungarian views release was 5.x-1.4, so you're running that, but then the 5.x-1.5 translation comes out...

7) If you go with the web and not CVS, there's going to be a problem with translating the dev snapshot releases. :( By definition, these are moving targets, the t() strings are going to keep changing, etc. So, it's not clear how the web version is going to handle this, but it's clearly going to need some careful thought, probably all the way down into the schema itself.
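To make the lookup from point 6 concrete, here is a rough sketch of picking the newest translated release on the same branch that is not newer than what the site runs. The `AVAILABLE` table and both function names are made up for illustration, and the real data would have to come from the release history rather than a hard-coded dict. Dev snapshots are deliberately not handled.

```python
from typing import Optional

# Hypothetical data: which releases of a project have a tagged translation,
# keyed by (project, language). A real tool would read this from the
# release history instead of a hard-coded dict.
AVAILABLE = {
    ("views", "hu"): ["5.x-1.3", "5.x-1.4"],
    ("views", "de"): ["5.x-1.5"],
}


def parse_version(version: str) -> tuple[str, tuple[int, ...]]:
    """Split '5.x-1.4' into its branch ('5.x-1') and a sortable number tuple."""
    core, contrib = version.split("-", 1)
    numbers = tuple(int(part) for part in contrib.split("."))
    branch = f"{core}-{numbers[0]}"
    return branch, numbers


def best_translation(project: str, language: str, installed: str) -> Optional[str]:
    """Return the newest translated release on the same branch that is not
    newer than `installed`, or None if no such translation exists."""
    branch, installed_numbers = parse_version(installed)
    candidates = []
    for version in AVAILABLE.get((project, language), []):
        candidate_branch, numbers = parse_version(version)
        if candidate_branch == branch and numbers <= installed_numbers:
            candidates.append((numbers, version))
    return max(candidates)[1] if candidates else None
```

With this sample data, a site running views 5.x-1.5 would be offered the 5.x-1.4 Hungarian translation until a 5.x-1.5 one is tagged.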

I hope that's enough for you to chew on for a while. ;) I'll make an effort to be more available for these questions and discussions in the next few weeks.

Cheers,
-Derek

mostly agree

brmassa's picture

derek,

I mostly agree. Just a few comments about your comments (!):
2* (about automatic update) see item 6
3* I agree about splitting the module file and the translation file. I believe we should deal with translations the same way we deal with the comments of a node: linking the parent to its children. I go further: since it's a superhuman job to deal with translations of EACH module on EACH release, translations should be linked to each module version (5-x, 4-7-x, 4-6-x) or, more precisely, to a branch. ecommerce-5-x-1-1 would be scanned again and its new strings added to the ecommerce-5-x "template", created on ecommerce-5-x-1-0.
5* It would be great to provide links to translations on the project/module page. One problem (IF the translation server is hosted on a subdomain/different server): how to integrate with projects on drupal.org?
6* I created the automatic release to deal with bad translations and partial translations (people usually help with small strings and leave those big "help page" strings blank). I'm developing an interface just like the update_status page, telling which modules have new/updated translations. The updating process, however, won't change: unlike PHP code, translation strings can be received as a "diff" (new or updated strings). So it's basically the same as updating on cron, but admins are now more aware.
7* I'm against translating dev versions too (sorry to those modules that are never released, like project* ;-). One exception: Drupal itself.
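The "diff" idea in point 6 could look roughly like the sketch below. It assumes both sides can be represented as a plain mapping from source string to translation. The function name and the sample strings are hypothetical placeholders, not anything the existing modules provide.

```python
def translation_diff(old: dict[str, str], new: dict[str, str]) -> dict[str, str]:
    """Return only the strings that are new or whose translation changed.

    Empty translations are skipped, so untranslated strings never
    overwrite anything on the receiving site.
    """
    return {
        source: translated
        for source, translated in new.items()
        if translated and old.get(source) != translated
    }


# Placeholder data: what the site has now vs. what the server offers.
site = {"Home": "home-v1", "Log in": "login-v1"}
server = {"Home": "home-v1", "Log in": "login-v2", "Log out": "logout-v1", "Help": ""}

changes = translation_diff(site, server)
```

Here `changes` holds only the updated "Log in" string and the new "Log out" string, so the site would import two strings instead of the whole file.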

regards,

massa

dww's picture

The t() interface might not change at all between foo 5.x-1.3 and 5.x-1.4. That's why the translation download page should "degrade" to presenting the translation of the 5.x-1.3 interface if that's the last available translation version, even if the 5.x-1.4 code is now available. (In fact, something could even automatically notice that the .pot files are the same, and then just re-tag the 5.x-1.3 translation as the official 5.x-1.4 translation, if we wanted.)

However, the t() interface between 5.x-1.3 and 5.x-1.4 might change radically. In that case, it'll be essential for translators and users to be able to distinguish the two.

Therefore, translation versions being simply linked to branches is a terrible idea. We need to be specific, and then make it easy to generalize when possible. If we're general, we'd have no way to be specific when we need it.
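As a rough illustration of the "notice that the .pot files are the same" check, the sketch below compares only the set of source strings in two templates, so differences in file/line comments or ordering do not matter. It is a simplified parser written for this example (no plural or context handling, escapes left as-is), and both function names are hypothetical.

```python
def source_strings(pot_text: str) -> set[str]:
    """Collect the msgid values from POT text (simplified parser)."""
    strings = set()
    current = None  # parts of the msgid being read, or None when outside one
    for line in pot_text.splitlines():
        line = line.strip()
        if line.startswith("msgid "):
            # Drop the keyword and the surrounding double quotes.
            current = [line[len("msgid "):].strip()[1:-1]]
        elif line.startswith('"') and current is not None:
            # Continuation line of a multi-line msgid.
            current.append(line[1:-1])
        else:
            if current is not None:
                strings.add("".join(current))
            current = None
    if current is not None:
        strings.add("".join(current))
    strings.discard("")  # the empty msgid is the PO header entry
    return strings


def same_interface(pot_a: str, pot_b: str) -> bool:
    """True if both templates contain exactly the same source strings."""
    return source_strings(pot_a) == source_strings(pot_b)
```

If `same_interface()` returned true for the 5.x-1.3 and 5.x-1.4 templates, the existing translation could be offered (or re-tagged) for 5.x-1.4 as it stands.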

Re: d.o integration:

If we do go with a web-based interface for doing, storing, and managing the translations, and we host that on a subdomain (translations.drupal.org, whatever), we can overcome the issues about how to integrate that with project and release nodes on d.o itself in numerous ways:
a) The download links already point to a different domain (ftp.osuosl.org).
b) We could potentially set up translations.drupal.org to share some project-related tables with the main drupal.org DB.
c) We could have t.d.o look at the .xml files to find out about release nodes, just like update_status does.
d) We could have t.d.o directly inspect CVS to find out about everything it needs to know.
...

Finally, please understand you're going to get almost no support for automatic translation updates, certainly not from me, not from Gabor, and almost certainly not Gerhard either. Therefore, it's not going to happen. ;) So, save your energy to argue about something else...

integration

Gábor Hojtsy's picture

1) Great! The plan is to relate string translations to (project and) release nodes, so we can generate PO files for any given release of any project. By doing the relation on the string level and not using CVS, we can share a lot of the translations (most modules use a good number of common strings). As I have written before, the plan is to have one project node per language; the relations of translation strings to release nodes would not be node-node relations.

2) Agreed!

3) Well, unfortunately updating release packages only to update translations does not seem to be a good idea. Although I think it could work well for the users, uncontrollably updating translations in the packages makes the packages moving targets again, which the whole "new" release system works hard to avoid, so well, this is not going to happen.

5a) I am not sure it is a good idea to package all translations of all modules and Drupal core together. Just the (100% translated) Drupal core Hungarian translation is 160k compressed with tar.gz. This is around 2000 strings. Now Bruno says all contrib modules for Drupal 5.x top out at around 20000 strings. At the same rate, that would be a 1.6MB download for one language for a given Drupal version. Anyway, this gets hard in practice when taking branches into account. What if I want to use the ecommerce 5.x-1.0 version and the views 5.x-2.0 version (either because these are the latest, or because I would like to use other contribs which fit with these modules)? Which monolithic translation would I download? The ability to put modules into arbitrary folders (sites/all, sites/mysite.example.com and so on) also makes it hard to distribute updates of such monolithic packages, which the user would have to break apart into different directories on his site.

4) Now that I have commented on (5a), packaging translations and even the code together does not seem any more attractive. There are some additional problems on top of the (5a) reasons. Those would be huge install profiles. It does not seem logical to download a huge package with dozens of modules I don't need and would delete if I knew it was safe to delete them. Newbies will not know what they need and what should be removed, and packaging stable (established, well maintained) modules together with shaky ones is not going to help that much.

5b) Packaging translations by project release seems to be the most granular option, which allows people to download the version of the translation they need for the module they use. We will be perfectly able to do this (and would provide information on how up to date the translation is, on the string level, for each project and release, to help people decide).

6) The question is how we would go about this without having release nodes (and multiple project nodes) for translations. The idea is to track translations on the string level to allow maximum sharing between project translations (Drupal itself enforces that sharing by only allowing unique source strings). So we would have the release node which the translations are related to (the release node of the module project for example), and then we would have the modification dates of the strings. It is unlikely that we would implement specific release tags for translations in either a CVS or a web based solution. As said, we are trying to simplify people's lives.

7) Once again, the web based solution works on the string level. The development snapshots could very well include an updated POT, and they have a release node, so we can relate strings to it. These relations could change from night to night, but the code should be able to deal with that. There is no other way we could implement support for new Drupal releases for example.
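To illustrate the string-level relation from points 1, 6 and 7, here is a minimal sketch: each translation is stored once per language and source string, releases only record which source strings they contain, and a PO file for any release is generated on demand. The class and method names are invented for this example and say nothing about the actual schema.

```python
def po_quote(text: str) -> str:
    """Quote a string for a PO file (backslash, quote and newline escapes only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


class TranslationStore:
    """Hypothetical store relating source strings to project releases."""

    def __init__(self) -> None:
        # (project, release) -> set of source strings found in that release
        self.release_strings: dict[tuple[str, str], set[str]] = {}
        # (language, source string) -> translation, shared by all releases
        self.translations: dict[tuple[str, str], str] = {}

    def register_release(self, project: str, release: str, sources: set[str]) -> None:
        # Re-registering replaces the old set, which is what a nightly
        # development snapshot with a changed template would need.
        self.release_strings[(project, release)] = set(sources)

    def translate(self, language: str, source: str, translation: str) -> None:
        self.translations[(language, source)] = translation

    def export_po(self, project: str, release: str, language: str) -> str:
        """Generate PO entries for one release in one language."""
        entries = []
        for source in sorted(self.release_strings.get((project, release), set())):
            translation = self.translations.get((language, source), "")
            entries.append(f"msgid {po_quote(source)}\nmsgstr {po_quote(translation)}\n")
        return "\n".join(entries)
```

Translating "Save" once would then fill it in for every project and release that contains that string, which is the sharing the string-level approach is after.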

So the packaging/update question really comes down to finding a good compromise with most of the advantages of the proposed solutions (from least granular to most granular):

  • monolithic language based install profiles with possibly all contrib modules: huge, confusing, unstable, branches unclear, moving files around is a problem, changes often when any module changes
  • monolithic translation packages per language with all translations of modules only: big, branches unclear, moving files around is a problem, changes often when any translation changes
  • packaging translation files with modules: breaks the "do not touch released code" principle but easy and quick on the user
  • packaging translation files for modules in a separate package: goes nicely with this principle but means additional downloads for all modules I download per language
  • no packaging, web based updates: needs online accessibility, less control over what I have, network timeouts, huge imports when adding a language or starting a fresh install

I think that the top items give too much unwanted stuff to the user, while the items at the end of the list give too little. No wonder that translation packaging with modules is in the middle... Haha... (I might not be as much attached to that one as it seems :) Just brainstorming (if anyone reads these ramblings :)

Really we relate translations to project releases. When someone downloads a project release, she wants a certain set of languages to be available: either no language, if the English default is fine, or a few languages if there are some languages to use that release with. Previously this was packaged with the project releases, but on principle it is not a good idea to have packages that change as translators update their stuff. So how to bridge this gap? Maybe we can offer project release downloads in two forms: one without any translations, which would stay the same forever; and one with all the translations, which would be updated with a new timestamp whenever new translations are available. Update_status would check whether you have the ever-changing version and would warn you if an updated version is available... Well, the obvious problem is that you would get alerted to update even if some other translation changed which you did not use... Hah... This could only work if the update_status implementation for translations took the enabled languages into account, and only warned that an update is available if the languages in use have changes.
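The last condition could be sketched like this, assuming the site knows when each language in its package was last changed and the server reports the latest change time per language. All names and the timestamp representation are assumptions made for the example, not how update_status works today.

```python
def languages_needing_update(
    enabled_languages: set[str],
    installed_timestamps: dict[str, int],
    latest_timestamps: dict[str, int],
) -> list[str]:
    """Return the enabled languages whose translation changed after the
    version installed on the site; other languages are ignored."""
    return sorted(
        language
        for language in enabled_languages
        if latest_timestamps.get(language, 0) > installed_timestamps.get(language, 0)
    )


# Hypothetical site: only Hungarian is enabled, and only German changed upstream.
installed = {"hu": 100, "de": 100}
latest = {"hu": 100, "de": 250}

to_update = languages_needing_update({"hu"}, installed, latest)
```

Here `to_update` is empty, so the site would not be nagged about a German-only change; a warning would appear only once the Hungarian timestamp moved.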

Location modules?

brmassa's picture

Gabor,

what happened to location modules (old live translation)? it's not listed anymore...

regards,

massa

in heavy development

Gábor Hojtsy's picture

You probably mean localization modules, not location modules.

As we have discussed in email, I reignited development under the l10n_server and l10n_client names for consistency. I have already adapted the potx module to the l10n_server needs, and am doing heavy restructuring, making the modules much more user friendly and putting more polish into the fantastic start you provided us with. Updated modules will be available in a few days, at the latest sometime early next week.

My focus is on better reuse (refactoring lots of copy-pasted code, reusing what Drupal core offers), more themeability, less inline markup, better user documentation, improved user interfaces, properly working breadcrumbs, action tabs, an even more intuitive translation interface, directed PO import and export and so on. Most of these are already done. That will show the general UI direction taken, and the project module realignment can be implemented in the second phase.
