Automatically fetching translations
Hi!
In D5, we introduced the Autolocale module, which lets users import translations after enabling a module or during installation. However, it had a few caveats, such as timeouts on cheap shared hosts.
In D6, Autolocale is built into core, uses the batch API, and works very well.
What do you think the next step should be? From my point of view:
"Let Drupal automatically fetch translations from a server".
Imagine a user enabling a module. Drupal polls Drupal.org (or some other server), downloads a translation, and imports it. There is no need to download a separate package, and module packages will be smaller.
This could be done for the installer and/or contribs.
What do you think and how do you think we can do this?
Just my $0.02:
- The installer is a great place for this; contrib may not be (translations are already packaged with contribs).
- Gabor mentioned (at DrupalCon in Boston) that it could be done using the Localization client. Is that wise? Are there any other approaches?
- I can imagine a very easy implementation using curl()/fopen()/xmlrpc() (with a fallback) to query Drupal.org/... for a translation, download it, extract it (how to do this without Archive/Tar.php?) and import it.
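The fallback idea in the last point could look roughly like the sketch below. It is written in Python only to keep it short. The `fetch_with_fallback` name, the two transport functions, and the example URL are hypothetical stand-ins for curl- and fopen-based fetchers; none of this is an existing Drupal API.

```python
from typing import Callable, Iterable, Optional

# A transport takes a URL and returns the PO file contents, or raises on failure.
Transport = Callable[[str], str]


def fetch_with_fallback(url: str, transports: Iterable[Transport]) -> Optional[str]:
    """Try each transport in order and return the first successful result."""
    for transport in transports:
        try:
            return transport(url)
        except Exception:
            # This transport is unavailable or failed; try the next one.
            continue
    # Every transport failed.
    return None


# Hypothetical transports, for illustration only (no real network access).
def via_curl(url: str) -> str:
    raise RuntimeError("curl extension not available")


def via_fopen(url: str) -> str:
    return 'msgid "Home"\nmsgstr "Startseite"\n'


po_text = fetch_with_fallback("http://example.com/de.po", [via_curl, via_fopen])
```

Here the first transport fails and the second one supplies the file, so the caller never needs to know which mechanism the host actually allows.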
Please post your ideas so we can eventually get this into D7 :-)


Comments
First, comments to the initial post:
xmlrpc(), at least if importing a few strings at a time (per the above). If we're importing entire core/module translations, it might be better to use curl() or some such and fetch regular PO files. fopen() on remote files should not be assumed to be allowed/possible on a given server, as it can easily be disabled for security reasons with allow_url_fopen (one should always be wary of opening remote files). Also, if we're fetching text/plain PO files, we wouldn't need Archive_Tar to handle the downloads.
Second, my own thoughts:
--
Frederik 'Freso' S. Olesen
not hitting servers hard
I fully agree this is where we should be heading. My main concern is not hitting the server too hard. Having many servers for different languages puts pressure on the individual maintainers, so a central server takes pressure off the individual teams (and also lets us fetch from one location instead of several with differing reliability). From there, our requests could be either too broad (a big file to transfer) or too specific (a complex query on the server). The first case is "everything for this version of Drupal core", which would be a huge file, especially if we could not use compressed communication. The second is wanting something like "x version of module y, z version of module o", etc. If we do micro-polling, as Freso suggests, that would put less burden on the server per call but would hammer it continually. Imagine hundreds or thousands of Drupal client sites calling back and micro-polling. In some cases, the request overhead might even be bigger than the useful data.
So anyway, I think we need to think hard about scalability before we launch such a system wholesale :)
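To make the trade-off a bit more concrete, here is a rough Python sketch of a middle ground: one batched request per site that lists exactly the project versions it needs, instead of one call per string or one huge download. The payload shape, the `build_batch_request` name, and the sample module versions are assumptions for illustration, not an existing protocol.

```python
import json
from typing import Dict


def build_batch_request(language: str, projects: Dict[str, str]) -> str:
    """Build one request body covering every installed project and version."""
    payload = {
        "language": language,
        # Sorted so that sites with the same modules send identical requests,
        # which a server could then answer from a cache.
        "projects": [
            {"name": name, "version": version}
            for name, version in sorted(projects.items())
        ],
    }
    return json.dumps(payload, sort_keys=True)


# Hypothetical site with two contrib modules installed.
body = build_batch_request("de", {"views": "6.x-2.0", "cck": "6.x-2.1"})
```

Whether the server can answer such a query cheaply is exactly the scalability question raised above; the sketch only shows that the client side can be a single call.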
Requesting diffs // Multiserver distribution
I'm very interested in a clean distribution workflow for translations, as I'm working on sites that use three or more languages.
Apart from all the other ideas I have for a perfect translation distribution, I think there are a few key features that would give the best results:
Thus companies may run their own servers to distribute their own (modified?) translations across their networks.
BTW: I'm arguing for the ability to run your own intermediate servers because the original translations do not always fit customers' needs perfectly. If a customer wants, e.g., the term "member" instead of "user" in a specific language, quite a lot of individual strings are affected. As I understand it, a perfect translation server should also be able to pull translations from a parent server, modify them, and redistribute them. The upstream direction would also be an option: committing new translations to an optional parent translation server.
Possibly a single core server (drupal.org) that supports user-specific, project-based overrides would be an option too. But this starts to become very complex, and the multiserver approach would make many things much simpler.
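The "pull from a parent, modify, redistribute" idea boils down to layering local overrides on top of upstream strings. A minimal Python sketch, assuming translations are plain source-to-translation mappings; the `apply_overrides` name and the German sample strings are made up for illustration:

```python
from typing import Dict


def apply_overrides(parent: Dict[str, str], overrides: Dict[str, str]) -> Dict[str, str]:
    """Return the parent translations with local overrides taking precedence."""
    merged = dict(parent)      # copy, so the upstream data stays untouched
    merged.update(overrides)   # local strings replace upstream ones
    return merged


# Strings pulled from a (hypothetical) parent server.
upstream = {"User": "Benutzer", "Log in": "Anmelden"}
# The customer wants "member" instead of "user".
customer = {"User": "Mitglied"}

served = apply_overrides(upstream, customer)
```

An intermediate server would redistribute `served`, while `upstream` stays unchanged and can be refreshed from the parent at any time.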
What do you think?
Would you consider a commercial solution for translation?
I know that the natural tendency with open source systems would always be to use an open source solution, but I'd like to mention our commercial solution for translation because I truly think that it can help in this situation.
We're developing a system for Drupal translation. It's a commercial system and we're making our living from it. The system handles content translation. It does everything you (and your clients) need including intelligent content change detection, collaborative translation and other critical features.
The idea is to allow Drupal sites that require content translation to run without any effort on the side of the admin.
The drawbacks of using a closed-box solution are easily balanced by the advantages of having a very committed group of developers supporting and developing the system full time.
Although it's a commercial system, we're offering it for free to open source and not-for-profit organizations.
Would it be an unholy thing to talk about commercial systems in this group?
Agreed: Multi-Server and diff sounds fine
@miro_dietiker: I would prefer "translation version" over "translation date", as comparing versions is somewhat easier to implement and maintain, though dates would be possible too. An alternative would be to use repository servers (e.g. CVS) for translations; commits could be made automatically (e.g. translations are registered with a server and new PO files are created automatically), which would make it easier to use existing APIs.
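As a rough illustration of why a version number is easy to work with, the Python sketch below has the server remember, for each string, the translation version in which it last changed, and returns only the strings newer than what the client already has. The catalog layout and the `changes_since` name are assumptions for the example, not an existing format.

```python
from typing import Dict, Tuple

# source string -> (translation, translation version in which it last changed)
Catalog = Dict[str, Tuple[str, int]]


def changes_since(catalog: Catalog, client_version: int) -> Dict[str, str]:
    """Return only the strings that changed after the client's version."""
    return {
        source: translation
        for source, (translation, changed_in) in catalog.items()
        if changed_in > client_version
    }


# Hypothetical server-side catalog, currently at translation version 5.
catalog = {
    "Home": ("Startseite", 1),
    "Search": ("Suche", 3),
    "Save": ("Speichern", 5),
}

# A client that last synced at version 3 only needs the strings changed since then.
diff = changes_since(catalog, 3)
```

The comparison is a single integer test per string; with dates the same logic works, but clock and time zone differences between servers would have to be handled as well.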
Allowing multiple servers? Perfect. Not only would this allow "user changed translations of modules", but it would also ease the administration of e.g. menu-terms and the distinction between "translator" and "technician".