Posted by chipk on August 13, 2008 at 9:32pm
We have a large Drupal 5.1 instance that needs to be upgraded to 5.9. However, the required run of 'update.php' over a DB of 160K+ nodes has not come close to completion, even after a couple of attempts at leaving it running on a development box. Does anyone here have a suggestion on how best to execute all DB upgrade logic across a very large data set?

Comments
Error text would help but
Try
memory_limit = 128M ; Maximum amount of memory a script may consume (128MB)
in php.ini for apache2
Php timeout too
There is a PHP max execution time setting (max_execution_time) in php.ini as well; increase it to something big. I think it typically defaults to 30 seconds.
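For reference, the two php.ini settings mentioned so far might look like the sketch below. The values are illustrative assumptions only, not tested recommendations for a 160K-node site; adjust them to what the box can actually afford, and restart Apache afterwards so they take effect.

```ini
; Illustrative values only (assumptions, not tuned for this site)
memory_limit = 128M          ; maximum memory a single script may consume
max_execution_time = 3600    ; seconds a script may run; PHP's default is 30
```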
bigger server - pinpoint the problem areas - do it in pieces
Either you're going to need a bigger server or to pinpoint the problem areas. tarvid and ghanstef have provided good first steps towards making it more likely to finish. But in addition to just increasing various configuration options, I'd also suggest a more scientific approach of monitoring the server (top, vmstat, etc.) to figure out exactly which server resources are the bottleneck - then you can increase those, re-run, and hopefully find how to fix the problem.
Also, while not entirely recommended, it is possible to run update.php in chunks. I imagine that there are one or two module updates which are really causing the problems. If you can disable those modules, run all the other updates, and then run just those modules' updates, the process will be more likely to finish. Failing that, you can dig into what each problem module is trying to do and perhaps figure out how to do it more efficiently.
I was able to narrow down
I was able to narrow down the problem to a few modules (cck, forward, etc.), which allowed me to get a partial upgrade deployed for core and most modules, i.e. those whose upgrade tasks run in just a few seconds. My sense is that the upgrades affecting node-related tables on a large DB (160K+ nodes, 450K+ url aliases, etc.) will need to be handled offline after further research. With Drupal becoming popular for some large sites, it is disappointing that more detail is not available around this crucial issue.
Disabling indexes for the
Disabling indexes for the updated tables in the DB might also help in some cases: run ALTER TABLE table_name DISABLE KEYS before the update and, once you are done, ALTER TABLE table_name ENABLE KEYS.
That's an interesting idea -
That's an interesting idea - do you believe re-ENABLE-ing the keys will trigger MySQL to re-generate the indexes if any indexes are part of what is updated?
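As a sketch of the pattern suggested above (table_name is a placeholder, not a real Drupal table): per the MySQL manual, DISABLE KEYS tells MySQL to stop updating non-unique indexes on MyISAM tables, and ENABLE KEYS then re-creates the missing indexes in one pass. Unique and primary keys stay active throughout, and the statements have no useful effect on InnoDB tables, so whether this helps depends on the storage engine in use.

```sql
-- table_name is a placeholder; substitute each table the slow update touches.
ALTER TABLE table_name DISABLE KEYS;   -- stop maintaining non-unique indexes (MyISAM)

-- ... run the slow module update here ...

ALTER TABLE table_name ENABLE KEYS;    -- rebuild the missing indexes in bulk
```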