Strategy to run update.php over a large DB of 160K+ nodes

chipk's picture

We have a large Drupal 5.1 instance that needs to be upgraded to 5.9. However, the required run of 'update.php' over a DB of 160K+ nodes has not come close to completion, even after a couple of attempts to leave it running on a development box. Does anyone here have a suggestion on how best to execute all the DB upgrade logic across a very large data set?

Comments

Error text would help but

tarvid's picture

Try

memory_limit = 128M ; Maximum amount of memory a script may consume (128MB)

in php.ini for apache2

PHP timeout too

ghankstef@drupal.org's picture

There is also a max_execution_time setting in php.ini; increase it to something big. I think it typically defaults to 30 seconds.
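Before re-running update.php, it may be worth confirming which values the server's php.ini actually holds for the two settings suggested above. The following is only a rough sketch: the function name and the sample text are made up for illustration, and it assumes simple `name = value` lines with `;` comments, so a value containing a quoted semicolon would be cut short.

```python
def read_php_ini_directives(ini_text, names):
    """Return {name: value} for the requested directives found in php.ini text."""
    found = {}
    for raw_line in ini_text.splitlines():
        # Drop everything after ';' (php.ini comment marker) and trim whitespace.
        line = raw_line.split(";", 1)[0].strip()
        if "=" not in line:
            continue  # section headers, blank lines, comment-only lines
        key, value = (part.strip() for part in line.split("=", 1))
        if key in names:
            found[key] = value  # a later line overrides an earlier one
    return found


# Illustrative sample only, not taken from a real server.
SAMPLE = """\
[PHP]
max_execution_time = 30     ; Maximum execution time of each script, in seconds
memory_limit = 128M ; Maximum amount of memory a script may consume (128MB)
"""

if __name__ == "__main__":
    wanted = {"memory_limit", "max_execution_time"}
    print(read_php_ini_directives(SAMPLE, wanted))
```

To check a real installation, read the php.ini file that the web server loads (its location varies by system) and pass its contents in place of SAMPLE.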

greggles's picture

Either you're going to need a bigger server or you'll need to pinpoint the problem areas. tarvid and ghankstef have provided good first steps towards making it more likely to finish. But in addition to just increasing various configuration options, I'd also suggest a more scientific approach: monitor the server (top, vmstat, etc.) to figure out exactly which server resources are the bottleneck. Then you can increase those, re-run, and hopefully find how to fix the problem.

Also, while not entirely recommended, it is possible to run update.php in chunks. I imagine that there are one or two module updates which are really causing the problems. Perhaps if you can disable those modules, run all the other updates, and then run just those modules' updates, it will be more likely to finish. Or you can dig into what each module is trying to do and perhaps figure out how to do it in a more efficient way.

I was able to narrow down

chipk's picture

I was able to narrow down the problem to a few modules (cck, forward, etc.), which allowed me to get a partial upgrade deployed for core and most modules, i.e. those whose upgrade tasks run in just a few seconds. My sense is that the upgrades affecting node-related tables on a large DB (160K+ nodes, 450K+ url aliases, etc.) will need to be handled offline after further research. With Drupal becoming popular for some large sites, it is disappointing that more detail is not available on this crucial issue.

disabling indexes for the

Etanol's picture

Disabling indexes for the updated tables in the DB might also help in some cases: run ALTER TABLE table_name DISABLE KEYS before the update and, once you are done, ALTER TABLE table_name ENABLE KEYS.
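If several tables are involved, the two statements can be generated per table and pasted into a MySQL session before and after the update. This is only a sketch: the helper name is invented, and the table names `node` and `url_alias` are assumptions based on the node and URL alias counts mentioned in this thread. As far as I know, DISABLE KEYS only affects non-unique indexes on MyISAM tables, and ENABLE KEYS re-creates the missing indexes, so check the storage engine first.

```python
def key_toggle_statements(tables):
    """Build (disable, enable) lists of ALTER TABLE statements for the tables."""
    for name in tables:
        # Accept only plain identifiers so nothing unexpected ends up in the SQL.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unexpected table name: {name!r}")
    disable = [f"ALTER TABLE `{name}` DISABLE KEYS;" for name in tables]
    enable = [f"ALTER TABLE `{name}` ENABLE KEYS;" for name in tables]
    return disable, enable


if __name__ == "__main__":
    # Hypothetical table list; adjust it to the tables the slow updates touch.
    before, after = key_toggle_statements(["node", "url_alias"])
    print("\n".join(before))
    print("-- run the slow module updates here --")
    print("\n".join(after))
```

The script only prints the statements; running them against the database, and on a copy first, is left as a manual step.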

That's an interesting idea -

chipk's picture

That's an interesting idea - do you believe re-ENABLE-ing the keys will trigger MySQL to re-generate the indexes if any indexes are part of what is updated?
