(even more) Advanced Cron

DamienMcKenna's picture

I've toyed with this idea for a few months.

A common problem with cron tasks in Drupal is that cron is generally an all-or-nothing proposition: either every time you run cron.php the system runs every individual cron task, or (possibly for n00bs) you don't run it at all. Then there's the problem that different tasks need to be run at different frequencies, e.g. search index refreshing vs. tag cloud updates vs. RSS feed aggregation vs. scheduler updating.

What I propose is rewriting cron.module to allow individual cron tasks to be scheduled differently.

There would be three parts to it:

  • At initialization, cron would search for all hook tasks and record ones it was previously not aware of; hook_cron() would be used directly to identify these.

  • Each cron task would be executed based on its individually configured schedule; by default each task would run on every cron run (schedule=ALL), though this could be adjusted using a new (optional) hook_cron_info() call, which could e.g. tell it to run just once per day instead.

  • After each cron task was executed the system would record: a) how long the task took to execute, b) the last time it executed, c) the task's return status (success, fail, $err).
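The three parts above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python rather than Drupal PHP; `CronTask`, `is_due`, `run_cron` and the seconds-based `interval` field are all hypothetical names, not an existing Drupal API. An interval of 0 stands in for schedule=ALL.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class CronTask:
    """One registered cron task plus the bookkeeping recorded after each run."""
    name: str
    callback: Callable[[], None]
    interval: int = 0                     # seconds between runs; 0 = every cron run (schedule=ALL)
    last_run: Optional[float] = None      # timestamp of the last execution
    last_duration: Optional[float] = None # how long the last execution took, in seconds
    last_status: Optional[str] = None     # "success" or "fail: <error>"


def is_due(task: CronTask, now: float) -> bool:
    """A task is due if it has never run or its interval has elapsed."""
    return task.last_run is None or now - task.last_run >= task.interval


def run_cron(tasks: Dict[str, CronTask], now: Optional[float] = None) -> List[str]:
    """Run every due task and record duration, last run time and status."""
    now = time.time() if now is None else now
    executed = []
    for task in tasks.values():
        if not is_due(task, now):
            continue
        started = time.perf_counter()
        try:
            task.callback()
            task.last_status = "success"
        except Exception as err:  # one failing task must not stop the others
            task.last_status = f"fail: {err}"
        task.last_duration = time.perf_counter() - started
        task.last_run = now
        executed.append(task.name)
    return executed
```

With one task left on the default and another set to a daily interval, two cron runs a minute apart would execute the first task twice and the second only once.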

To work with this there would be an admin settings page which would allow users with the "administer cron tasks" permission to modify the schedule:

  • All tasks would be listed out individually along with a description obtained from an (optional) new hook_cron_info(); this would list its (optional) default along with a textual help/recommendation, e.g. "only needs to run once per day unless you are also using the xyz module in which case you may want to run it more frequently."

  • Each task's default would be listed, per above.

  • Each task would have a way to assign the schedule. I'm unsure how exactly this would be done: maybe as a crontab-like configuration of just entering numbers, or maybe a fancier (maybe limited?) set of selectors to define the regularity at which the task is executed.

  • Each task would also have a setting to control what to do if the previous execution failed: do not run again until the admin re-activates the task, continue as normal, or re-run on the immediately next cron run regardless of schedule (useful for dealing with FTP upload timeouts).
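The failure handling in the last item could be decided along these lines. Again a Python sketch under assumptions: `OnFailure`, `should_run` and the `reactivated` flag are hypothetical names used only for illustration.

```python
from enum import Enum


class OnFailure(Enum):
    DISABLE = "disable"        # do not run again until an admin re-activates the task
    CONTINUE = "continue"      # ignore the failure and follow the normal schedule
    RETRY_NEXT = "retry_next"  # re-run on the very next cron run, regardless of schedule


def should_run(policy: OnFailure, last_failed: bool, schedule_due: bool,
               reactivated: bool = False) -> bool:
    """Decide whether a task runs on this cron run, given its failure policy."""
    if last_failed:
        if policy is OnFailure.DISABLE:
            # Stay off until re-activated, then fall back to the schedule.
            return reactivated and schedule_due
        if policy is OnFailure.RETRY_NEXT:
            # Retry immediately, even if the schedule says "not yet".
            return True
    return schedule_due
```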




If we're going all-out

mlncn's picture

Some sort of a hook_cron_alter would be nice, if only for strange things like disabling other modules' cron actions.

I see the use for most of this; I'm a little unsure about adding any weight to cron, though. I think just about all of this would require changes to core.

benjamin, agaric

PHP Daemon

kyle_mathews's picture

An interesting alternative to this problem. This isn't core worthy as you need to install a PEAR package -- but a colleague of mine recently created a PHP daemon to run our cron. Our problem was we wanted messaging + notifications to run much more often than the rest of the cron jobs. So we set up the daemon to run messaging/notifications every 30 seconds and the rest of the system every 10 minutes. So far it's been running great.

I imagine the system could be generalized further so that there'd be an admin interface where you could define how often you wanted each module's cron to run, as you suggest.

Kyle Mathews

My idea was to start with

DamienMcKenna's picture

My idea was to start with drupal_cron_run() and provide a replacement cron.php, plus an optional patch to remove the core drupal_cron_run() just in case.

It's an interesting idea. My

Garrett Albright's picture

It's an interesting idea. My thoughts:

If users can set arbitrary times at which cron tasks can be run, then it seems like cron will have to be run (at the OS level) every minute, at which time the cron replacement (at the Drupal level) would see which tasks if any should be run this minute and run them. It seems like kind of a brute force solution.

I also don't know if allowing users to specify when they'd like jobs to run is totally a keen idea. I think there's some potential for confusion or abuse. Maybe this is an option that could be specified in a module's own settings page instead of one grand "schedule cron tasks" page. Maybe I'm just shortsighted, though.

Maybe instead of specifying precise intervals or times, modules could say something along the lines of "run if X minutes have passed since the last run." That way, the timing of cron runs at the OS level won't have to be so precise. I've already done something like this in some of my modules: I cache the timestamp when my hook_cron() function has run, and if, on the next call to my hook_cron(), not enough time has passed since that timestamp, the function just returns without doing anything.

Thanks for the feedback

DamienMcKenna's picture

Garrett: Thanks for the feedback. You have a point about the workflow for deciding whether tasks should run or not. I'm undecided on this because there are several methods which might work.

One key aspect of my idea would be to allow each module to provide an intelligent default, e.g. some tasks may be expected to only need to run once per day (xmlsitemap) whereas others (Boost) may need to be run every two minutes.

Job queue

alex_b's picture

Have you looked at job queue module? IMO jobs that need to be scheduled should be pushed into a central queue and then worked off from there. Thus Drupal can have some awareness whether jobs are getting finished.
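As a rough illustration of the central-queue idea (this is not the job queue module's actual API; `JobQueue`, `push` and `work` are hypothetical names, and the sketch is in Python for brevity), jobs are pushed onto one queue and worked off within a time budget, so the system can tell which jobs finished, which failed, and which are still waiting:

```python
import time
from collections import deque
from typing import Callable, Deque, List, Tuple


class JobQueue:
    """Central queue: jobs are pushed in and worked off in order."""

    def __init__(self) -> None:
        self.pending: Deque[Tuple[str, Callable[[], None]]] = deque()
        self.finished: List[str] = []
        self.failed: List[str] = []

    def push(self, name: str, job: Callable[[], None]) -> None:
        self.pending.append((name, job))

    def work(self, time_budget: float) -> int:
        """Work off jobs until the queue is empty or the time budget is spent.

        Returns the number of jobs still pending.
        """
        deadline = time.monotonic() + time_budget
        while self.pending and time.monotonic() < deadline:
            name, job = self.pending.popleft()
            try:
                job()
                self.finished.append(name)
            except Exception:
                self.failed.append(name)
        return len(self.pending)
```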

Also have a look at a recent core patch for a queue module (http://drupal.org/node/391340 - there is still some discussion whether this should contain a scheduler or not).


We are actually working on

MisterSpeed's picture

We are actually working on such a module for D6 and just found out about this by bugging catch on IRC. Would you like to team up? (Replying to the thread, not the last comment; need sleep)

Hey 63reasons

DamienMcKenna's picture

63reasons: it's hard to catch you on IRC if you don't let me know what your nick is :) Please update your d.o profile, or catch me online. Thanks.

Good Thoughts

drupdrips's picture

Thanks, Damien, for starting this thread; some good discussion.

I like Garrett's idea, "run if X minutes have passed since the last run," for non-precise but frequency-based task runs. Obviously in such cases you could not expect a module task that needs to run at a much shorter interval to be controlled by a cron run on a much wider one, but as long as it's the other way around, it will serve the purpose well.

For precision-oriented tasks that need enough flexibility, a true scheduler / job queue would be preferred, I'd think, and on that note alex_b's reference to the job queue module seems to make sense, although I have not used that module.