Performance Difference Between Rules Scheduling Architectures

eric__'s picture

I am using Rules to process changes in user role expirations for a paid membership site.

The question I have can be applied to the more general case of acting upon an entity in the future. I believe there are two different ways to accomplish this. The first method is via a scheduled rule. The second method is via a date field and views query.

Options
The first method is to create a scheduled rule for every entity. In my case, when a user buys a 1-year membership, a rule schedules another rule at +365 days to remove the "member" role.

The second method is to run two rules that interact with a date field, e.g. an "expiration date" field. Again using my membership example, one rule would set the expiration date, and an "expire" rule would be run on the results of a views query (selecting only those members whose expiration date is earlier than today).
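To make the second method concrete, here is a minimal sketch of the "date field plus query" idea. It is written in Python with an in-memory SQLite table rather than in Rules and Views, so the table name `members`, its columns, and both function names are assumptions made up for illustration, not anything Drupal provides.

```python
import sqlite3
from datetime import date, timedelta


def find_expired(conn, today):
    """Return uids of current members whose expiration date is earlier than today."""
    rows = conn.execute(
        "SELECT uid FROM members WHERE expires < ? AND is_member = 1 ORDER BY uid",
        (today.isoformat(),),  # ISO dates sort correctly as plain strings
    )
    return [row[0] for row in rows]


def expire_members(conn, today):
    """Remove the (hypothetical) member flag from everyone the query selects."""
    uids = find_expired(conn, today)
    conn.executemany(
        "UPDATE members SET is_member = 0 WHERE uid = ?", [(uid,) for uid in uids]
    )
    return uids


# Hypothetical data: two members already expired, one still valid.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (uid INTEGER PRIMARY KEY, expires TEXT, is_member INTEGER)"
)
today = date(2012, 6, 1)
conn.executemany(
    "INSERT INTO members VALUES (?, ?, 1)",
    [
        (1, (today - timedelta(days=1)).isoformat()),
        (2, (today + timedelta(days=30)).isoformat()),
        (3, (today - timedelta(days=10)).isoformat()),
    ],
)
expired = expire_members(conn, today)
```

The only per-user state is the stored date, and running the job a second time selects nobody, so a repeated or late cron run does no harm.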

Discussion
In our case, we also need a number of "upgrade now" emails sent to each user whose membership is about to expire, so in practice we will likely need ~5 separate rules run per user on or before their membership expiration date.

We're hoping that either solution will scale to at least 100,000 members. Naively, we'd expect 1/365 of our users to trigger those rules every day. Realistically, we'd expect as many as 10% of our users to trigger them on a single day.
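As a rough back-of-the-envelope check on those numbers, the sketch below works out the daily load. It assumes the 10% figure means 10% of all members, and it treats every due user as running all ~5 rules on the same day, which is an upper bound since the reminder emails are spread out before the expiration date. The function and variable names are hypothetical.

```python
MEMBERS = 100_000
RULES_PER_USER = 5  # the "~5 separate rules" per user mentioned above


def daily_load(members, fraction_due, rules_per_user):
    """Return (users due today, rule executions today) for a given share of users."""
    users_due = round(members * fraction_due)
    return users_due, users_due * rules_per_user


# Expirations spread evenly over the year.
even_users, even_runs = daily_load(MEMBERS, 1 / 365, RULES_PER_USER)

# Pessimistic day: 10% of all members come due at once.
peak_users, peak_runs = daily_load(MEMBERS, 0.10, RULES_PER_USER)

# First method only: one stored scheduled task per rule per user.
scheduled_tasks = MEMBERS * RULES_PER_USER

print(even_users, even_runs)   # 274 1370
print(peak_users, peak_runs)   # 10000 50000
print(scheduled_tasks)         # 500000
```

Under these assumptions a typical day is about 274 users and 1,370 rule executions, while a bad day is 10,000 users and up to 50,000 executions; the first method would also hold roughly 500,000 scheduled tasks at any one time.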

To me it appears that the second method is less complex, as persistence (of the expiration data) is moved from a rule to a date field. While I'm concerned with outright site performance, I'm also concerned with usability once we have tens of thousands of users.

Comparison
As I see it, the benefit of the first method is fewer database queries (but I'm not sure where scheduled rules are kept, so I may be wrong). The detriment is that we'd have ~5 times as many scheduled rules as users, and at some point that could become untenable.

The benefit of the second method is that persistence is handled in a plain old data field as opposed to a scheduled rule. Naively, this seems less complicated.

It would also seem that we'd have finer-grained control over when processing is done with the second method. With the first method, we only know about each individual scheduled rule, and we are limited to whatever we knew when we scheduled it. With the second method, we know how many times we are going to run the rule (and we can know other things, like server load). We can then tweak when the view and rules run.

Overall I lean toward the second method because it appears to be near identical performance and less complicated.

Question
Does anyone have input about the performance / usability differences here?

Thanks so much,
Eric

Comments

Second sounds good

itangalo's picture

Note: I have little experience of performance and scaling with Rules, and I can't promise that my advice is good.

That being said, I think the second approach makes more sense than the first one. I think you make a very good analysis of the situation – on a smaller site with fewer members, the first approach would clearly be the best one. It is straightforward and gets the job done. However, if you fear that you will have to send out 100+ e-mails on one cron run, or you want to always send out e-mails at midnight, or you want to be able to easily change the time when e-mails should be sent – the second approach wins. It gives you much more control.

I'd take care to use batch/queue API for sending out e-mails, and if you want to avoid a lot of work with spam list prevention I'd also use an external SMTP service.
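The idea behind queueing can be sketched as follows: each cron run sends at most a fixed number of messages and leaves the rest for the next run. This is a plain Python illustration and not Drupal's Batch or Queue API; `process_queue`, the limit of 100, and the example addresses are all assumptions.

```python
from collections import deque


def process_queue(queue, send, limit):
    """Send at most `limit` queued messages; return how many were sent."""
    sent = 0
    while queue and sent < limit:
        send(queue.popleft())
        sent += 1
    return sent


# Hypothetical backlog of 250 reminder e-mails.
queue = deque(f"user{n}@example.com" for n in range(250))
outbox = []  # stands in for the real mail call

# Three simulated cron runs, each capped at 100 messages.
first = process_queue(queue, outbox.append, limit=100)
second = process_queue(queue, outbox.append, limit=100)
third = process_queue(queue, outbox.append, limit=100)
```

Here the three runs send 100, 100, and 50 messages, so a spike of expirations is spread over several cron runs instead of stalling one of them.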

It would be interesting to hear more about which approach you choose, and what unforeseen problems arise (if any).

Good luck!
//Johan Falk

Thank you Mr. Falk

eric__'s picture

I'm glad to hear your opinion. Unfortunately, there are just too many assumptions to get much out of testing, which makes "engineering judgement" that much more valuable.

We'll be developing with the second option. And thanks for the heads up on the batch API and external SMTP.

I'll post back once we have some experience with the second option.

Eric
