Hi all,
I'm wondering if someone else has come to this issue. We need to generate a birthdays view. We're using a Date field in the user profile so users can enter their birthday date. However, querying the date field to retrieve birthdays may impact performance, I think. The view could be cached, but still, when executed against a database of thousands of users, it could impact. So I'm thinking about a way to speed up this kind of queries.
We're running a site that has over 400000 user accounts, and I'm working to port that site to Drupal. The site is running a phpBB based forum that will remain, but most of the profile related features of the site will be moved to Drupal, so I'll need to port a few things to Drupal, things that we're currently doing in phpBB. For example, this stuff about birthdays... apart from a integer column to store a timestamp of the user's birthday date, there is another column in the phpBB user table that's CHAR(4) and contains the day and the month of the user's birthday. That column has an index, so I can run a query to find birthdays pretty fast. And now, I'm trying to do something similar for Drupal.
My idea was to create a new CCK field that would implement a CHAR(4) column to the Profile content type. This field would not have a widget, nor a field formatter. It would mostly work as a Computed field whose job is to update the birthday column during nodeapi('presave'), computing DDMM (day and month) from a selected Date field (in field settings form) that should exist in that content type. And then, this Birthday field will expose an argument and a filter to Views. That way we would be able to create a Birthdays View that just needs to filter for this field, so it's faster than filtering against the Date field.
Does this idea sound stupid? :-)
Is there a better way to do it? Anyone else has to do something similar?

Comments
performance of native mysql date functions?
I can understand your concerns in using complex functions for date comparision.
However mysql has internal optimizations for date handling and a special field type (as long as you're not using text to store date fields...) and i'd expect them to perform very well against integrated date function comprisions (as long as you're not doing manual string cuts).
Performance is one of the main reason why date / datetime / ... fields exist.
Did you even try those date queries the native way with measurements?
Not sure what you mean :-/
Here's a page to the MySQL docs:
http://dev.mysql.com/doc/refman/5.1/en/date-calculations.html
For our particular use case, we're using Datestamp fields, as implemented by Date module. AFAIK, Date module does not provide a filter for birthdays, so we need to do something. And here's where I thought about alternative solutions because we're doing it in our phpBB forum right now, and I know WHERE column = 'MMDD' is quite faster when an index exists for that column.
Life example: look at the bottom of the following page, where it reads "Hoy cumplen años" (today birthdays)
http://zonaforo.meristation.com/foros/index.php
If there's a way to do this faster using MySQL native functions against date fields, then I could explore the possibility to provide such a filter for Date fields / Views, if there's nothing else already.
Interesting
I don't have any advice but I'd be interested in what you come up with. I'd like to do something with birthdays on my site but I don't want to use core profile to use the birthdays module.
BTW, if you ever want to get off phpbb totally, come on over to Drupal's forum. The water's fine. :)
Michelle
Well, it is not just a standard phpBB
[offtopic]
We have more than 22 million messages on this board, and sometimes we have reached more than 5000 users online (typical stats from the past 5 minutes), most of them posting, and that can only work with a lot of optimizations in the base code, and database structure. Also, phpBB requires a lot less memory than Drupal.
[/offtopic]
Back to the birthdays stuff... I think the best solution would require a patch to Date module, so it can provide this extra CHAR(4) column as an option to store 'MMDD', when date fields are configured with YMD granularity (also, no timezone handling and no repeat options enabled), and then expose a filter to Views.
However, this approach may take a long time because it needs to involve KarenS, who seems to be quite busy these days.
So, I think I'll mostly work on a hackish solution. A CCK field that could be added to content types that already have a Date field with YMD granularity, no repeat, no timezone handling, no multiple values, no almost nothing, only typical Date fields used to store birthdays in user profiles. I think APK works with one of these... ok, so this additional field would only provide storage for this CHAR(4) column, a nodeapi('presave') implementation to compute the value of the field, and then a Views filter. No widget, no formatter is needed for this kind of fields. All we need is the ability to create fast queries to get birthdays.
How about saving the birthday
How about saving the birthday as a day in the year. So 1/1 equates to 1. Then that would be super fast.
12/31 = 365.
Makes it easy to do range queries.
Then how would you handle
Then how would you handle leap year?
This won't allow us to do something like...
WHERE field LIKE 'MM%'or
WHERE field LIKE '%DD'removed
removed
hook_nodeapi
Why not use hook_nodeapi to populate your lookup field on save while still using Drupal date fields? Use batch api or a one-off script to handle existing data.
If we use a CCK field...
...then we have more chances to store the lookup field in the same table where other CCK fields are stored. If the field is not shared between several content types and it does not use multiple values. Also, CCK would handle all the storage related stuff, making the module that implements this extension a bit more simple.
CCK Birthday module committed to my sandbox space
http://drupalcode.org/viewvc/drupal/contributions/sandbox/markus_petrux/...
I'm not yet sure if this worths a formal project because I feel it is mostly experimental. At least, for the moment.
The included README.txt explains the basics.
It would be nice to know how others have approached this kind of queries, or maybe someone can provide performance tests comparing queries against a CHAR(4) -vs- using native SQL functions to operate with dates.
Could be premature optimization
How often do you need to perform these calculations ? As I understand it, the goal is to know every day who has a birthday. So basically you only need to run it once per day (maybe twice, to compensate for shifts in timezones). Then it doesn't matter how slow it is, and you could run it on cron.
Pick a Date
How to pick a date using calendar in cck .. i ve installed date module ...