ISO Date Format

We encourage users to post events happening in the community to the community events group on https://www.drupal.org.
arlinsandbulte's picture

I know this is going to be a controversial topic. So I am going to head off the battle and start the discussion myself right away.
KarenS has proposed adopting the ISO format as the only supported format for future versions of the Date module (7.x-3.x contrib & 8.x core!). I am 100% in favor of this proposal.

It simplifies support & development & multi-platform compatibility. Of the many date format standards (unix timestamp, datetime, etc), it provides the most flexibility as parts of the date & time can be omitted to control the granularity & ambiguity.
PHP's new date functions make working with dates of any format easier than ever before. So, I don't think the traditional unix timestamp format holds an 'ease of use' advantage like it once did.
However, we must also acknowledge the drawbacks & limitations of this proposal (putting on my devil's advocate hat):

  1. The ISO date format is not natively supported by all databases (like MySQL). So, we plan on storing the ISO date value as a varchar string in the database. Date queries can be made using substring queries to find certain dates (for example, all dates with year 2012). Functionally, I don't think thre is any problem with that. It should be possible to implement that plan, while also supporting just about any database type with minimal or no database specific hacks or work-arounds. But, what about performance? How much of a performance hit is there performing substring queries against a 'real' native database date value?
  2. Drupal core (<=7.x) uses the unix timestamp format to store date & time information, most notably, for content created/updated dates. Going along with the current trend of making all content parts fields (body, terms, even titles), it might be desirable to make the created or updated date a real 'date field' too. What are the implications of this? Would all Drupal dates (like node created/updated dates) be converted to the ISO format? Or would we keep 2 different date types mixed into core?

Comments

I think ignoring the existing

tim.plunkett's picture

I think ignoring the existing timestamps for now is the best way forward. Seeing what comes of the Entity Property API discussions might just resolve that for us, anyway.

Agreed

arlinsandbulte's picture

Crell's comment at http://drupal.org/node/1346214#comment-5291030 is a great example.

Anyways, I agree we don't need to tackle existing core timestamps now. I just want to keep it in mind in case there is something we can do now to make our lives easier in the future.

I don't think the traditional

Dave Reid's picture

I don't think the traditional unix timestamp format holds an 'ease of use' advantage like it once did.

If you're talking about individual date fields yes. But once you get into bulk dates, querying in the database, etc, I swing way onto the side of UNIX timestamps. PHP date functions can't really help in those cases.

Senior Drupal Developer for Lullabot | www.davereid.net | @davereid

Actually ISO dates bring us

KarenS's picture

Actually ISO dates bring us something in queries too. There is no way to get the 'month' or 'year' of a unix timestamp without using database-specific functions like FROM_UNIXTIME. If we use ISO dates and store them as varchar we can still retreive 'month' or 'year' in an easy, cross-database-compatible way (it's just a substr). As we try to support more and more databases I think timestamps are the path to perdition. Both because we have to support a mess of non-standard database date functions and because they all have implicit timezone adjustments in them. Both Date and Views now have to explicitly set the database to use UTC before doing any query to prevent the query from being skewed by a timezone adjustment that may not be the right value. All that mess goes away if we use ISO dates.

The only downside of ISO dates store as varchar fields is that they are string instead of integer, and I know that is a little slower. It would be good to get some performance statistics on that to see what we are really talking about there.

A blast from the past

arlinsandbulte's picture

This was before my time, but I think early versions of Date actually stored each part of the date (year, month, day, hour, minute, second) in separate database columns. That strategy allows integer storage format.
But I HIGHLY doubt that strategy is worth it. And, unless proven otherwise, I would NOT support it.
/me (-) votes my own post :-)

How will sorting work with

joachim's picture

How will sorting work with the ISO dates?

Would there be any mileage in having a second column in the field table to hold the equivalent timestamp to use on sort queries and < and > filtering?

That's an interesting idea

KarenS's picture

That's an interesting idea and worth exploring.

Could you maybe give more

joachim's picture

Could you maybe give more explanation of what the ISO format is, for those of us who are still stuck in the D6 Date module system of datetime, timestamp and the third one that nobody used ;) IIRC from those, datetime was the one that allowed partial dates. Does that mean ISO is datetime or not?

ISO format is the

KarenS's picture

ISO format is the YYYY-MM-DDTHH:MM:SS format that we store in a varchar field. There are more details in the Roadmap, including a link to the standard: http://groups.drupal.org/node/221229.

Basically, it can be stored in a varchar field, which is universally available in all database types, unlike a native datetime field. A native datetime field looks similar, but is not at all standard across database types.

It is human-readable, unlike a unix timestamp.

It can be queried using substr, irregardless of the database type. You can't pull the 'month' out of a unix timestamp without using database-specific functions like FROM_UNIXTIME.

It can represent any period of time. If you truncate it to show only a year, month, and day, it has no time. Basically all the date parts can be be made optional, so it works well for historical or partial dates.

Ah so it was the third

joachim's picture

Ah so it was the third one...

That's very helpful. Thanks!

Just to clarify:

arlinsandbulte's picture

IIRC, the ISO standard does allow date parts to be optional, BUT you cannot keep the finer-grain parts without the larger parts.
For instance, if you truncate to day, you MUST keep Year & Month.
From Wikipedia page:

For reduced accuracy,[6] any number of values may be dropped from any of the date and time representations, but in the order from the least to the most significant. For example, "2004-05" is a valid ISO 8601 date, which indicates May (the fifth month) 2004. This format will never represent the 5th day of an unspecified month in 2004, nor will it represent a time-span extending from 2004 into 2005."

Within the Field UI, there

tim.plunkett's picture

Within the Field UI, there are three options:

  • Date
  • Date (ISO Format)
  • Date (Unix timestamp)

In the same order, here is how they generally referred to in code:

  • datetime
  • date
  • datestamp

See http://drupal.org/node/262066#date-fields for more info on this.

Core already has its own usage of Unix timestamps, but no field type for it. I don't think we should attempt to convert those away from timestamps, but I see no reason to add a field type for it.

So datestamp and datetime are the ones to leave behind, and the Date (ISO Format), machine_name date, is the one we should be keeping.

Furthermore, I think for our own sanity, we should use the machine_name of date_iso, and before release we can possibly rename it back. But it's just too confusing as is.

Technical arguments are not clear at all

sun's picture

I'm really missing clear technical arguments here.

The ISO Date format syntax, as proposed here, primarily makes sense for the sake of displaying and communicating date information across otherwise independent parties (e.g., between backend (PHP) and frontend (JavaScript)).

It's not really suitable as a syntax and/or data type for storing and querying date data, because the syntax is too flexible (and thus not really a data type, other than "text").

  1. What is the actual difference between the Date/Datetime and the ISO Date format?

    ...aside from the "T"? http://en.wikipedia.org/wiki/Iso_date_format describes many more delimiters in the ISO syntax, but I somewhat doubt that we're going to permit any use of them - for the sake of being able to query data in a predictable/consistent way?

    E.g., if we'd allow the alternative week date syntax (2012-W13-5) or ordinal date syntax to appear within calendar/date/time syntax, records cannot be filtered by date anymore, since you need to perform advanced date calculations to equalize the data first.

    (The same applies to omitting more granular components of the date/time, since any string-based filter/sorting operations will fail on missing components.)

  2. Substring operations on a column in a table holding millions of records (example) will be horribly slow.

    That is, because the database engine cannot use any indexes or native functions to optimize the query. Meaning a performance impact of seconds (not milliseconds) for a single database query across a large data set.

  3. Given the above, I don't quite get what's wrong with the native DATETIME or equivalent data type of database engines?

    The SQL standard defines ISO 8601 as default format, which is pretty much the same.

    http://dev.mysql.com/doc/refman/5.1/en/datetime.html
    http://www.postgresql.org/docs/9.1/static/datatype-datetime.html
    http://www.sqlite.org/datatype3.html

    (yes, SQLite only supports a text string, but SQLite doesn't support many data types either way, so possibilities are very limited on that engine anyway.)

    A proper data type allows for native filtering and sorting of values, which can be indexed. Handling in SQL queries is as simple as dealing with UNIX timestamps.

    Individual (more granular) date components cannot be omitted, but for the sake of concise and correct data storage, properties like "allday" and similar should be separately stored flags per date either way (and are use-case specific additions as they do not apply to all dates in the first place; e.g., not for content created and modified dates).

Daniel F. Kudwien
netzstrategen

Database native datetime

KarenS's picture

Database native datetime fields were removed from core in D7 because it would be too hard to truly support native datetime across all database types. See http://drupal.org/node/866340, where it was removed, and http://drupal.org/node/293483, where there is discussion about adding more field types to core where the position that core will never have a datetime field has been made quite clear.

I'm tired of trying to support this in contrib and this is about preparing something that can be used in core.

Also datetime fields still can really only be used for complete dates. If we want to support partial dates (and that is a common use case), it isn't sufficient. The ISO format supports anything. And the varchar field is universal, so the combination will allow us to store any type of date in any database.

I wouldn't mind using a

KarenS's picture

I wouldn't mind using a native datetime field if we could get it back into core. But I already tried fighting that fight and lost. And I'm not going to create a solution that uses a native datetime field when it's been made abundantly clear that the core committers won't allow a datetime field in core.

I suspect we will need to

arlinsandbulte's picture

I suspect we will need to limit the stored ISO format to the YYYY-MM-DDTHH:MM:SS form to maintain consistency and easy searches and sorts.
Perhaps even use the ±YYYYY-MM-DDTHH:MM:SS format to allow for a wider range of historical & future dates, which has been a common request in date.module. But, PHP's date handling might be a limitation there anyway.

Also, even though this ISO standard allows dropping less significant parts to reduce accuracy, doing so might make sorts and searches more difficult... for that reason, we might still be required to store 'something' for each part. (Please correct me if I am wrong. I am not a database expert).
-Wild idea: We could store '99' in any un-needed date part. That works for all parts except year, but also makes the stored date string non-ISO compliant.

Conditional +1

David Strauss's picture

The way ISO 8601 handles time zones is pretty brain-dead. The time zone is a user/content item localization concern, not something that should be pre-adjusted in the stored date with annotation about how the date differs from UTC. It makes the data hard to read and breaks string sorting (which is otherwise solid with ISO dates).

I can only support this proposal if we stick to using UTC/Zulu time in the database. The primary thing I've loved about our history of using Unix epoch timestamps is knowing they're always in UTC.

Storage?

sun's picture

If you (and other database performance experts) are fine with storing ISO dates as varchar strings, then I'm fine with that direction, too.

Daniel F. Kudwien
netzstrategen

Whether we always store ISO

David Strauss's picture

Whether we always store ISO 8601 date/time values in UTC/Zulu time and whether we use native date/time types in the database are separate questions. I just slightly prefer native date/time types but with UTC/Zulu time. Time zone compensation should generally happen with PHP's rather rich facilities, but I don't want to rule out developers having access to rich in-database querying options (e.g. SQL for filtering/sorting by day of week or month).

Database portability will be easier with pure ISO 8601 UTC/Zulu values, too. We won't have to rely on inconsistent implementations of time zone handling and whether the database sorts lexicographically or temporally.

Yes, the question about

KarenS's picture

Yes, the question about when/whether to convert times to UTC for storage is a separate question and we need a separate discussion about that. Although ISO has an option to add timezone information to the string, I'm actually not proposing that we do that, I was just pointing out that it is possible. I'm planning to open a separate issue about that topic (or anyone else can).

The ISO standard has a lot going for it, lots of people have already thought through the things that need to be represented in dates and created a universally-understood format for it. The native database datetime fields actually use it too.

I do want to add the +- prefix to it so we can support BC dates, and that will give us pretty much everything that has been requested over the years. Conversions from/to this format from other formats can be handled with PHP. And timezone conversions can be handled with PHP.

The remaining issue that might be a concern is the performance of the string vs int (timestamp) and native datetime fields, so I started a separate discussion about that to figure out what the real cost is and what we can do to make performance as good as possible. But when considering database portability, ISO in a varchar field is a big winner.

Question about +- sorting.

arlinsandbulte's picture

Will adding the +- pose a problem for sorting?
Without +-, sorting a YYYY-MM-DD string ascending naturally results in the right order (earliest date first & latest date last).
BUT, adding the +- changes that. A simple ascending string sort results in all + dates first, sorted with earliest dates to latest dates, followed by all - dates, sorted with latest dates to earliest dates.

For example, here is a sorted list of year-only strings:
"+0000"
"+1800"
"+1900"
"+1977"
"+2000"
"+2012"
"-0100"
"-0500"
"-1000"
"-5000"

Repush for Date, Time and DateTime DB support?

Alan D.'s picture

It seems crazy to regress and not support these DB fields. Excluding all of the add-on stuff in Date, it is a massive module and on all other systems I have worked on, the Date support classes are trivial classes of 1000 or so lines of code.

So this is really a pointed question directed to DB driver coders, could a DateTime field be converted internally to support a pseudo-DateTime field?

(End rant)

Multi-columns

Splitting out the columns into individual components works, but is very messy (as I have done in Partial Date), and really would not recommend going down this path. I feel like I shot myself in the foot doing this and will look at reverting this in future releases.

Signed columns

Without rereading the iso specs again (or running tests), the positive sign is optional if I remember right, so the minus sign is needed which negates the sorting issues if I'm reading the ASCII tables correctly. (- is 45, and 0 to 9 range from 48 to 57). Using the + (ASCII 43) breaks the sort here.

Timezone support (or lack of)

I agree that only using a single timezone in the database is best (UTC) as it negates some of the negatives in ordering and searching data when not supporting native date db fields.

One Two rings to rule them all?

I agree with Sun, we should limit this to a very simple sub-set of the supported extended formats. Maybe only

-{0,1}Y{1+}-MM-DDTHH:MM:SS

While not supported currently with the Date module, timezone information does have meaning on dates, 2012-01-01 in America is nearly an entire day behind 2012-01-01 in Australia. So maybe the field support can be reduced down to an ISO-like field and an ISO-like field with timezone support? This drops time support, but I don't think that core should support Time fields.

Finally, Drupal 8 feature freeze: December 1st, 2012

Very old thread so probably

liquidcms's picture

Very old thread so probably all gone by the way side now... but i am just porting over a "reservations" module from D7 to D8. Module makes extensive use of EntityFieldQueries doing a date filter "between date1 and date2". I think this worked because the date fields in the db were of type datetime.

Now that we have switched these database values to varchar, is there any way to do an entity.query condition to do my filter?

I would suggest you take a

vensires's picture

I would suggest you take a look at this article: https://www.webomelette.com/query-entities-using-dates-drupal-8. It concludes to the following code but reading it as a whole would be better:

<?php
$timezone
= drupal_get_user_timezone();
$start = new \DateTime('now', new \DateTimezone($timezone));
$start->setTime(16,0);
$start->setTimezone(new \DateTimeZone(DATETIME_STORAGE_TIMEZONE));
$start = DrupalDateTime::createFromDateTime($start);

$end = new \DateTime('now', new \DateTimezone($timezone));
$end->setTime(18, 0);
$end->setTimezone(new \DateTimeZone(DATETIME_STORAGE_TIMEZONE));
$end = DrupalDateTime::createFromDateTime($end);

$query = \Drupal::entityQuery('node');
$query
 
->condition('field_date', $start->format(DATETIME_DATETIME_STORAGE_FORMAT), '>=')
  ->
condition('field_date', $end->format(DATETIME_DATETIME_STORAGE_FORMAT), '<=');
$results = $query->execute();
?>

Date API

Group organizers

Group notifications

This group offers an RSS feed. Or subscribe to these personalized, sitewide feeds: