EOL Taxonomy Sprint: Goals and Progress


Goals

This sprint, sponsored by the Encyclopedia of Life, has many goals. The primary focus is to attach metadata to terms, to handle relationships between terms better, and to deal with the user interface and performance implications of large taxonomies.

Use Cases

  • A vocabulary with millions of terms: ability to move terms, search terms, even autocomplete.
  • Attaching metadata to terms: images, location information, species information, flags.

Projects

Core taxonomy improvements for D7

This is likely to include a number of small interface improvements, a cleanup of the menu implementation (which has some bugs and needs tidying up), hook_taxonomy_term_load() (plus save and delete equivalents) for cleanly adding data to the term object, and improvements to the way the taxonomy hierarchy is stored so that large trees can be retrieved.

See Taxonomy Code Sprint core issues for links to the individual patches.

Taxonomy XML (import/export)

Taxonomy XML has been upgraded to D6, and extra import/export formats have been added.

Taxonomies can be exported, shared, and merged. Overlapping vocabularies, or shards of vocabularies, can be knitted together on import or re-import.

Each format or dialect lives in its own support library, and new ones can be written and added as needed.
As well as the CSV and RDF formats, a reader/writer for TCS (Taxonomic Concept Transfer Schema), a highly technical taxonomy format standard, is in development.

Status Sep 10:

  • D6 port working.
  • D5 branch available.
  • Merge, RDF, and CSV supported.
  • Some sample docs (Dewey system, Countries and States) included in the package.

Done, In Testing:

  • Import from URL (instead of file upload)
  • TCS import: basic structure working.
  • Storage of term metadata (in association with taxonomy_enhancer): framework available, actual uses still to come.
  • TCS export: publishes a 'web service' type feed of machine-readable term definitions.

TODO: ?

  • Partial tree or per-term exports (currently only the full vocabulary can be dumped, which cannot scale to hundreds of thousands of terms)
  • Resolving external references. For scaling, we need partial tree imports. These should point to URIs where more data can be found. Ideally we want to automate (cron or batch) this recursive lookup, so as to handle HUGE taxa.
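
The batched, recursive lookup described in the last item could look roughly like the sketch below. It is written in Python purely as a language-neutral illustration and is not Taxonomy XML code. The names `import_batch`, `fetch`, `store`, and the `REMOTE` fixture are all hypothetical; `fetch` is assumed to return a term plus the URIs where its children can be found.

```python
from collections import deque

# Hypothetical stand-in for remote data: URI -> (term, URIs of further terms).
REMOTE = {
    "u:root": ({"name": "Animalia"}, ["u:a", "u:b"]),
    "u:a": ({"name": "Chordata"}, ["u:c"]),
    "u:b": ({"name": "Arthropoda"}, []),
    "u:c": ({"name": "Mammalia"}, ["u:a"]),  # back-reference, must not loop
}


def import_batch(queue, seen, fetch, store, max_fetches=50):
    """Resolve at most max_fetches queued URIs and return how many were fetched.

    Meant to be called repeatedly (for example once per cron run) until the
    queue is empty, so a huge tree is imported in bounded slices.
    """
    fetched = 0
    while queue and fetched < max_fetches:
        uri = queue.popleft()
        if uri in seen:
            continue  # already imported via another reference
        seen.add(uri)
        term, child_uris = fetch(uri)
        store(term)
        fetched += 1
        for child in child_uris:
            if child not in seen:
                queue.append(child)  # follow the reference in a later step
    return fetched
```

Because the queue and the set of seen URIs live outside the function, the import can stop after any batch and resume later; in a real module they would have to be persisted between runs.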

~dman

Taxonomy enhancer for D6

Taxonomy enhancer has been ported to D6. A few bug fixes need to be backported to D5. Originally, it supported only text fields. Now it supports multiple-value text fields with select and option widgets. Node references are available as select lists, but there is no autocompletion yet. The module implements the proposed hook_taxonomy_term_load.
Patch with D6 port posted at: http://drupal.org/node/305736

Term relation types

Term relation types was already highly functional before the sprint. The added implementations of hook_taxonomy_term_load and hook_taxonomy_term_save are very new, and still need to be tested. Term relation types keeps relationships between terms using the same term_relation database table used by taxonomy.module. Data about term relations are stored in other tables.

The module allows relationships to be typed, and each relationship runs in one direction only. This contradicts the usage in taxonomy.module, so the term relation form element on taxonomy.module's term form is disabled. The term relationships are stashed in a form value and re-inserted when the term is saved. Enabling, using, and later disabling this module could cause term relationships to be lost when they are loaded and saved through taxonomy.module's forms.
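
To make the contradiction concrete, here is a small Python sketch (an illustration only, not the module's code). It assumes that core treats a relation row as linking both terms to each other, while the typed module reads the same row as pointing from the first term to the second. The relation type names are invented, and the type is kept on the row only for brevity; as noted above, the module stores such data in other tables.

```python
from collections import namedtuple

# One row of a term_relation-like table, plus a hypothetical type label.
Relation = namedtuple("Relation", ["tid1", "tid2", "rel_type"])


def related_symmetric(relations, tid):
    """Assumed core-style lookup: a row relates both terms to each other."""
    related = []
    for row in relations:
        if row.tid1 == tid:
            related.append(row.tid2)
        elif row.tid2 == tid:
            related.append(row.tid1)
    return related


def related_typed(relations, tid, rel_type):
    """Typed, one-way lookup: only rows that start at tid and match the type."""
    return [row.tid2 for row in relations
            if row.tid1 == tid and row.rel_type == rel_type]


ROWS = [Relation(1, 2, "preys on"), Relation(3, 1, "parasite of")]
```

With these rows, the symmetric reading relates term 1 to both 2 and 3, while the typed reading of "preys on" relates term 1 to 2 and term 2 to nothing. The same stored rows therefore mean different things to the two modules, which is why editing them through the core form is unsafe.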

Taxidermy module

Taxidermy module bridges the core taxonomy.module and the proposed(?) taxonomy.module improvements for D7 by providing hook_taxonomy_term_save and taking over various taxonomy pages via the menu system. Other modules can depend on Taxidermy and implement hook_taxonomy_term_load and the other hooks now, which keeps them forwards compatible with D7 (assuming the core patch gets in). Taxidermy aims to imitate precisely the function arguments and return values of the D7 taxonomy.module, and it relies as much as possible on taxonomy.module functions to modify database tables. It may also be compatible with Drupal 5.

Functions

taxidermy_term_load($tid)

Returns a complete term object. All parents, relations and synonyms (core taxonomy.module features) are loaded. Hooks are triggered so modules that implement hook_taxonomy_term_load($term) can add their values to the object.
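
The load-then-invoke-hooks pattern can be sketched as follows. This is a Python illustration of the idea, not the module's PHP. The `term_load` function, the `CORE_TERMS` fixture, and the `add_image` hook are hypothetical stand-ins for the core lookup and for a module implementing hook_taxonomy_term_load.

```python
# Hypothetical core data: what taxonomy.module itself knows about each term.
CORE_TERMS = {
    7: {"tid": 7, "name": "Felis catus", "parents": [3], "synonyms": ["cat"]},
}


def add_image(term):
    """Example hook implementation: attaches extra metadata to the term."""
    term["image"] = "term-%d.png" % term["tid"]


def term_load(tid, core_terms, hooks):
    """Build the core term object, then let every hook extend it in place."""
    term = dict(core_terms[tid])  # copy, so hooks never alter the core data
    for hook in hooks:
        hook(term)
    return term
```

Calling `term_load(7, CORE_TERMS, [add_image])` returns the core fields plus an `image` key, while `CORE_TERMS` itself stays unchanged.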

taxidermy_term_save($term)

Returns SAVED_NEW or SAVED_UPDATED. Hooks are triggered so taxonomy term modules that implement hook_taxonomy_term_save($term) can store their properties.
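
A minimal Python sketch of that save flow is shown below (an illustration only, not the module's PHP). The `term_save` function, the in-memory `storage` dict, and the tid allocation are assumptions made for the example. The numeric values mirror Drupal's SAVED_NEW and SAVED_UPDATED constants.

```python
SAVED_NEW = 1      # values as defined by Drupal core
SAVED_UPDATED = 2


def term_save(term, storage, hooks):
    """Store the core fields, invoke the save hooks, and report what happened."""
    if term.get("tid") in storage:
        status = SAVED_UPDATED
    else:
        if term.get("tid") is None:
            # Hypothetical id allocation for the sketch.
            term["tid"] = max(storage, default=0) + 1
        status = SAVED_NEW
    # Only the core fields are written here; extras belong to the hooks.
    storage[term["tid"]] = {"tid": term["tid"], "name": term["name"]}
    for hook in hooks:
        hook(term)  # each module persists its own properties
    return status
```

A hook receives the full term object, including the tid assigned to a new term, so it can store its own properties keyed by that tid.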

Taxonomy Manager

The forms in Taxonomy Manager were reworked to use the new Form API features of Drupal 6. With these changes, other modules such as Taxonomy Enhancer can be integrated easily.

Notes

We are using #drupal-taxonomy on irc.freenode.net for discussion during the sprint, and it is henceforth the official namespace for any public taxonomy event.