Today at DrupalCon San Francisco, 11 of us gathered to discuss what the ideal architecture would be for Search in Drupal Core. Present were maintainers/users of core Search, various contrib search modules, Lucene, and Solr:
http://drupal.org/user/153120 - janusman
http://drupal.org/user/277371 - awolfey
http://drupal.org/user/29191 - douggreen
http://drupal.org/user/266779 - cpliaka
http://drupal.org/user/472460 - jpmckinney
http://drupal.org/user/10297 - unexpand
http://drupal.org/user/733232 - nihiliad
http://drupal.org/user/49851 - pwolanin
http://drupal.org/user/157079 - mradcliffe
http://drupal.org/user/155601 - jhodgdon
We all agreed that we want the core Search module to be pluggable/modular. Here are some notes:
* We want the whole system to be pluggable.
* Steps in the indexing/cron process:
- Decide what needs to be indexed (could be nodes, other entities, ...)
- Render each item, or build a structured renderable array
- Preprocess (stemming, n-grams, word splitting, etc.)
- Index each item
* Steps at search time:
- User interface - ask user for the search query (defined syntax, faceted, etc.)
- Preprocess (as in indexing)
- Query to get search results (with ranking)
- Post-process (spelling suggestions, etc.)
- Extract excerpts/highlight
- Display results
* All of these steps should be pluggable.
* Core search would be a framework that coordinates the steps and keeps track of what content needs to be indexed for each pluggable search index/retrieval framework
* We could also provide a Google/Yahoo/etc. search box, like the one you get in Firefox
* We would also provide (maybe as a contrib module) a default storage/retrieval method, basically the current core search mechanism but maybe limited to single keywords (for efficiency).
* Doug is also interested in building a MongoDB implementation of storage/retrieval
* The Solr, Lucene, etc. modules would also be able to build their own storage/retrieval implementations
* Needs to be language-aware and compatible with multilingual sites
* Needs to be extensible, such as supporting facets as one extension, "advanced search", etc.
* Write it using PHP objects
* Maintainers of Lucene, Solr, etc. will provide descriptions of what they needed to modify in the core framework to get things working
* Hopefully everyone will tell me what I missed or got wrong in this post.
* We'll have some meetings on IRC/Skype, working towards a sprint or work session
* Keep in touch with the GSoC person who's working on search, so that their work can hopefully feed into this effort
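
To make the notes above a bit more concrete, here is a minimal sketch of a coordinating framework with swappable steps. It is written in Python only for brevity (the notes call for PHP objects), and every name in it (`SearchFramework`, `InMemoryBackend`, `simple_preprocess`, `mark_dirty`, `run_cron`) is hypothetical; none of this is an existing Drupal API or something the group agreed on. It assumes items arrive already rendered as plain text, and it leaves out the UI, post-processing, and excerpt steps.

```python
from typing import Callable, Dict, List, Set

# Hypothetical plugin type: a preprocessing step is any callable that
# turns text into tokens, so it can be swapped for a stemmer, n-grams, etc.
Preprocessor = Callable[[str], List[str]]


def simple_preprocess(text: str) -> List[str]:
    """Lowercase and split on whitespace; a stand-in for real preprocessing."""
    return text.lower().split()


class InMemoryBackend:
    """Toy storage/retrieval plugin: an inverted index of keyword -> item ids.

    Re-indexing an item does not remove its old keywords; a real backend
    would have to handle that.
    """

    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = {}

    def index(self, item_id: str, tokens: List[str]) -> None:
        for token in tokens:
            self.postings.setdefault(token, set()).add(item_id)

    def query(self, tokens: List[str]) -> List[str]:
        # Rank by number of matching query tokens; break ties by item id.
        scores: Dict[str, int] = {}
        for token in tokens:
            for item_id in self.postings.get(token, set()):
                scores[item_id] = scores.get(item_id, 0) + 1
        return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


class SearchFramework:
    """Coordinates the steps and tracks what each backend still has to index."""

    def __init__(self, preprocess: Preprocessor) -> None:
        self.preprocess = preprocess
        self.backends: Dict[str, InMemoryBackend] = {}
        # One pending queue per backend: item id -> rendered text.
        self.pending: Dict[str, Dict[str, str]] = {}

    def add_backend(self, name: str, backend: InMemoryBackend) -> None:
        self.backends[name] = backend
        self.pending[name] = {}

    def mark_dirty(self, item_id: str, rendered_text: str) -> None:
        # Every registered backend needs to (re)index this item.
        for queue in self.pending.values():
            queue[item_id] = rendered_text

    def run_cron(self, name: str, limit: int = 100) -> int:
        """Index up to `limit` pending items for one backend; return the count."""
        queue = self.pending[name]
        batch = list(queue.items())[:limit]
        for item_id, text in batch:
            self.backends[name].index(item_id, self.preprocess(text))
            del queue[item_id]
        return len(batch)

    def search(self, name: str, user_query: str) -> List[str]:
        # The query goes through the same preprocessing as indexed content.
        return self.backends[name].query(self.preprocess(user_query))


framework = SearchFramework(simple_preprocess)
framework.add_backend("default", InMemoryBackend())
framework.mark_dirty("node/1", "Pluggable search in Drupal core")
framework.mark_dirty("node/2", "Solr search backend")
framework.run_cron("default")
print(framework.search("default", "Search backend"))  # prints ['node/2', 'node/1']
```

Keeping a separate pending queue per backend is one way to let a slow backend fall behind on cron runs without holding up the others; a real implementation would more likely track this in the database.
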
http://drupal.org/node/717654 (Search in D8 and beyond - basically a collection of feature requests for D8, somewhat categorized)