Search

Let's improve Drupal's core search. With higher MySQL requirements in 6.x we should be able to do better.

Core search modules: search.module, but also node.module and user.module. Existing contrib search modules: views_fastsearch, porterstemmer, semantic search, faceted_search, Apache Solr (I'll add more as people point them out to me).

Here are pending search issues that need review.

There are a few looming search indexing bugs, and some good ideas about performance, refactoring and better hooking. Please help review the above issues. My preference is to focus on improving the queries and search performance, but if you have other ideas and are willing to jump in with patches, please do so.

Search session for Drupalcon Szeged

group: Search
robertDouglass - Thu, 2008-07-03 20:05

I've proposed an extended search session for Szeged. The timing of the session is important as a way to keep momentum going from the Minnesota Search Sprint and focus on what we can achieve in Drupal 7. It will also provide a chance for people to talk about alternatives to Drupal core, what core can do to support the growing number of 3rd-party solutions, and what core can learn from things like ApacheSolr.

http://szeged2008.drupalcon.org/sessions/drupal-search-where-are-we-wher...

Please vote if you like the session.


Multiple Stemmers seem to conflict

group: Search
ducdebreme - Wed, 2008-06-25 07:43

We have a multilingual website providing German, English, French and Italian content on a Drupal 5 instance.
We used to use Drupal's internal search and had many issues with terms that were not found. I found that the problems might be solved using stemming.
So I installed stemmers for all the languages: porterstemmer, de_stemmer, ...

Node Link Ranking Factor

group: Search
BlakeLucchesi - Fri, 2008-06-20 01:31

Just wanted to let everyone know that I submitted a patch that uses the new ranking hook to rank nodes with more inbound links higher in search results.

Take a look, give it a try, and please give feedback.

http://drupal.org/node/257216


Improving Drupal's Search Speed Under Load Conditions

techsoldaten - Wed, 2008-05-14 02:04

Trellon recently released the Xapian search module for Drupal. This replaces Drupal's standard search feature seamlessly (except for a core patch) with an interface to the Xapian search engine. There's a post about it over on Trellon.com.


Xapian Search for Drupal - Metrics and Benchmarks

groups: Enterprise · Search
techsoldaten - Wed, 2008-05-14 01:38

I posted some internal benchmarks from the Xapian module we are working on to the Trellon.com blog. The post describes the performance gains Xapian can deliver and includes some benchmarking on a really, really slow server.

Would love it if someone could verify those results. We are about to take the sample size of the data up to about 1 million nodes and repeat.

M


Some brainstorming notes from the sprint

group: Search
robertDouglass - Fri, 2008-05-09 21:49

Totally disorderly and mostly here for our own reference =)

  • Refactor node/user search implementations into own modules.
  • Control over the search interface.
  • Moving stuff between adv. search form and main search form and/or block.
  • Full search-building interfaces. Create search environments; each environment has its own settings, e.g. which content types are in the search? What does the interface look like? Analogous to building a view with fastsearch.

Drupal's Search Framework: The execution of a search

group: Search
robertDouglass - Fri, 2008-04-25 07:26

Drupal’s ambitious search module provides a framework for building searches of all kinds. By isolating the tasks involved in searching, and allowing the actual search implementations to be handled by other modules, the search framework sets the stage for all sorts of creative search applications. This article, which applies to Drupal 6, explores the structure of the search framework by following the steps needed to execute a search.

Full article: http://acquia.com/blog/drupals-search-framework-the-execution-a-search


Summer of Code 2008: Search Scoring Improvements

groups: SoC 2008 · Search
BlakeLucchesi - Tue, 2008-04-22 09:45

Project Information:

Current Status: Working to complete tasks outlined below and get them ready for acceptance to D7 core. The Drupal project page can be found here: http://drupal.org/project/search_score_improvements. The project page will contain a downloadable package of all code contributed from this project.

Description:


Minnesota Search Sprint

group: Search
robertDouglass - Mon, 2008-04-21 16:14
Start: 
2008-05-09 09:00 America/Chicago - 2008-05-11 20:00 America/Chicago

Continuing the great and growing tradition of bringing people together in small groups to attack focused problems, a search-related code sprint will take place from May 9-11 on the campus of the University of Minnesota. Goals of the sprint will be to enhance the search framework, identify improvements to the core search implementations, and consolidate efforts towards making Drupal 7 a fantastic release for search. Sponsors for the search sprint include University of Minnesota Libraries, Acquia, McDean, Inc. / OpenBand, Workhabit, CivicActions, The University of Michigan, Laboratoire NT2, and BoldSource. Attendees include Earnest Berry, Robert Douglass, Chad Fennell, Doug Green, Michael Hess, Djun Kim, David Lesieur, and Blake Lucchesi.


Test spelling suggestions for D7

group: Search
robertDouglass - Wed, 2008-04-16 17:49

http://drupal.org/node/247482

This patch adds spelling suggestions to the page that is returned if no search results are found.

This is done by utilizing the Levenshtein algorithm for calculating the nearness of words.
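
To make the idea concrete, here is a minimal sketch of Levenshtein-based suggestions. It is written in Python so it runs standalone, and it is not the code from the patch: the function names `levenshtein` and `suggest`, the `max_distance` cutoff, and the sample vocabulary are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def suggest(word, vocabulary, max_distance=2):
    """Return the closest vocabulary word within max_distance, or None.

    `vocabulary` stands in for the set of indexed words (an assumption).
    """
    best = None
    best_distance = max_distance + 1
    for candidate in vocabulary:
        distance = levenshtein(word.lower(), candidate.lower())
        if distance < best_distance:
            best, best_distance = candidate, distance
    return best
```

With a vocabulary of `["drupal", "search", "module"]`, a query for `drupl` would be one edit away from `drupal`, so that word would be offered as the suggestion.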


Search issue queue - dig in!

group: Search
robertDouglass - Tue, 2008-04-08 21:15

This is a list of issues in the D7 issue queue that pertain to search. They've been vetted and confirmed to be real issues. They cover the entire spectrum, from advanced feature requests to bugs to theming issues. This is a great place to start your quest to make search in D7 a better thing. The table should be re-organized a bit. We should sort it into bugs and features, and make some priority judgements. Remove rows as issues close.


Google Search Appliance Module Released!

group: Search
JacobSingh@drup... - Tue, 2008-04-08 19:09

Hi folks,

Please see Google Appliance Module for Drupal.

This module currently supports keymatches, keywords, i18n compat (see the module page for more info).

I will be updating it regularly with new features, as the base class is robust enough to handle an advanced search, sorting, etc. It's currently in beta because I'd like some more feedback on the architecture / integration stories before rolling out more features, but a version of this is in production on a large site.


Search Scoring Improvements

groups: Search · SoC 2008
BlakeLucchesi - Mon, 2008-03-31 04:44

Background


hook_search

group: Search
mihksoft - Thu, 2008-03-27 21:36

Hi!
I have a table author with author_id, author_first_name, author_last_name, author_date_of_birth, author_date_of_death, author_website, author_email, author_language_id, author_isDeleted.
I have made a module called author, implemented hook_menu, and built a table view for my table and an edit form. I have also created a new content type called Author. I have implemented hook_search in my module:

function author_search($op = 'search', $keys = NULL) {
  switch ($op) {
    case 'name':
      return t('Content');
    case 'search':
      $find = array();


SearchAPI Module

BlakeLucchesi - Wed, 2008-03-26 19:10

The following is my first revision of a proposal to create a search API module. I'd love to get some feedback.

Project Details
A Drupal search API would allow for separation between the search interface that end users interact with and the back-end indexing and retrieval work that a search engine performs. The advantages to creating a search API are:


Improving the Apache Solr Search Integration module

drunken monkey@... - Wed, 2008-03-26 16:47

I am planning to hand in a proposal on improving the Apache Solr Search Integration module.
The project would include:

  • Porting the module to Drupal 6 (if necessary)
  • Integration in Views 2, enabling the use of Views 2 as a front-end to display the search results
  • Writing simpletest unit tests for this module, especially for the new functionality

What's your opinion on that? I have already contacted Robert Douglass to ask for his.


Refactoring core search

nedjo - Tue, 2008-03-25 22:30

Drupal's search APIs received some good attention at the recent Boston Drupalcon. Following up on discussions there, here is an attempt to draw together ideas on directions for refactoring core search. Please wade in and add your ideas and observations.

Existing core search

Drupal core search is implemented in an integrated way, providing a powerful working solution but little flexibility. Core search integrates several distinct pieces, among them:

  • For nodes, a custom SQL-based indexing solution.
  • For nodes, an SQL-based search algorithm.

Dynamic content view

groups: SoC 2008 · Search
garthee - Thu, 2008-03-20 08:12

I think the way information is presented and search results are displayed can be optionally enhanced. This is somewhat related to the following discussions, but it is a new idea:
1. http://groups.drupal.org/node/9934
2. http://groups.drupal.org/node/9946

A short description of what I would love to see with this module


Sphinxsearch integration

johsw@drupal.org - Fri, 2008-03-14 14:41

Following Yelvington I'm crossposting this to the following groups: SOC2008, Knight Foundation, Newspapers on Drupal and Search

One thing that would be really cool is a module integrating http://www.sphinxsearch.com/ and Drupal. Our experience is that core search doesn't play nice when you have a lot of nodes (we have 150,000+). Indexing simply kills the server.

So instead we use Sphinx. It's REALLY fast, both when searching and when indexing. BUT every time we alter our content types we have to manually reconfigure the Sphinx configuration. This is why I propose this as a module for the SOC08: a module that integrates Sphinxsearch and Drupal.


Building a killer search for Drupal

David Lesieur@d... - Thu, 2008-03-06 02:20

We've had a good discussion today at Drupalcon, in a BoF session led by Robert Douglass. Here's the plan that emerged to build a killer search for Drupal that will help take us Drupalers further towards world domination. ;)


Finduser - a new search module for .... finding users

group: Search
robertDouglass - Sun, 2007-11-25 22:04

http://drupal.org/project/finduser

This is a custom search module for users. It provides a search page that can search for users by username, email, or a custom text field. The custom text field can come from any content type, thus this module plays nicely with any user-as-node strategy.

The module does one thing and does it well, and is independent of search module. It has enough customization options to allow you to bend it to your will, as well as a healthy number of themable functions.

On the administer settings page for the Find user module you have the chance to set many configuration options. If you have CCK enabled and have added any text fields to any content types, these fields will show up as options to be searched. This allows you to use a module like nodeprofile or usernode or bio to extend your user profile and make it searchable via the Find user module.


Fuzzy Search Needs Testing

groups: Search · SoC 2007
BlakeLucchesi - Sun, 2007-08-19 08:00

As I enter the home stretch of the project, I have gotten a small amount of feedback from various testers regarding my fuzzy search module, but so far none on how good the algorithm is at returning relevant results. I have just recently reworked my algorithm to tighten up the matching. I'd really appreciate it if someone with a site that has 400+ nodes could test the module and report how fast it performs and how relevant the results seem.


Fuzzysearch Module Initial Release

groups: Search · SoC 2007
BlakeLucchesi - Thu, 2007-08-09 08:03

The Fuzzysearch module is now available for testing and feedback. The results being returned during my tests have thus far been quite good and performance has been better than search.module (overall query times and page generation times were faster with fuzzysearch.module, testing based on results displayed by devel.module).

Please try it out if you have a chance and leave me some feedback/post issues on the project page.

http://drupal.org/project/fuzzysearch

You can try out the module and read more information about the release on my blog http://boldsource.com
-Blake


Fuzzy Search Engine Major Update

groups: Search · SoC 2007
BlakeLucchesi - Fri, 2007-07-27 09:09

Over the past week and a half or so I have made a lot of progress on my project. The following are the main accomplishments:


Fuzzy Search Scoring Hook (SoC Update)

groups: Search · SoC 2007
BlakeLucchesi - Fri, 2007-07-13 08:58

I'm contemplating how to implement a scoring factor hook into my new fuzzy search engine module. I believe what may work out to be the best way of doing this is allowing modules to tap into this scoring hook at the time of indexing.

Each hook should return a value between 0 and 10 as a score to add to the node being indexed. Then, in the administration screen, an administrator would be able to set the importance of the score given by that hook. This would allow the administrator of a Drupal site to manage the scoring from different contributed modules.
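
One way to picture the combination step is the small sketch below. It is written in Python so it runs standalone and is only an assumption about how such scores might be combined, not the module's code: the function name `index_time_score`, the hook names, and the choice of a weighted sum are all hypothetical.

```python
def index_time_score(hook_scores, importance):
    """Combine per-hook scores into one value to add to a node at index time.

    hook_scores: {hook_name: score}, where each score should be in 0..10.
    importance:  {hook_name: weight} as set by an administrator (assumed).
    Hooks without a configured importance contribute nothing.
    """
    total = 0.0
    for hook, score in hook_scores.items():
        clamped = max(0.0, min(10.0, float(score)))  # enforce the 0..10 range
        total += importance.get(hook, 0.0) * clamped
    return total
```

For example, scores of 8 from a hypothetical `comments` hook and 5 from a hypothetical `links` hook, with importances of 2 and 1, would combine to 2 × 8 + 1 × 5 = 21.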


Fuzzy Search Engine Updates

groups: Search · SoC 2007
BlakeLucchesi - Mon, 2007-07-09 05:22

Fuzzy Search Module (search_fuzzy.module)


Enhanced Search Update

groups: Search · SoC 2007
BlakeLucchesi - Mon, 2007-07-02 08:20

Part 1 of my project (implementing synonym matching in the search index) is nearly complete; I am waiting for the patches to be accepted into core for Drupal 6. In addition to synonym matching, I also submitted a patch to index usernames with the nodes, as requested in the Search group on drupal.org. The patches can be reviewed here; all comments are welcome.

http://drupal.org/node/155262 - Taxonomy synonym search indexing
http://drupal.org/node/155254 - Username search indexing

For part 2 of my project I am going to implement a fuzzy search engine in Drupal.


Search Engine Enhancements

groups: Search · SoC 2007
BlakeLucchesi - Mon, 2007-06-11 09:57

This is my first update on groups.drupal.org about my SoC project on enhancing Drupal's search engine. I still have 3 more days of finals so I haven't gotten to any of the coding, but after much research and thought (and some kind help from my mentor Robert Douglass), I've decided on the following 2 parts as goals for my project.

Goals for Summer of Code Search Project:


Fuzzy Techniques and Implementation

group: Search
BlakeLucchesi - Fri, 2007-05-18 08:52

I've been doing my best to wrap my head around ways to make Drupal more capable of fuzzy search. The following are some goals of fuzzy search, along with some comments. I'm not exactly sure how this will help as of now, but I really feel that, along with improving the search engine's speed, we should look at ways to provide more relevant results.


Query Optimization

group: Search
douggreen - Thu, 2007-05-17 11:51

6.x is moving to MySQL 4.1. I think this removes the need for any support for temporary tables, but first I'd like to confirm that 4.1 supports subselects. We also need to evaluate the pgsql options. Does pgsql support subselects? Does pgsql support something similar to the ALTER IGNORE TABLE construct for adding a unique index to a table that has duplicates?
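
For readers who want to see the portable shape of the problem, here is a rough sketch that removes duplicates with a subselect and then adds the unique index. It uses Python's built-in sqlite3 purely as a stand-in so it runs anywhere; the simplified `search_index` columns, the index name, and the use of SQLite's `rowid` are assumptions, and MySQL and pgsql each need their own variant (MySQL, for instance, does not allow a DELETE to select from its own target table in a plain subquery).

```python
import sqlite3


def dedupe_and_add_unique_index(conn):
    """Delete duplicate rows, then add a unique index (SQLite stand-in)."""
    # The subselect picks one surviving row (lowest rowid) per group.
    conn.execute("""
        DELETE FROM search_index
        WHERE rowid NOT IN (
            SELECT MIN(rowid) FROM search_index GROUP BY word, sid, type
        )
    """)
    conn.execute(
        "CREATE UNIQUE INDEX search_index_uniq "
        "ON search_index (word, sid, type)"
    )
    conn.commit()


# Hypothetical, simplified table with one duplicated entry.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE search_index (word TEXT, sid INTEGER, type TEXT, score REAL)"
)
conn.executemany("INSERT INTO search_index VALUES (?, ?, ?, ?)", [
    ("drupal", 1, "node", 1.0),
    ("drupal", 1, "node", 1.0),  # duplicate of the row above
    ("search", 1, "node", 2.0),
])
dedupe_and_add_unique_index(conn)
remaining = conn.execute("SELECT COUNT(*) FROM search_index").fetchone()[0]
```

After the call, the table holds one row per (word, sid, type) and further duplicate inserts are rejected by the index.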

If you're just joining us, please also read: #143160 (search_index has duplicate entries) and #102088 (views_fastsearch 5.x-dev release notes)


A hook for search scoring factors?

group: Search
robertDouglass - Thu, 2007-05-17 11:36

do_search currently takes a bunch of parameters that are supposed to be SQL snippets used in building the search queries. The only example of these being used is in node_search(), where extra tables are joined to make scoring factors for things like comment count and page views. My idea is to explore whether each of these scoring factors could be added in a more organized way, in do_search, by invoking a hook:

hook_scoring_factor($search, $keys)
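
To show the general shape of such a hook, here is a rough sketch in Python (so it runs standalone) of collecting scoring factors from several implementations and assembling them into query fragments. Everything in it is an assumption for illustration: the registry, the `join`/`score` return shape, the table aliases, and the multipliers are hypothetical and are not part of do_search or of any patch.

```python
SCORING_HOOKS = []


def scoring_factor(func):
    """Register func as a (hypothetical) hook_scoring_factor implementation."""
    SCORING_HOOKS.append(func)
    return func


@scoring_factor
def comment_count_factor(search, keys):
    # Each implementation returns the join it needs and its score expression.
    return {
        "join": "LEFT JOIN node_comment_statistics c ON c.nid = i.sid",
        "score": "2.0 * COALESCE(c.comment_count, 0)",
    }


@scoring_factor
def page_view_factor(search, keys):
    return {
        "join": "LEFT JOIN node_counter nc ON nc.nid = i.sid",
        "score": "1.0 * COALESCE(nc.totalcount, 0)",
    }


def build_scoring_sql(search, keys):
    """Invoke every registered hook and merge the returned SQL fragments."""
    factors = [hook(search, keys) for hook in SCORING_HOOKS]
    joins = " ".join(f["join"] for f in factors)
    score = " + ".join("(" + f["score"] + ")" for f in factors) or "0"
    return joins, score
```

The point of the design is that the search function no longer needs SQL snippets passed in by its caller: each module contributes its own factor, and the dispatcher sums them.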

