Which Site Search do you use?

geoffb - Tue, 2008-01-08 22:56

Can anybody suggest what the best site search technique might be for a 30,000+ article site?

We tried using the Drupal search module (indexing words, posts and users in our site's content) about 12 months or so ago. But we had problems, principally due to CCK fields from what I can remember, and so we jettisoned it in favour of Google Custom search.

Google is okay, but we'd like something more customised than they allow.

We currently run 4.7, but are looking to upgrade soon.

Any advice gratefully received.

  • Geoff

The Newest Site Search for Our Organisation

rdsmith - Wed, 2008-01-09 03:03

Boston-Area PHP Ajax Java JSP ASP C++, ETC Computer Programmer / System Admin / Web Developer

Just today I installed faceted_search (http://drupal.org/project/faceted_search) on our development site at my organisation (it is currently a Drupal 5.5 install, on a server running PHP 5.2.5 and MySQL 5.0.25), and it really provides a refined search capability across the different types of content on the site. I read your posting above, and an upgrade to Drupal 5.x might be coming at a pivotal time. I like this module (highly proficient and streamlined) and will hopefully commit some more facets to the project. The way faceted_search works is that it teams up with the native Search module in Drupal core and adds fine-tuned functionality to the search functions.
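
To give a rough idea of what "facets" mean in practice, here is a minimal Python sketch. It is not the module's code (faceted_search itself is a Drupal/PHP module), and the sample records, field names, and function names are all made up for illustration. It only shows the general idea: count the hits under each value of a field, then narrow the hits to one chosen value.

```python
from collections import Counter

# Hypothetical search hits; titles and field names are illustrative only.
results = [
    {"title": "Budget vote", "type": "story", "author": "geoff"},
    {"title": "Council photos", "type": "image", "author": "anna"},
    {"title": "Budget analysis", "type": "story", "author": "anna"},
]


def facet_counts(hits, field):
    """Count how many hits fall under each value of one facet field."""
    return Counter(hit[field] for hit in hits)


def refine(hits, field, value):
    """Keep only the hits whose facet field matches the chosen value."""
    return [hit for hit in hits if hit[field] == value]


type_counts = facet_counts(results, "type")   # counts per content type
stories = refine(results, "type", "story")    # narrow to one facet value
author_counts = facet_counts(stories, "author")
```

Each refinement can be followed by a fresh count on another field, which is what lets a visitor drill down step by step.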

Hope This Gives You Some Ideas,
Cheers,

Robert D. S.

External to Drupal

yelvington@drup... - Wed, 2008-01-09 15:02

We're treating search as an issue external to Drupal, as we need to index a wide assortment of content from multiple production systems. We're using Fast (which is quite expensive). We've developed a Drupal module to help Fast find and index new Drupal content. Example is here: http://search.savannahnow.com/


Sphinx

Nikolai Thyssen - Wed, 2008-01-09 15:55

We also use an external search engine - basically the search in Drupal core will not scale. We have been very happy with the open source solution SphinxSearch (http://www.sphinxsearch.com), which can be used as an application or as a MySQL storage engine (handy!). It's very fast and can index 150,000 articles in a matter of seconds. Example is here: http://information.dk/find


Sphinx

agentrickard@dr... - Wed, 2008-01-09 16:18

I understand that NowPublic is using Sphinx as well. There is also a group dedicated to the Lucene, Nutch, and Solr search engines.

http://groups.drupal.org/lucene-and-nutch

RobertDouglass is a great resource in this regard.

--
http://ken.therickards.com/
http://savannahnow.com/user/2
http://blufftontoday.com/user/3


Sphinx Search

michaelemeyers@... - Fri, 2008-01-18 18:35

using a mysql (or any) database for intensive searching is like jamming a square peg in a round hole

we have seen great success with http://www.sphinxsearch.com/ - we use the standalone option

we can search ~1 million records in milliseconds. it took just over one person-week to set it up, test it, and roll it out, and about the same time for research / testing and determining which search system we wanted to use. we didn't need any additional hardware either - we run the indexer on our secondary db, and searchd's on every web server

mysql and many very large torrent sites like thepiratebay.org also use it - so it'll scale to meet your needs.
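
For anyone wondering why a dedicated engine copes with this so much better than scanning database rows, a toy inverted index shows the general idea. This is only a sketch in Python and not how Sphinx is actually implemented; the sample documents and function names are made up. The point is that the word-to-document map is built once, ahead of time, so a query becomes a few lookups and a set intersection instead of a pass over every record.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings)


# Made-up sample documents keyed by id.
docs = {
    1: "Drupal site search",
    2: "Sphinx search engine",
    3: "Drupal and Sphinx",
}
index = build_index(docs)
hits = search(index, "sphinx search")
```

A real engine adds ranking, stemming, and on-disk storage on top of this, but the lookup-instead-of-scan principle is the same.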


writeup

moshe weitzman - Fri, 2008-01-18 19:03

hi michael. i think many people would love to read a writeup or hear a session about your search at drupalcon.


Ultraseek

agaffin - Fri, 2008-01-18 19:01

We still use it, I still like it, even though it feels more and more like abandonware. You can customize it out the wazoo, both in terms of UI and back-end stuff (heck, we used to even use it to generate our RSS feeds).

Using it with Drupal meant spending some time (well, probably an hour at most) figuring out which sorts of URLs not to index, since Drupal can be absolutely wonderful at spitting out gazillions of URLs pointing to the same basic content, especially if you use things like forward-to-a-friend and printer-friendly.
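
The URL weeding described above can be sketched roughly like this in Python. This is not Ultraseek configuration, just an illustration of the filtering logic. The path prefixes are assumptions about what the printer-friendly, forward-to-a-friend, and comment-reply pages look like on a given site, so check what your own modules actually emit before relying on them; `should_index` is a made-up helper name.

```python
import re
from urllib.parse import urlsplit

# Assumed path prefixes for duplicate-content pages; adjust for your site.
EXCLUDE_PATTERNS = [
    re.compile(r"^/print/"),          # printer-friendly copies (assumed path)
    re.compile(r"^/forward/"),        # forward-to-a-friend forms (assumed path)
    re.compile(r"^/comment/reply/"),  # comment reply forms (assumed path)
]


def should_index(url):
    """Return False for URLs that only repeat content found elsewhere."""
    path = urlsplit(url).path
    return not any(pattern.match(path) for pattern in EXCLUDE_PATTERNS)


# Hypothetical crawl list on a placeholder domain.
urls = [
    "http://example.com/node/123",
    "http://example.com/print/123",
    "http://example.com/forward/123",
]
to_index = [url for url in urls if should_index(url)]
```

Keeping the patterns in one list makes it easy to add another once the crawler turns up a new family of duplicate URLs.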