Search module

jhodgdon's picture

I'm working on a site for a client, and they want to have a site-wide search function. So I tried using the Drupal core Search module. Two problems:

(a) Several of the pages on the site are views. It appears that Search doesn't index their header/footer text, which means for instance that if you search for "doctor list", the "Doctor List" view doesn't come up in search results. Also, the database-generated parts of views aren't indexed as being part of the view page, so for instance if you search on a particular doctor's name that appears on the doctor list page, you'll get the individual doctor node but not the view that also displays it.

(b) Some of the content of the site is not really meant to be displayed as nodes, but they come up in searches. I think I can use the "Restricted Search" module http://drupal.org/project/search_block to get around this problem, but I haven't tested it -- any other solutions people have used?

This has to be a fairly common issue... Are there more useful search modules out there, that would search the generated content of pages that are actually in the navigation, for instance? I looked on drupal.org in the "Search" category, but I didn't see anything... Not sure how easy it would be to write. Thoughts?

Thanks,
Jennifer

Comments

can't help with (a)

ksenzee's picture

Problem (a) is tricky. Those headers and footers always seem like views' unwanted stepchildren. It would be interesting to know how the apachesolr module handles them, especially since Acquia's coming out with a hosted version soon.

For problem (b), I've used the search_type module, which worked fine, but Restricted Search looks like it might be a better option.

theme based solution

jkopel's picture

I have dealt with both of these issues (in D5) through clumsy but effective theme templates.

For (a), I make nodes (of some simple type) for each header or footer (instead of putting them in the view). I then make a theme template for each view which includes the appropriate header/footer nodes and displays them. Finally, I add URL redirects which send any request for a header/footer node to the appropriate view page.

It hurts, but it works. A search will return the header/footer node, but clicking it will redirect to the view.

For (b), you need to override theme_search_item() in your template.php.
[http://api.drupal.org/api/function/theme_search_item/5]

I use something like:

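// In template.php, name the override phptemplate_search_item() or
// yourtheme_search_item(). Redeclaring theme_search_item() itself is a
// fatal error, because core already defines it.
function phptemplate_search_item($item, $type) {
  switch (strtolower($item['type'])) {
    case 'article':
      // return some formatted output
      break;
    case 'page':
      // return some formatted output
      break;
  }
  // Any other type falls through and returns nothing, so it is not shown.
}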

Then it only returns results for the included types and any other results are ignored.

Web developer @ tableau

Hmmm...

jhodgdon's picture

Thanks for those suggestions...

I guess for getting search to show the view header/footer, I can also embed a view (block) on a regular page (node), especially if it's mostly header and not much footer. jkopel's suggestion is very clever, somewhat of a hack, but clever. But it would be at least as relevant/important for search to find the view's content (fields, node list, whatever), which I guess it won't without writing a module of some sort... it looks like there's an "update_index" hook in search_cron... That could be very useful -- maybe I'll try it out and offer it as a patch for Views. Looks like an interesting project.

I also like jkopel's suggestion for (b) -- maybe I can return a link to a view for some content types, since the views on this site tend to be "show me a list of all the x content" types of pages, rather than subsets. That might actually work, not to mention being a simpler project.

search via google custom search

msteudel's picture

I have a search issue which is less drupal centric but perhaps it's an option.

I had the following issue:
* I have a site that has a lot of non drupal pages as part of the site (aka content not indexable by drupal)

Since I have this hybrid issue, I figured I needed to use a search engine outside of drupal. That's where google's paid site search comes in.
http://www.google.com/sitesearch/

I'm having our client sign up for it; it's only $100/year, so it's fairly cheap. I can have google's search api do all the cool spell check suggestions and technology to search our site. The paid search also allows me to remove all the ads and branding. I haven't figured out yet whether I'm going to use their AJAX api or their XML api to customize the search results.

XML API info: http://www.google.com/sitesearch/#xml
AJAX API: http://code.google.com/apis/ajaxsearch/

There is a google custom search module that I haven't checked out yet.
http://drupal.org/project/google_cse

It seems to me since google search just looks at the page it would include the footers and headers that you haven't gotten indexed via drupal. But I also know that it adds a bit more cost to a project and it's less drupal centric than other suggestions.

Mark

Yes, that might be best...

jhodgdon's picture

We're exploring Google search as an option too. It looks like it would be $100/year for their customized "Site Search", which isn't much for this particular client.

XML Sitemap

jdwalling's picture

It looks like XML Sitemap would be a good enabler for Google Site Search:
http://drupal.org/handbook/modules/gsitemap
http://drupal.org/project/xmlsitemap
6.x is still in development

Took a look at the CSE module

jhodgdon's picture

I took a quick look at the Google Custom Search Engine module (http://drupal.org/project/google_cse).

It doesn't appear to me that it buys you much. When you sign up for Google Custom Search, they give you a wizard page you can use to generate the search form; you can easily paste the HTML into a custom block and you are done. This module allows you to do the same thing from within Drupal, but I'm not sure why it's necessary or even why it's a good idea to bloat with another module when a small block is just as easy.

I guess if you want to use the JavaScript/iFrame version of search, the module might save you some time over using the wizards from Google. But that version of search doesn't have any graceful failing for people who don't have JS enabled, so it means only JS users can search your site. Lame (and anti ADA, as far as I know).

So what I will probably do is get my client on the paid version of Google Custom Search, which allows you to get results via XML, and write a Drupal module that will collect and display the XML results within the site. Assuming I do that, I'll get the module out to the public... I thought the google_cse module was going to do that, but it doesn't. Sigh.

Mark - if you are writing a Drupal module to do the XML display, let's combine efforts...

XML documentation

jhodgdon's picture

The documentation for the XML request/results for Google Custom Search:
http://www.google.com/coop/docs/cse/resultsxml.html
Their schema is weird, but it looks like it wouldn't be too hard to parse and display.

I can't believe no one has done this before, but I can't find anything...
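
For anyone curious what "parse and display" might involve, here is a minimal sketch using PHP's SimpleXML. The element names (GSP, RES, R, U, T, S) are the ones in Google's XML results format as I understand it; the sample document, its values, and the helper name example_parse_google_results() are made up for illustration and are not taken from any module.

```php
<?php
// Illustrative sample only: the values are invented. Titles and snippets
// carry HTML that is escaped inside the XML.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<GSP VER="3.2">
  <Q>doctor list</Q>
  <RES SN="1" EN="2">
    <R N="1">
      <U>http://example.com/doctors</U>
      <T>&lt;b&gt;Doctor List&lt;/b&gt;</T>
      <S>Our &lt;b&gt;doctors&lt;/b&gt;, sorted by name.&lt;br&gt; Call for an appointment.</S>
    </R>
    <R N="2">
      <U>http://example.com/node/12</U>
      <T>Dr. Example</T>
      <S>Profile page for one &lt;b&gt;doctor&lt;/b&gt;.</S>
    </R>
  </RES>
</GSP>
XML;

/**
 * Hypothetical helper: turns a results document into an array of
 * title/excerpt/url rows. Returns an empty array if there are no results
 * or the XML does not parse.
 */
function example_parse_google_results($xml) {
  $doc = simplexml_load_string($xml);
  $rows = array();
  if ($doc === FALSE || !isset($doc->RES)) {
    return $rows;
  }
  foreach ($doc->RES->R as $r) {
    $rows[] = array(
      'title' => (string) $r->T,
      'excerpt' => (string) $r->S,
      'url' => (string) $r->U,
    );
  }
  return $rows;
}

$rows = example_parse_google_results($xml);
foreach ($rows as $row) {
  print $row['url'] . "\n";
}
```

Casting each element to a string gives back the unescaped HTML, so the title and excerpt still contain tags like <b> at this point and need whatever filtering the site wants before display.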

would be happy to

msteudel's picture

Hey Jennifer, I haven't gotten the time yet to sit down and figure out this part of the project, but if we are both going to need this functionality for our projects I'd be happy to combine efforts. Obviously you have a lot more experience doing this sort of thing in the context of drupal, so I'll take your lead, just let me know how you want to go about this. We can take this offline too if that's more appropriate.

Mark

First pass done...

jhodgdon's picture

I got this working for my client's site (that is, using paid Google Custom Search, getting the results via XML, and displaying it within the site via a Drupal module). If anyone wants to test it out (you would need to have a paid Google Custom Search account), let me know.

I did it as a patch to the Google Custom Search Engine module (http://drupal.org/project/google_cse) mentioned above, since it fits in there well. I will be posting it to their issue queue soon, but if anyone wants to test in the meantime...

I tried this out works great

msteudel's picture

Hey Jennifer,

I tried this module out finally and it works great on the site. It was very easy to set up and get going. I had a question regarding how the results are being formatted, primarily the excerpt. Looking at the HTML spit out, there's a line break after a certain point. Do you know where this is controlled? In the Google control panel? Somewhere in the module?

Thanks, Mark

There's a theme function...

jhodgdon's picture

The module calls theme( 'google_cse_search_result_items', $results ) to format the results. The module has a sample function called theme_google_cse_search_result_items(), which you can override in your theme if you want to change how it is done.

It looks like the default function puts out three P tags with classes -- one for the title, one for the excerpt (which comes directly from Google), and one for the URL. It pretty much mimics what the Google search result page would look like, except it goes inside your site. I think the default function is pretty well documented if you want to override it...
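
As a rough idea of what an override could look like, here is a sketch. It is not the module's code: "mytheme" stands in for your theme's name, the CSS class names are invented, and the 'title', 'excerpt' and 'url' keys are assumptions about how each entry in $results is laid out, so check the default function for the real structure.

```php
<?php
/**
 * Sketch of a theme override. "mytheme" is a placeholder theme name, and the
 * 'title', 'excerpt' and 'url' array keys are assumed, not confirmed.
 */
function mytheme_google_cse_search_result_items($results) {
  $output = '';
  foreach ($results as $item) {
    // Escape the URL before using it in an attribute and as text.
    $url = htmlspecialchars($item['url'], ENT_QUOTES);
    $output .= '<div class="search-result">';
    // Plain-text title, linked to the result.
    $output .= '<p class="result-title"><a href="' . $url . '">' . strip_tags($item['title']) . '</a></p>';
    // The excerpt is Google's own HTML, passed through unchanged here.
    $output .= '<p class="result-excerpt">' . $item['excerpt'] . '</p>';
    $output .= '<p class="result-url">' . $url . '</p>';
    $output .= "</div>\n";
  }
  return $output;
}
```

The function goes in the theme's template.php; the only real change from the three-paragraph layout described above is the wrapping div and the linked title.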

I was considering posting

jhodgdon's picture

I was considering posting this question to the Drupal forums, and did another search on drupal.org first. I found this issue filed on Views, which basically said "you nutcase, of course you can't get Drupal search to index views the way you would hope": http://drupal.org/node/281056

So I guess I am a nutcase for wanting Drupal site search to actually search my client's site in a reasonable way.

I'm glad that our local DUG is more polite in responding to my query than whoever wrote that response to what was a perfectly reasonable question in the Views issue queue... :)

hmm looks like it's coming in from the xml

msteudel's picture

Hmmm ... ok I traced it down to the data coming in from google. Looks like they are putting in a <br> tag halfway through their excerpt fields. Did you play around at all with formatting the search results from google?

No...

jhodgdon's picture

I didn't try anything with the info coming from Google, because it suited the purposes of the site I was working on, and I also thought it was a good default for the module. You could probably use the PHP function strip_tags to help with your issue.
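
A minimal sketch of the strip_tags() idea, assuming the excerpt arrives as a string of HTML with a line break in the middle. The sample excerpt is made up; the second argument to strip_tags() lists the tags to keep.

```php
<?php
// Made-up excerpt, shaped like a snippet with a mid-sentence line break.
$excerpt = 'Our <b>doctors</b>, sorted by name.<br> Call for an appointment.';

// Drop every tag except <b>, so the keyword highlighting survives
// while the <br> goes away.
$clean = strip_tags($excerpt, '<b>');
print $clean . "\n";
```

Calling strip_tags() with no second argument would remove the bold highlighting as well, which may or may not be what you want.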

By the way, I haven't seen any action from the module maintainers on my proposed patch/addition. It might help if you replied to the ticket and said something like "any chance this will get into the module" or something like that. :)

Will do!

msteudel's picture

Thanks will do.

i can't find it

msteudel's picture

Hey Jennifer, I went looking for this to bump it but I couldn't find it in pending patches or issues queue, am I blind?

Here's the issue...

jhodgdon's picture

http://drupal.org/node/348311

An attachment on that issue contains the revised module I created.

Development & Infrastructure

sahasra's picture

i have same problem