Hi:
I am a relative newbie and have been reading up on the different types of search available for Drupal. I am most interested in faceted search for my particular needs. Solr seems great but, as a startup, I have a tight budget, so I'm trying to decide between the Lucene API module and the Faceted Search module. Both also seem great, but I have a few questions:
- Does the Lucene API module work on shared hosting? From what I've read it is unclear: sometimes it may, but what determines whether it will? Is it just what the shared hosting service offers, or are there other requirements as well?
- From my reading and research, both the Lucene API and Faceted Search modules seem very good, but the Faceted Search module becomes unrealistic, performance-wise, beyond approximately 20,000 nodes, where it begins to bog down. So, to evaluate properly which is best for my website, I need to know: are facets nodes?
- Is it possible to begin with the Lucene API module, then transfer the existing database to Solr (seeing that Solr runs on top of Lucene)? Or does one have to start from scratch and rewrite the whole database to get it into Solr?
- Seeing that the maintainer of the Faceted Search module is now working on Solr at Acquia, does that mean there will be an application or bridge written to transfer existing Faceted Search module databases over to Solr?
- Can the Location module connect with the Lucene API or Faceted Search modules? (My sense is that it can't, seeing that I haven't found any info for any of the three modules saying that the Location module can be connected.)
Thanks, :-)
Ursula
Comments
According to the Search Lucene project page, it is a port of Lucene from Java to the Zend framework, so it's pure PHP ... which should run in a hosted environment.
Solr, on the other hand, requires a Java servlet container (it comes bundled with an example Jetty container, but it's commonly run on Tomcat, and Glassfish works with some fiddling). You'll likely need root access to install, configure and run a Java server, so you'd need a VPS or co-located box. These can start fairly cheaply: I pay $20/mo for a VPS from Linode where I run multiple Drupal sites, as well as Glassfish for my Solr instances.
The big positive for Solr is that it takes the Drupal database out of the picture for search operations. The DB is only queried to prepare content for indexing; during searches, queries are sent to the Solr instance. From my reading about the Lucene API module, it seems to cache previously run queries and facets, but still hits the DB for new ones. Someone please correct me if I've got the wrong impression. Where I can see an issue with this is if you embedded search results within node displays, e.g. to show a More Like This block, which could cause more frequent database hits.
I can't speak to the upgrade path (if one exists) between modules like Faceted Search, Lucene API and Solr Integration, sorry.
Have you looked at Acquia's hosted Solr search option? The biggest hassle I've found is actually getting the app container set up, and secured properly, and managing doing so for multiple Drupal sites. Acquia takes care of that for you (for a fee, of course).
Regarding facets as nodes - Solr Integration at least doesn't store facets as nodes, it packages up nodes as XML and feeds them to the Solr instance, which does the indexing. I don't know about the others. Lucene API caches results somewhere, but I haven't looked closely enough to tell you if they're nodes, custom DB tables, the cache tables or flat files. I expect that they're not nodes, and if they are, they're not actually indexed along with other nodes.
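To make the "packages up nodes as XML" step a bit more concrete, here is a minimal Python sketch of wrapping one piece of content in Solr's XML update format (`<add><doc><field name="...">`). The field names (`id`, `title`, `tags`) and the `node_to_solr_xml` helper are made up for illustration; they are not the schema or code that the Solr Integration module actually uses.

```python
import xml.etree.ElementTree as ET


def node_to_solr_xml(node):
    """Wrap one node's fields in a Solr <add><doc> update message.

    `node` is a plain dict; list values become repeated (multi-valued) fields.
    The field names are hypothetical, not a real Drupal/Solr schema.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in node.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    example = {"id": "node/42", "title": "Hello", "tags": ["search", "drupal"]}
    print(node_to_solr_xml(example))
```

The resulting string is what would be sent to the Solr instance for indexing; the Drupal database is only read to build it.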
-chris
Thanks, Chris, for your great help and info. Yes, the Apache SOLR setup & securing issue is the great hindrance for a newbie like me... :-)
Try out Acquia Search!
Acquia's hosted search is free for 30 days and quite reasonable after that. Depending on the size of your site, it could be as little as $300-$400/yr and that's including some support tickets and site monitoring.
It only takes about 15 minutes to set up, so you might want to try it to see how you like the Solr functionality.
Disclosure: I helped build the Acquia Search Service
Other Disclosure: I wasn't the only one, so it's pretty awesome! :)
Thanks, Jacob, for your suggestion. I do think Acquia is a great thing, but, to understand better, my hesitancy is that it seems the $3-400/yr is for basic support, and then the SOLR service is additional(?).
I do think that Acquia taking charge of the search aspect is a great and needed service for us Drupalers out there, and I feel it will be a viable solution once my site grows (outsourcing is a good thing :-). It's very likely I will go that route with you then. For now, though, I'm just starting out with a limited budget and three sites to build, so the $3-400 plus SOLR service (which is what I'm ultimately after), multiplied by three sites, is a bit steep, and I feel I need an alternate solution for the time being. As a suggestion, it may help if you offered a monthly subscription instead of an annual lump-sum payment. (Please correct me if I'm wrong and you do offer a monthly subscription. :-)
Sorry for the confusion. Can you point me to the page where you got that information?
In fact, we are giving away quite a bit of search with the basic subscription, 10 slices == 100MB.
Regarding the cost: a subscription also entitles you to some support services, site monitoring services, Mollom spam filtering and our private knowledge base.
I think at less than $50/mo it brings a lot of value. If nothing else, a 100MB Solr search index would cost more than that to run yourself, even if you knew exactly what you were doing and spent absolutely no time on it. Plus, you have the peace of mind of knowing that we are taking care of the hosting, so you don't have to worry about performance if your site takes off.
If you have multiple sites, our sales team could possibly cut you a deal. I think they do give volume discounts at some level (perhaps at 5 sites?).
Here is a reference to the actual pricing:
http://acquia.com/node/1062658
Best,
Jacob
Hi Ursula.
Full disclosure, I maintain the Search Lucene API module, so I will try to be as unbiased as possible. SLAPI works great on shared environments, and it was actually designed to provide powerful Lucene search capabilities to that audience. It is a fully integrated solution, so it is very easy to install and requires no maintenance other than running cron. However, the trade-off for simplicity is scalability. The limit changes depending on how much data you are indexing, but a good rule of thumb is not to use Search Lucene API for installations with more than 5,000 indexed nodes (this will change with ElasticSearch + Java Lucene integration in the 3.0 API). I do know that http://cmsreport.com/ has a fair amount of content and is having no problem with the scalability of the module.
Also, facets are not nodes. They just provide a way to drill down into your search results. Neither Lucene nor Solr uses a database per se, but both store Lucene indexes on the filesystem. However, my best guess is that the indexes are not transferable between the applications.
Unfortunately, Search Lucene API does not integrate with the Location module. I know there is a "Localsolr Integration" module, so I would recommend taking a look at that if you are able to run Solr.
With all that being said, I am a huge fan of the Apache Solr Search Integration project, and the Acquia Search solution is an awesome, hassle-free, enterprise grade search solution. If you do opt for one of those two solutions, I am sure you would be happy with the results.
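As a rough illustration of the point that facets are not nodes but a way to drill down into results, here is a small Python sketch. It treats facets as counts computed over the current result set, and drilling down as filtering that set. The sample documents, field names and helper functions are all made up for the example; this is not how Search Lucene API or Solr implement facets internally.

```python
from collections import Counter

# Hypothetical search results; each dict stands in for one indexed document.
RESULTS = [
    {"title": "Intro to Solr", "type": "article", "author": "chris"},
    {"title": "Lucene on shared hosting", "type": "article", "author": "ursula"},
    {"title": "Search meetup", "type": "event", "author": "chris"},
]


def facet_counts(docs, field):
    """Count how many documents in the result set carry each value of `field`."""
    return Counter(doc[field] for doc in docs)


def drill_down(docs, field, value):
    """Narrow the result set to documents whose `field` equals `value`."""
    return [doc for doc in docs if doc[field] == value]


if __name__ == "__main__":
    print(facet_counts(RESULTS, "type"))
    articles = drill_down(RESULTS, "type", "article")
    print(facet_counts(articles, "author"))
```

Nothing new is stored as content here: the facet values are derived from fields the documents already have, which is why they do not add to the node count.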
Thanks,
Chris
Hi Chris:
Thanks so much for your help and clarification. SearchLuceneAPI is a very viable choice for me. I'm wondering...
The SLAPI project page states:
"Search Lucene API 3.0: 2.0 is stable! All new features will be submitted against the 3.0 API. Search Lucene API 3.0 will introduce the concept of "adapters" so administrators can choose which Lucene backend to use, for example PHP Lucene (by the Zend Framework), Java Lucene, or ElasticSearch Lucene."
Thanks,
Ursula
Just so we are clear...
Sorry for bringing this thread back to life... just wanted to clear something up that I think the OP was struggling with:
"the $3-400 plus SOLR service (which is what I'm ultimately after)"
Just so we are clear here, Solr is free as in beer. So there are no additional costs involved with using Solr over Lucene. Your only costs are ensuring your environment can maintain the index, of course :)
Geoffrey