Lucene, Nutch and Solr
Lucene is a fabulous indexer, and Nutch is a superb web crawler. Together, they can form the basis of a full-featured search engine. Lucene alone can be used to provide search services, such as those needed by a Drupal site. This group discusses the various projects and efforts being made to integrate these technologies with Drupal.
The ApacheSolr module integrates Drupal with the Apache Solr search platform. Solr search can be used as a replacement for core content search and boasts both extra features and better performance. Among the extra features is the ability to have faceted search on facets ranging from content author to taxonomy to arbitrary CCK fields.
Drupal projects that already provide some level of integration with Lucene and/or Nutch:
How to display all results with facets
I've been trying to figure out how to display a page that shows all of the results, a la what you see when you click "modules" at drupal.org. Can anyone point me in the right direction?
Optimizing Apachesolr for non-English languages
I have done a lot of research on optimizing Apachesolr for non-English languages, especially German. It turns out that search results can be dramatically improved by adjusting Solr's stemming and by breaking up compound words. Both can be achieved with slight changes to Apachesolr's schema.xml.
This post is about configuring stemming:
http://www.early-dance.de/en/news/9188-optimizing-apachesolr-non-english...
And this post is about compound word splitting, which is needed in languages like German that have long compound words such as "Dampfschifffahrt":
http://www.early-dance.de/en/news/9189-apachesolr-issues-german-and-othe...
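As a rough illustration of the two adjustments described above, a German field type in schema.xml might look something like the sketch below. This is not taken from the linked posts. The type name "text_de" and the dictionary file "german-words.txt" are hypothetical, and the size limits are arbitrary starting values. The filter factories themselves ship with Solr, but the analyzer chain should be adapted to the "text" field type that the Apachesolr module's own schema.xml defines.

```xml
<!-- Sketch only: "text_de" and "german-words.txt" are assumed names. -->
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Compound splitting: keeps the original token and adds any
         sub-words found in the dictionary (one lowercase word per line). -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="german-words.txt"
            minWordSize="5" minSubwordSize="4" maxSubwordSize="15"
            onlyLongestMatch="true"/>
    <!-- Stemming: Snowball stemmer with German rules instead of English. -->
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
  </analyzer>
</fieldType>
```

After a change like this, the existing content has to be re-indexed, because the analyzer is applied at index time as well as at query time.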
Architecture question re: huge indexes
I intend to have a Drupal 6 site with some custom node types indexed by ApacheSolr using the XML Schema provided. However, I'd like to add a lot of gigantic indexes to it that reside only in the Lucene/Solr system (ha) and not physically in my Drupal MySQL database. For example, I may have a thousand or even a hundred thousand nodes in Drupal, but I might have ten different "external" indexes with millions of records each, and I'd like to conduct faceted search against the whole lot of 'em.
Trying to adapt "project_solr module for D6" and needing some help
I have this params array:
$params = array(
  'fl' => 'id,nid,title,body,format,comment_count,type,created,changed,score,url,uid,name,sis_project_release_usage,ds_project_latest_release,ds_project_latest_activity',
  'rows' => variable_get('apachesolr_rows', 10),
  'facet' => 'true',
  'facet.mincount' => 1,
  'facet.sort' => 'true',
  'facet.field' => array(
    'im_vid_' . _project_get_vid(),
    'im_project_release_api_tids',
  ),
  'facet.limit' => 200,
  'sort' => $query->solrsort,
);
Search URL
I am trying to integrate Search Lucene API into my site, but have run into an issue with my ad servers. The URL is example.com/search/luceneapi_node/{querystring}. The problem is that "luceneapi_node" in the URL generates ads that are related to Lucene and open source. This is a business journal magazine site with ads coming from various feeds and networks, so getting rid of or changing the URL path would be ideal here.
How do I change the URL path, or create an Apache redirect, so that searches go to something like "/search/site/{querystring}" or "/search/qry/{querystring}"?
Apache Solr API discussion
Resources
Nitpicks
- Dependency on search module
- Do we need keywords as the path? Or can they just be GET parameters?
- $params is GOOD because it is totally openly hackable
- Do we want to create a set of new classes and each handles a type of search? Subclasses of a generic class?
- Do we need to replace or rewrite the PHP library?
- Is there a way to add facets without a custom module?
- Facets tend to just be displayed as lists.
Multi language search BoF session at DrupalCon Paris
Dear Drupalistas attending DrupalCon Paris,
Anyone who needs to be able to search content on a multi language site (via Apache Solr or Drupal's built-in search), please join the discussion at the BoF session
http://paris2009.drupalcon.org/session/multi-language-search
The session takes place on Thu 3 Sept 2009, 10:00 - 11:00, Rockefeller room (during second keynote session).
Please let us know if you intend to come by clicking the signup button. Thanks!
Apachesolr and Stemming in other languages
When I search for "tools" and for "tool" with Apachesolr, I get the same results. As far as I know, the reason is Apachesolr's stemmer, which reduces the word "tools" to "tool" and uses that for the search within the index. As far as I know, the stemmer is only aware of English stemming rules.
So how do things work when using Apachesolr with other languages? For example, I tried the German words "Kunden" (plural, "clients") and "Kunde" (singular, "client") and got different result sets. Obviously, the stemmer doesn't know the German stemming rules ... right?
How can the search results be improved for languages other than English? Are there German stemmers available to plug into Apachesolr?
August Northern Virginia Drupal Meetup - featuring Chuck D'Antonio from Acquia
Come join us for a presentation and discussion of Solr. We will talk about why it's better than core Drupal search and how it improves the user experience, and we will see it in action.
And we're very excited that Chuck D'Antonio, Acquia's Senior Director of Professional Services, will discuss Acquia Search.
Location:
RHODESIDE GRILL - downstairs
1836 Wilson Blvd
Arlington, VA 22201
(703) 243-0145
RDF for Solr: Possible improvements
The Apache Solr RDF module is now in a state where it can, in theory, already be used. However, there is much room for improvement, so I'd like to discuss some possible ways to improve it.
Chat Module
Hi Guys,
Can anyone recommend a good chat solution for Drupal 6? I am trying to evaluate different solutions like DimDim, 123FlashChat Server, and avchat (avchat.com).
Has anyone used any of these solutions? Would you recommend any?
I will be using this for a Learning Management solution. I would like to host online presentations, do whiteboarding, and be able to control the times when the chat sessions are on. All chat sessions will need to be stored and indexed for later retrieval. I will be using the Apache Solr module for that; that part seems to work fine.
How to implement apache solr for multiple sites
Hi All,
I want to know whether we can use a single Apache Solr instance for multiple sites, and what changes would be required to do so.
Jatinder
RDF for Solr: Possible implementation strategies
(For information about my project, see here. In short, it's about enabling Solr to index RDF data via Drupal.)
Before starting the actual coding, even on prototypes, the basic options for implementing this will have to be discussed. At the moment, my mentors and I see the following three possibilities:
Double click ad server
Hi All,
After adding the DoubleClick ad server for ads on my site, the page load time has increased and performance is slower than before.
I want to know how to improve page load performance while using the DoubleClick ad server.
Best,
Jatinder Cheema
Single Apache-Solr for multiple sites
Hi All,
I would like help configuring a single apache-solr-nightly engine for multiple sites with different languages.
Please help me configure Apache Solr for multiple sites.
Jatinder
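One possible layout for this, assuming a Solr 1.3/1.4-era build with multicore support, is to run a single Solr instance with one core per site, so that each site (and language) gets its own index and its own schema.xml. The sketch below uses the legacy solr.xml format of those releases. The core names and directories are hypothetical, and later Solr versions configure cores differently.

```xml
<!-- solr.xml in the Solr home directory. Core names and instanceDir
     values are assumptions; each instanceDir needs its own conf/
     directory containing schema.xml and solrconfig.xml. -->
<solr persistent="false">
  <cores adminPath="/admin/cores">
    <core name="site_en" instanceDir="site_en" />
    <core name="site_de" instanceDir="site_de" />
  </cores>
</solr>
```

Each Drupal site would then be pointed at its own core path (for example /solr/site_en rather than /solr) in the Apache Solr module settings.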
Adding RDF Support to the ApacheSolr module
Project information
Project page on drupal.org: ApacheSolr RDF Support
Student: Thomas Seidl (drunken monkey on d.o)
Mentor: Robert Douglass (robertDouglass)
Co-mentor: Stephane Corlosquet (scor)
Local mentor: Wolfgang Ziegler (fago)
Current status: Adding features
Description
This project will improve the ApacheSolr module by enabling it to handle (i.e., index and search with a comfortable UI) any kind of RDF data. This will instantly make it possible to provide meaningful searches for all site content that isn't node-centric, as well as content from anywhere else on the web. Only an RDF class description and a way to access the data would have to be provided (apart from the normal Solr requirements) and the module would automatically do the rest of the work.
Anyone working on the Nutch module for 6?
Hi there,
Is anyone working on getting the Nutch module working for Drupal 6? Does anyone know of other avenues to get full-text document search (.pdf, .doc, etc.) in Drupal 6?
Thanks!
Multisite Search using ApacheSolr module
Hi,
Can anyone let me know if it is possible to index and search multiple Drupal and non-Drupal websites using the ApacheSolr module?
If not, please let me know of any other way that this could be achieved.
Thanks
Problem while implementing Lucene
Hi all,
I am facing a problem while implementing Lucene search on my site. Can anybody help me out? I even added the Zend Framework for Lucene search to my Drupal site, but I still get the following error when adding the Search Lucene API module:
"The required Zend Framework components of Search Lucene API are not installed. (Currently using Zend Framework components not installed) "
Please comment with your solution and with the site where I can get the components for Search Lucene API.
Solr RDF Support
Overview
This project is about adding RDF support to the popular ApacheSolr module in the form of a Solr RDF contrib module. The module should be able to read an RDF class specification and automatically generate the necessary mapping to a Solr server, provide the capability to search resources of that type, and also generate facets based on its properties. It would even be possible to build the existing Node search capabilities completely on top of this mechanism! But in any case you could also add arbitrary other types like users or taxonomy terms, or resources from other websites altogether.








