Lucene is a fabulous indexer, Nutch is a superb web crawler, and Solr can tie them together and offer world-class searching. This group discusses the various projects and efforts being made to integrate these technologies with Drupal.
The ApacheSolr module integrates Drupal with the Apache Solr search platform. Solr search can be used as a replacement for core content search and boasts both extra features and better performance. Among the extra features is the ability to have faceted search on facets ranging from content author to taxonomy to arbitrary CCK fields.
Drupal projects that already provide some level of integration with Lucene and/or Nutch:
Solr the right thing or too big?
Hi there,
We are running a local website that provides information about events, businesses, classifieds, news, etc. in one city.
To make it even more local, we need to categorize everything by the districts / areas within this city.
I want the users to be able to click on:
old town
And see the last three shops, ads, events, classifieds, etc. from the old town.
Clicking on 'all shops in the old town' (shops and old town should be arguments somehow) should show the shops in the old town (surprise ;) ).
At the moment I'm trying it with taxonomies, Views, and Panels.
Wildcard searching with Dismax
Hey all,
A wildcard question... I am trying to implement an autocomplete text field backed by Solr. When the user starts typing a term or a phrase, it will request Solr results. For example, if a person types 'car', it should return 'car', 'cars', and 'cardigan'.
So far I have tried this:
<?php
function apachesolr_modify_query(){
  $query->add_filter('fieldname', $value.'*');
}
?>
Searchlight vs. Apache Solr Views
Currently admins wanting to add Solr to their Drupal site have two options: Apache Solr Views and Searchlight.
Both can be used to construct views that filter content (including faceted search) on the basis of Solr indexes. Both projects - particularly Searchlight - are in active development.
Solr concept questions
Hi,
I have a couple of questions regarding Solr. I appreciate that some of these have been covered in http://groups.drupal.org/node/73583
I've attached a diagram of my Solr concept.
Basically we have a number of existing websites, all running on Drupal 6 and we will continue to add to this collection using versions 6 & 7.
I would like to provide Solr search functionality that allows searching within the current site and also provides a global search across all sites from one central portal.
Nutch links visualization
Hello,
I am interested in building a link visualization chart for my site using Nutch.
I used Nutch to crawl the site starting from the home page and have a few segments in the segments folder.
Now I need to create a UI that shows the traversal path Nutch followed from the home page, with inbound and outbound links per page.
Is there any such tool already available that I can reuse?
If not, any pointers on how I should query the linkdb?
Thanks
JPK
Handling Aggregate Records/Roll-up in Solr
Can someone point me to the mechanism in Solr that might allow me to roll up or aggregate records for display? We have many items that are similar and only want to show a representative record to the user until they select that record.
As an example: we carry a polo shirt and have 15 records that represent the individual colors for that shirt. Does the query API provide any way to roll up the records based on a property, or do we need to just flatten the representation of the shirt in the data model?
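One possible direction, offered only as a sketch: Solr releases that ship result grouping (field collapsing) accept the group, group.field and group.limit request parameters, which return a limited number of documents per distinct value of a field. The helper name build_rollup_params and the field name style_id are hypothetical; the field would have to hold the same value on all 15 color records of one shirt. Whether your Solr version includes grouping is something to check first.

```php
<?php
// Hypothetical helper (not part of Solr or the apachesolr module):
// builds request parameters for a grouped Solr query.
// 'style_id' is an assumed field shared by every color variant of one shirt.
function build_rollup_params($keywords, $group_field = 'style_id') {
  return array(
    'q' => $keywords,
    'group' => 'true',             // turn result grouping on
    'group.field' => $group_field, // collapse documents sharing this value
    'group.limit' => 1,            // keep one representative per group
  );
}

// Encode the parameters for a GET request to the select handler.
$query_string = http_build_query(build_rollup_params('polo shirt'));
```

Once the user selects the representative record, a second query filtered on the same field value could fetch the individual colors.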
Solr Next Gen - the 7.x-2.x refactoring
We've learned a lot about Solr and Drupal in the past three years. Much is possible with the current ApacheSolr module, but some things aren't possible, and many things aren't easy. There are quirks and limitations that reflect early design decisions which we now could solve better. To move forward and make the future a better place for Solr and Drupal, a new effort is beginning to redesign the integration from the ground up.
These are some of the high-level design goals:
- Study the PECL library, learn from it, and possibly use it: http://pecl.php.net/package/solr
- Develop a query library that has improved developer usability; for example http://github.com/technosophos/SolrAPI/blob/master/solrapi.inc
- Develop (with) components that are not Drupal specific so that other open source projects can use them (see above two points)
- Take advantage of cool things like the Search API, where it makes sense: http://drupal.org/project/search_api
- Learn from efforts like Searchlight and enable them to build on shared core components: http://drupal.org/project/searchlight
- Remove node centricity, embrace the Entity API in Drupal-7, and rely on Views for as much as possible
- Make indexing more flexible and faster
- Decentralize the module structure so that more contributors can become involved in more projects
- Allow Solr to be used in more contexts than traditional search (faceted browsing, for example)
Matt Butcher will be coordinating initial planning and research, and together with input from you, will draft architecture documents to get us from where we are today to the bright and shiny future.
The project will use this group as its home. There is now a new tab/page, "Solr Next Gen", as well as a tag that you can subscribe to, so that we can track discussions. Development will happen in the 7.x-2.x branch of the apachesolr module.
Now is a good time to elaborate on the list of design goals that you'd like to see in the new architecture by commenting here.
Searching on only one SOLR field
I would like to query ApacheSOLR to return results based on one field in a node but I do not know how to do this. For example, instead of searching for the word 'dog' in a node's title, body, created date, etc, I only want to search the node title for the word 'dog'.
Can I do this by implementing hook_apachesolr_modify_query? Any feedback on this would be much appreciated.
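Independent of the hook, it may help to see what a title-only request looks like at the Solr level. This is a minimal sketch, assuming the dismax query parser is in use and that the indexed field is named title; the helper name build_single_field_params is hypothetical. With dismax, the qf parameter lists the fields the keywords are matched against, so naming a single field restricts the search to it.

```php
<?php
// Hypothetical helper: builds Solr request parameters that match the
// keywords against one field only. The field name 'title' is an assumption
// about the schema.
function build_single_field_params($keywords, $field = 'title') {
  return array(
    'q' => $keywords,
    'defType' => 'dismax', // select the dismax query parser
    'qf' => $field,        // query fields: only this field is searched
  );
}

$query_string = http_build_query(build_single_field_params('dog'));
```

How these parameters are best set from inside the module's query hook depends on the apachesolr version, so treat this as the target request rather than the hook code itself.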
Using hooks for processing all results and facets - Apachesolr 1.0
Hello,
We're trying to filter the results provided by apachesolr in the hook apachesolr_process_results (not sure if this is the right place).
Here's what I'm trying for a franchise locator:
Content Type: Franchisee (bunch of fields for title, type, services provided etc + location cck)
Number of nodes: 2000
Now we're building an Advanced Search form which combines the facets for Franchise Type and a proximity search using Zipcode + Radius.
The form captures the user input: keywords, franchise_type, zipcode + radius.
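For the radius part, one way to post-filter results is a great-circle (haversine) distance check. This is only a sketch under stated assumptions: the zipcode has already been resolved to a latitude/longitude pair, each result is an array carrying lat and lon keys, and the function names haversine_miles and filter_by_radius are hypothetical, not part of the apachesolr module.

```php
<?php
// Great-circle distance in miles between two points given in degrees.
function haversine_miles($lat1, $lon1, $lat2, $lon2) {
  $earth_radius_miles = 3958.8; // mean Earth radius
  $dlat = deg2rad($lat2 - $lat1);
  $dlon = deg2rad($lon2 - $lon1);
  $a = pow(sin($dlat / 2), 2)
     + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * pow(sin($dlon / 2), 2);
  // min() guards against rounding pushing sqrt($a) slightly above 1.
  return 2 * $earth_radius_miles * asin(min(1.0, sqrt($a)));
}

// Keep only the results within $radius_miles of the search center.
// Each result is assumed to be an array with 'lat' and 'lon' keys.
function filter_by_radius(array $results, $lat, $lon, $radius_miles) {
  $kept = array();
  foreach ($results as $result) {
    $distance = haversine_miles($lat, $lon, $result['lat'], $result['lon']);
    if ($distance <= $radius_miles) {
      $kept[] = $result;
    }
  }
  return $kept;
}
```

Note that filtering after Solr has answered leaves the facet counts and paging computed on the unfiltered set, so the numbers shown next to Franchise Type may not match what the user sees.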
Newbie SOLR Questions
We are hoping to use SOLR in a couple of non-standard implementations, and I just have a few questions.
- If we want to index all of the documents in a specific file directory (e.g. TIFF images of scanned documents), can we do this directly with SOLR, or do we need Nutch? (I realize TIFFs are quite legacy but in this instance conversion to PDF is not an option.)
- If there are specific documents on a remote site we want to index with SOLR (again, specific TIFF documents), what is the best way to accomplish this? (We have the specific URLs for these documents.)
User Profile Search
Hello,
Wondering if Apache Solr as implemented in Drupal has the ability to do user profile search. Say I wanted to allow my users to search for other users who meet x, y, z criteria and have indicated that they are willing to be contacted by another member of the site. Would Solr support that?
Cheers!
Searching content stored in XML using Lucene
Hi
I want to implement the Lucene search module on my Drupal site.
However, I have a custom module that I developed myself which pulls content from an XML file and displays it using XSLT.
I was wondering if it is possible to use the Lucene search module to offer search capabilities for my XML content.
The xml content is stored in files on the server and not in a database.
Is it possible to search XML content with the Lucene module or its API?
I am also interested in any other solutions for searching xml content.
Thank you in advance for your time.
Apache Solr Multilingual - Non-English and Multilingual Search
We just released the first alpha version of Apache Solr Multilingual which supports language specific stemming, synonyms and compound word splitting. There's still a lot to do but any feedback at this early stage of the project will be helpful.
Solr facet not appearing in sidebar
I have Apache Solr Search Integration 6.x-1.0-RC3. I have several taxonomies. Solr is happily searching them and I've enabled facet blocks for them. All is fine except when I try to enable a facet that wasn't enabled before (via admin/settings/apachesolr/enabled-filters) and enable the associated block (via admin/build/block/list/) and save the block change, the block doesn't appear.
Am I forgetting something? Suggestions?
Thanks.
How to consume external search results via OpenSearch?
If I am not missing anything, it looks like the OpenSearch module allows Drupal to be an OpenSearch provider. What I would like is to make Drupal consume search results from external systems (such as Nutch) that also provide OpenSearch.
In other words, I want Drupal to be an "importer" of search results from an external Nutch instance, rather than the "exporter".
I would appreciate it if someone could shed some light on such an integration. Thanks.
Solr + maps
Hi,
Has anyone managed to display the output of a Solr search on a map, using GMap, OpenLayers, etc.?
Also has anyone got spatial search / local search working with UK postcodes?
I've been trying for a week now with no success. I can get core Drupal search + facets working with proximity and map output, but Solr is so much better!
TIA
Building web search engines with Drupal and Nutch
I am building a vertical web search engine. I would like to integrate Nutch with Drupal via the OpenSearch Aggregator module; however, the module is not compatible with Drupal 6.10+. Should I switch my search engine to Solr and use the ApacheSolr module instead? Does the ApacheSolr module support OpenSearch? What are my integration options?
Creating a generic Search API
The goal of this project is to build a generic Search API that abstracts in two directions. On the one hand, it abstracts from the data source (using the entity_metadata module), so that all kinds of entities can be indexed and searched as easily as nodes. On the other hand, it abstracts from the indexer / search engine, so that concrete implementations such as Solr, Lucene, or Xapian implement only the specific details, eliminating unnecessary code duplication.
Also, the gathered metadata and the search engine interface could be used to create a generic Views integration for all searches, letting all supported searches display their results as a configurable view.
The planned overall design is sketched in the attached diagram.
Creating Research Portal - 2.5M entries to index
Hello Community,
I am currently working on a non-commercial research school project. We are facing the following decision and I was wondering if you have some advice or experience in this matter.
Problem: Create a search engine (plus community functions, etc.) for our research field, with additional tag clouds, co-authoring, etc. to enhance the search.
Solution: We have created a MySQL DB which contains 2.5M entries (one entry is one paper). It includes the Metadata as well as the abstracts.
Possible Solution | Dev Help
Seeing the post [Dev Help] http://groups.drupal.org/node/46654 spurred me into looking at generic solutions to the problem. I have written a quick blog post about it; have a read and see what you think.
Comments are more than welcome
Regards,
Dave