Lucene, Nutch and Solr

Lucene is a fabulous indexer, Nutch is a superb web crawler, and Solr can tie them together and offer world-class searching. This group discusses the various projects and efforts being made to integrate these technologies with Drupal.

The ApacheSolr module integrates Drupal with the Apache Solr search platform. Solr search can be used as a replacement for core content search and boasts both extra features and better performance. Among the extra features is the ability to have faceted search on facets ranging from content author to taxonomy to arbitrary CCK fields.

Drupal projects that already provide some level of integration with Lucene and/or Nutch:

Nick_vh's picture

Help us fund the port of Search API to Drupal 8

Feedback wanted here:
https://www.drupalfund.us/project/help-us-fund-port-search-api-drupal-8

Pledges would also be really appreciated!

Introduction

Search API in Drupal 8 attracted massive interest during Drupal Dev Days in Szeged, Hungary. Now that the week has ended (or almost ended at the time of this writing), we are looking to organise at least two more sprints to get Search API into a stable state for Drupal 8.

With the news that the Apache Solr module and the Search API Solr module will merge, this sprint benefits everyone who uses search on their website.

alexus's picture

epoch time (DEC 31 1969)

Whenever I run a standard search from a search box, the majority (if not all) of the returned results appear with the epoch date (DEC 31 1969). Needless to say, we don't have this date anywhere in our content.

Our environment is using: php-5.3.3 (RHEL6), apache-solr-1.4.1 and apachesolr-6.x-2.0-beta2.

drush -r /var/www/html/current/ -l XXX.XXX.XXX pm-list | grep -iE '(?)solr(?)'

Apache Solr    Apache Solr comment search    Module    Not installed    6.x-2.0-beta2
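
A minimal Python sketch of one way this symptom can arise. It is an assumption, not a diagnosis of this site: if the date of an indexed document is missing and ends up as a Unix timestamp of 0, any display in a timezone behind UTC shows December 31, 1969. The function name and the UTC offsets below are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def format_created(timestamp, utc_offset_hours):
    """Format a Unix timestamp as 'MON DD YYYY' in a fixed-offset timezone."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(timestamp, tz).strftime("%b %d %Y").upper()


# A missing date that defaults to 0 is the Unix epoch (1970-01-01 00:00 UTC).
# Five hours behind UTC (a hypothetical site timezone) it falls on the previous day.
print(format_created(0, -5))
print(format_created(0, 0))
```

If this is the cause, the thing to check would be whether the date field is actually populated in the documents Solr returns.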

niccolo's picture

Big Data Drupal: Cloudera Hadoop, MapReduce, Nutch, Solr, Aegir BOA, Drupal 7 ApacheSolr Views

I am giving a talk at Badcamp on Big Data Drupal: Cloudera Hadoop, MapReduce, Nutch, Solr, Aegir BOA, Drupal 7 ApacheSolr Views

http://2013.badcamp.net/sessions/big-data-drupal-cloudera-hadoop-mapredu...

I am trying to gather some other experts (e.g. Cloudera / Hadoop / MapReduce + HyperDrupal + Twig) to come and handle the bigger and deeper questions.

https://drupal.org/node/2104503
https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/Uwuj1q7bWBY

Nick_vh's picture

(Solr) Search in Drupal 8 - Part 2

DrupalCon: Further discussion on architecture (27 Sept 2013)

This discussion is purely about what we could do with Search API to solve these issues. The previous discussion (see https://groups.drupal.org/node/327723) was more about what both projects could share.

Nick_vh's picture

(Solr) Search in Drupal 8

On Tuesday, 24 September, we had a birds-of-a-feather session about Solr in Drupal 8, attended by around 8-10 people. All the module maintainers were present, which made for a productive meeting.

We discussed the following topics:

Changes to the core search module in Drupal 8.
Possibilities to have a generic connector for Solr.
Query Class
Document Class
Indexing logic
Schema files
Views query plugin
Indexing logic and why a queue is not a good option for indexing items for search

polx's picture

Configure sort-fields?

Hello Solr Drupal experts,

Is there a way to configure the fields used in sorting?
I am happy to write the schema and config on the Solr side to do so (such as "sInt"), but I am unsure how to do so on the Drupal side.

Thanks in advance.

Paul

PS: What is the most active time for #drupal-apachesolr?

cilefen's picture

Solr Nutch Sandbox Modified to a simple search interface

Dear Lucene, Nutch, and Solr group members and NJ colleagues:

The Solr Nutch project has been changed to a simple search interface that does one thing: it searches Solr indexes built from Nutch crawls, using Nutch's preferred schema. Please give it a try if you use Nutch.

http://drupal.org/sandbox/cilefen/1858412

Nick_vh's picture

Drupal Search and Solr office hours #6

Start: 2013-04-24 16:00 - 17:00 UTC

Drupal Search has a great ecosystem of modules that integrate with technologies such as Solr. However, it needs more vision and direction to grow into a great platform where other developers feel comfortable and are able to make the right decisions. We are convinced that if we all come together and talk, make some decisions and actually get to work on a regular basis, we can come up with a solution for Drupal that kicks a**!

Nick_vh's picture

Drupal Search and Solr office hours #5

Start: 2013-04-10 16:00 - 17:00 UTC

cdykstra's picture

Solr project

I have a project built on Drupal 7 and Ubercart 3 that has already launched. After launch, the client informed me of their specialized search needs for products.

The use case is to return all results matching a string of three or more consecutive characters, including any mix of upper case, lower case, numbers and punctuation (for example, 'tet' should return anything containing tetanus, rgTet9-66, etc.), searching the SKU and title fields.
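
A rough Python sketch of how this kind of substring matching is often approached: index every run of three characters (an n-gram) from the SKU and title, lowercased, then look up the query's n-grams. On the Solr side something like an n-gram filter would play this role. The product data, function names and the choice of n = 3 are illustrative assumptions, not the poster's setup.

```python
def ngrams(text, n=3):
    """Return the set of lowercase character n-grams in text."""
    lowered = text.lower()
    return {lowered[i:i + n] for i in range(len(lowered) - n + 1)}


def build_index(products, n=3):
    """Map each n-gram to the ids of products whose fields contain it."""
    index = {}
    for product_id, fields in products.items():
        for value in fields:
            for gram in ngrams(value, n):
                index.setdefault(gram, set()).add(product_id)
    return index


def search(index, products, query, n=3):
    """Return ids of products whose SKU or title contains query, ignoring case."""
    query = query.lower()
    if len(query) < n:
        return set()
    # Candidates must contain every n-gram of the query.
    candidates = set.intersection(*(index.get(g, set()) for g in ngrams(query, n)))
    # Confirm the n-grams appear consecutively within a single field.
    return {pid for pid in candidates
            if any(query in value.lower() for value in products[pid])}


# Hypothetical products: id -> (sku, title)
products = {
    "p1": ("SKU-1", "Tetanus vaccine"),
    "p2": ("rgTet9-66", "Reagent"),
    "p3": ("AB-12", "Bandage"),
}
index = build_index(products)
print(sorted(search(index, products, "tet")))
```

Punctuation and digits are kept as ordinary characters here, so a query such as '9-6' would also match the second SKU.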

smartcoder's picture

Easy Integration on SOLR-Nutch-Drupal

Hello,

Can anybody point me to a step-by-step guide on combining Solr, Nutch and Drupal?

What I am trying to achieve is to crawl data from various websites and build a comparison platform for it. Any help will be much appreciated.

Many thanks in advance.

Shashank

Nick_vh's picture

Drupal Search and Solr office hours

Start: 2013-03-27 16:00 - 17:00 UTC

Nick_vh's picture

Drupal Search and Solr office hours

Start: 2013-03-13 16:00 - 17:00 UTC

niccolo's picture

Big Data Drupal with Cloudera, Hadoop, MapReduce, Nutch and Solr

Thanks to the recent work on the Solr Nutch sandbox project, I've managed to get Nutch 1.6 jobs to run on a Cloudera CDH3 4-node cluster, sending results to Solr 3.6.2 (hosted within Tomcat on Aegir BOA). The results are then integrated via the Apache Solr 7.1.1 module (not the dev release) into search results and Apache Solr Views.

I must say, I am pretty excited about Hadoop / Cloudera running Nutch and Solr and integrating with Drupal.

For anyone interested in setting up a Cloudera cluster, I recommend masterschema (CentOS) and Gregory Grubbs on YouTube (Debian).

I'll post some notes etc. ASAP.

alanom's picture

Best approach to indexing stemmed and unstemmed fulltext in Drupal?

A common desire with Apache Solr search servers is to get the "best of both" from stemming and not stemming terms, by indexing both the original term and the stem produced by something like SnowballPorterFilterFactory. Stemming matches grammatical variations, while indexing the original term boosts exact matches to rank higher than near matches and protects against awkward cases where, after stemming, the original term no longer matches.
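
The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: toy_stem is a deliberately crude suffix stripper standing in for a real Snowball stemmer, and the boost values are made up. In Solr this would roughly correspond to two fields, one stemmed and one not, queried with different boosts.

```python
def toy_stem(term):
    """Crude stand-in for a real stemmer: strip one common English suffix."""
    for suffix in ("ing", "ed", "es", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[:-len(suffix)]
    return term


def index_document(text):
    """Index each term twice: once as written, once stemmed."""
    exact = set(text.lower().split())
    stemmed = {toy_stem(term) for term in exact}
    return {"exact": exact, "stemmed": stemmed}


def score(query, doc, exact_boost=2.0, stem_boost=1.0):
    """Sum boosts per query term: exact matches count more than stem matches."""
    total = 0.0
    for term in query.lower().split():
        if term in doc["exact"]:
            total += exact_boost
        if toy_stem(term) in doc["stemmed"]:
            total += stem_boost
    return total


doc_a = index_document("indexing fields")
doc_b = index_document("indexed fields")

# Both documents match through the shared stem, but the exact match ranks higher.
print(score("indexing", doc_a), score("indexing", doc_b))
```

Both documents are found through the shared stem, while the one containing the term exactly as typed scores higher.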

cilefen's picture

Solr Nutch Search Sandbox Project Updated to Integrate with Common Schema

Hello all:

Based on our discussion last month on IRC, I reconfigured this sandbox project as a few Nutch settings that create an index compatible with the common schema for the apachesolr module.

http://drupal.org/sandbox/cilefen/1858412

The purpose is ad-hoc crawling and indexing, with searching done within Drupal and the results integrated with the Drupal node results.

This is for Nutch 1.x only at this stage.

cilefen's picture

Solr Nutch Search Sandbox Project Added

Hi All,

I just added "Solr Nutch Search", a sandbox project.

http://drupal.org/sandbox/cilefen/1858412

I welcome your feedback. Let me know if it is good enough for a full project, in which case I could use a co-maintainer.

-Chris McCafferty

Nick_vh's picture

Drupal Search and Solr office hours

Start: 2012-12-05 16:00 - 17:00 UTC

gaurav-varshney's picture

Search API integration with Search API Solr

I am using the Search API module and the Search API Solr module. I have set up Solr successfully.
I have set up two different instances on one server for two different content types (A, B),
and indexes have been created using them.

niccolo's picture

Nutch 2.1, Solr 4.0 etc

The latest version of Nutch, 2.1, seems to work quite nicely with Solr 4.0, and I am wondering if others have tried sending results to the Search API and/or Apache Solr Search Drupal modules?

There are lots of possibilities for integrating web crawls into Drupal views, searches, etc.

Nutch 2.1 / Solr 4.0 (Gora + MySQL), set up using this tutorial:
http://nlp.solutions.asia/?p=180

Nutch 2.1 + Aegir BOA?
http://drupal.org/node/1851318

Drupal Nutch module and 2.1?
http://drupal.org/node/1851324

Drupal Elastic Search module and Nutch
http://drupal.org/node/1851064
