Posted by cilefen on December 19, 2012 at 2:29pm
Hello all:
Based on our discussion last month on IRC, I reconfigured this sandbox project as a set of Nutch settings that create an index compatible with the common schema for the apachesolr module.
http://drupal.org/sandbox/cilefen/1858412
The purpose is ad-hoc crawling and indexing with Nutch, while searching happens within Drupal and the results are integrated with the Drupal node results.
This is for Nutch 1.x only at this stage.
Comments
Thanks for sharing! Great work.
Search API and Solr integration
Hi Chris
Thanks for publishing this body of work.
Can you elaborate a little?
You changed the sandbox project to be compatible with the apachesolr module and also Drupal's native search?
Have you tested apachesolr module integration? Which version works?
Also, what about the Search API module suite? Sarnia?
Sorry to bug you, as you have given Nutch in Drupal its first update in 2 years...
THANKS INDEED
Wow. Apologies. You actually documented the sandbox. % )
Will check it out.
There are obvious questions around the existing Nutch module: have you used that? Etc.
Nutch 1.6 works also
Congratulations on the fabulous work.
I have this working in the DrupalPro VirtualBox development environment, running the Open Outreach rc8 Drupal distribution, and I am getting results in my site search! YAY!
I am running a bigger index and testing the latest Apache Solr Views module, Tika, etc.
I will also try with Search API.
I wanted to test with Panopoly also, and perhaps OpenAcademy.
There is also this Semantic Web / Linked Data distro, just released, which could be interesting:
http://drupal.org/project/iksce
The obvious thing is to try the D7 version of the long-forsaken Nutch module:
http://drupalcode.org/project/nutch.git/tree/refs/heads/7.x-1.x
I'll shift across to the project issue queues. I'd suggest this is worthy of full project status.
What would be needed to use this with Nutch 2.x?
A co-maintainer
Are you interested?
I would be interested!
Solr 4.4.0
Nutch 2.2.1
Latest Drupal
It seems that I "kinda" got it to work, except that the result's title displays the actual URL, so it's duplicated: "URL" - "Content Description" - "URL" instead of "Title" - "Content Description" - "URL".
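One way to narrow this down might be to look at the indexed documents directly and see whether the title field really holds the URL. Below is a minimal sketch in Python, assuming the documents have already been fetched from Solr as dictionaries, and that the title lives in a field called `label` and the address in `url`. Both field names and the helper `find_url_titles` are assumptions for illustration; adjust them to whatever the schema in use actually defines.

```python
def find_url_titles(docs, title_field="label", url_field="url"):
    """Return the docs whose title is missing or merely repeats the URL.

    The field names are assumptions; pass the ones your schema uses.
    """
    flagged = []
    for doc in docs:
        title = (doc.get(title_field) or "").strip()
        url = (doc.get(url_field) or "").strip()
        # A blank title, or one identical to the URL, suggests the real
        # page title never reached the index.
        if not title or title == url:
            flagged.append(doc)
    return flagged


# Hypothetical sample documents, standing in for a Solr query response.
sample_docs = [
    {"label": "http://example.com/a", "url": "http://example.com/a"},
    {"label": "About us", "url": "http://example.com/about"},
]
flagged = find_url_titles(sample_docs)
```

If every crawled document ends up flagged, the field mapping on the Nutch side is the likely suspect rather than the Drupal display.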