Can anybody point me to a step-by-step guide on integrating Solr, Nutch and Drupal?
What I am trying to achieve is to crawl data from various websites and build a comparison platform for it. Any help will be much appreciated.
Many thanks in advance.
Thanks to the recent work on the Solr Nutch sandbox project, I've managed to get Nutch 1.6 jobs running on a four-node Cloudera CDH3 cluster, sending results to Solr 3.6.2 (hosted within Tomcat on Aegir BOA). The results are then integrated, via the Apache Solr 7.1.1 module (not the dev version), into search results and Apache Solr Views.
I must say, I am pretty excited about Hadoop / Cloudera running Nutch and Solr and integrating with Drupal.
For anyone interested in setting up a Cloudera cluster, I recommend masterschema (CentOS) and Gregory Grubbs on YouTube (Debian).
I'll post some notes etc. ASAP.
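For anyone wanting to check that crawled pages actually reached Solr before wiring up the Drupal side, here is a minimal sketch of a query against Solr's standard `/select` handler. The base URL, the helper names and the field names (`id`, `url`, `title`, `content`) are assumptions for illustration; adjust them to whatever your own schema defines.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_select_url(solr_base, query, rows=10, fields=("id", "url", "title")):
    """Build a URL for Solr's /select handler that asks for a JSON response."""
    params = {
        "q": query,                # Lucene query string, e.g. "content:drupal"
        "rows": rows,              # maximum number of documents to return
        "fl": ",".join(fields),    # fields to return (assumed to exist in the schema)
        "wt": "json",              # response writer
    }
    return solr_base.rstrip("/") + "/select?" + urlencode(params)


def fetch_docs(solr_base, query, rows=10):
    """Run the query and return the list of matching documents (needs a live Solr)."""
    with urllib.request.urlopen(build_select_url(solr_base, query, rows)) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return payload["response"]["docs"]


if __name__ == "__main__":
    # Hypothetical base URL; a Tomcat-hosted Solr will have its own host, port and path.
    print(build_select_url("http://localhost:8080/solr", "content:drupal", rows=5))
```

If the query returns documents with the URLs you crawled, the Nutch-to-Solr half is working and any remaining problems are on the Drupal module side.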
I just added "Solr Nutch Search", a sandbox project.
I welcome your feedback. Let me know if it is good enough for a full project, in which case I could use a co-maintainer.
The latest version of Nutch, 2.1, seems to work quite nicely with Solr 4.0, and I am wondering whether others have tried sending results to the Search API and / or Apache Solr Search Drupal modules?
There are lots of possibilities for integrating web crawls into Drupal views, searches etc.
Nutch 2.1 / Solr 4.0 (Gora + MySQL), set up using this tutorial
Nutch 2.1 + Aegir BOA?
Drupal Nutch module and 2.1?
Drupal Elastic Search module and Nutch
HeadFirst are a web development company based in Wellington, New Zealand. We provide enterprise-grade Drupal-based solutions for public sector, corporate and select start-ups. We are looking for a full-time Senior Developer/Technical Lead proficient with PHP and Drupal. This is a hands-on position: you will be responsible for leading a team of 3-4 developers building Drupal solutions for our clients.
My presentation on OpenScholar at Merritt College, Oakland has been accepted for the Stanford DrupalCamp.
Any suggestions for topics, formats or co-presenters are welcome.
I have a fairly complex URL filter that needs to be created and I just don't have the time to figure it out. I am looking for someone who can develop a good urlfilter for one specific site.
It's a very small task, I know, but it needs to be done right and I can learn a lot from getting one done professionally.
1 URL filter for 1 specific site.
The URL filter I have in mind is fairly complex (at least in my eyes).
Payment is per agreement.
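For anyone drafting such a filter themselves, a minimal sketch that imitates how Nutch's regex URL filter file is evaluated may help for testing rules offline. As far as I understand it, each rule is a `+` (accept) or `-` (reject) followed by a regular expression, the first matching rule decides, and a URL matching no rule is rejected. The rules and the site `www.example.com` below are hypothetical, and Python's `re` stands in for Java's regex engine, so only simple patterns will behave identically.

```python
import re

# Hypothetical rules for one site, in the style of a Nutch regex URL filter file.
EXAMPLE_RULES = r"""
# skip common static-file suffixes
-\.(gif|jpg|png|css|js|zip)$
# skip URLs containing query-like characters
-[?*!@=]
# accept product pages on the (hypothetical) target site
+^https?://www\.example\.com/products/
# reject everything else
-.
"""


def parse_rules(text):
    """Turn rule lines into (allow, compiled_pattern) pairs, skipping blanks and comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the decision of the first rule whose pattern is found in the URL."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # no rule matched: treat the URL as rejected


if __name__ == "__main__":
    rules = parse_rules(EXAMPLE_RULES)
    for url in ("http://www.example.com/products/widget-1",
                "http://www.example.com/products/widget-1?sort=asc",
                "http://www.example.com/about"):
        print(url, accepts(rules, url))
```

Because the first match wins, rule order matters: the reject rules for suffixes and query characters sit above the accept rule so that they take precedence.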
We are hoping to use SOLR in a couple of non-standard implementations, and I just have a few questions.
If we want to index all of the documents in a specific file directory (e.g. TIFF images of scanned documents), can we do this directly with SOLR, or do we need Nutch? (I realize TIFFs are a legacy format, but in this instance conversion to PDF is not an option.)
If there are specific documents on a remote site we want to index with SOLR (again, specific TIFF documents), what is the best way to accomplish this? (We have the specific URLs for these documents.)
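One possible approach for the directory case, sketched under assumptions rather than offered as the definitive answer: if the Solr instance has the extracting request handler (Solr Cell) configured at `/update/extract`, files can be posted to it directly without Nutch. The base URL, the helper names and the choice of the file path as document id are all illustrative. Note that text extraction from scanned TIFFs would need OCR, which this sketch does not cover; without it, expect little more than file metadata to be indexed.

```python
import os
import urllib.request
from urllib.parse import urlencode


def build_extract_url(solr_base, doc_id, commit=False):
    """Build a URL for the /update/extract handler, setting the document id as a literal field."""
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


def find_files(root, extensions=(".tif", ".tiff")):
    """Yield paths under root whose names end with one of the given extensions."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(extensions):
                yield os.path.join(dirpath, name)


def post_file(solr_base, path, commit=False):
    """Send one file's bytes to Solr (needs a live Solr with the handler enabled)."""
    with open(path, "rb") as handle:
        data = handle.read()
    request = urllib.request.Request(
        build_extract_url(solr_base, path, commit),  # file path doubles as the id here
        data=data,
        headers={"Content-Type": "image/tiff"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical directory and Solr location.
    for tiff_path in find_files("/data/scans"):
        print(build_extract_url("http://localhost:8983/solr", tiff_path))
```

For the remote documents, the same `post_file` idea applies after downloading each known URL to a local file first; a crawler only becomes necessary when the URLs have to be discovered.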
Senior Web solutions architect required!
Based in Reading, Berkshire, a 5-minute walk from Reading station and just a 25-minute commute from London Paddington, this is a great opportunity to work with other experts on this large, award-winning Drupal site.
Are you an expert in designing scalable web solutions using most of the following technologies:
PHP PYTHON MySQL LUCENE SOLR APACHE TOMCAT MEMCACHED
Do you have experience with content indexing, clustering, taxonomies and ontologies?
Are you passionate about open source?