ApacheSolr

5/29/10 New Mercury 1.1 on Lucid Images (Beta)

Posted by Greg Coit on May 29, 2010 at 11:06pm

We're pleased to announce new AWS AMIs for Mercury 1.1 on Ubuntu Lucid. These Beta AMIs were made using the instructions posted here: http://groups.drupal.org/node/70268. AMI IDs after the jump...

Quickstart 0.5 beta released

Posted by MichaelCole on May 7, 2010 at 8:42pm

Hello,

I just created a torrent for Quickstart 0.5. New in this version:
- New features:
  - New Drupal: Aegir!
    - See ~/quickstart/aegir-install-0.4a7.sh.
    - Not installed by default, pending beta release...
  - New tools: git, bazaar
  - New IDE: eclipse (thanks DrUbuntu for inspiration and ppa)
  - New PHP: imap, uploadprogress
  - New Apache: ssl and solr on by default
- General cleanup
- Downgrade php5.3 to 5.2 for Aegir and Drupal messages

Apache Solr Multilingual - Non-English and Multilingual Search

Posted by mkalkbrenner on March 31, 2010 at 2:54pm

We just released the first alpha version of Apache Solr Multilingual, which supports language-specific stemming, synonyms and compound word splitting. There's still a lot to do, but any feedback at this early stage of the project will be helpful.

Solr in non-english

Posted by fp on December 14, 2009 at 6:49pm

I am trying to run apachesolr on a site which for now has only French content.

I have attached both the schema.xml I use and the query results from solr for a query on the word "Vidéocassettes".

From what I have gathered so far, I assumed that the following filter

<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
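
For readers unfamiliar with accent mapping, here is a minimal Python sketch of the general idea: accented characters are folded to their unaccented base letters before comparison, so "Vidéocassettes" and "Videocassettes" compare equal. This is only an approximation using Unicode decomposition. It is not Solr's implementation, and the function names `fold_accents` and `same_after_folding` are made up for illustration.

```python
import unicodedata


def fold_accents(text):
    """Remove combining accent marks, keeping the base letters."""
    # NFD splits "é" into "e" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_after_folding(a, b):
    """Compare two terms case-insensitively after accent folding."""
    return fold_accents(a).lower() == fold_accents(b).lower()


print(fold_accents("Vidéocassettes"))
print(same_after_folding("Vidéocassettes", "videocassettes"))
```

Note that a hand-written mapping file can cover characters that Unicode decomposition does not split (ligatures, for instance), so the two approaches are not equivalent.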

Telling Solr to ignore certain patterns in indexed fields when querying them

Posted by katbailey on December 1, 2009 at 1:21am

We need to implement HTML node titles with BBCode (as per http://drupal.org/node/28537) for a client site that's using ApacheSolr. The titles need to be displayed in search results with their HTML intact, so the BBCode version has to be indexed. I need to tell Solr to ignore this markup when matching queries. For example, if a user searches for "blue smurf" (with the quotation marks), and there's a node with the title "[strong]blue[/strong] smurf" in the index, Solr needs to recognise this as a match.
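
The desired matching behaviour can be sketched in Python as follows. This only illustrates the normalization step (strip the bracketed tags, then do the phrase comparison on the plain text). It is not Solr configuration, and the names `strip_bbcode` and `phrase_matches` are hypothetical.

```python
import re

# Matches opening and closing tags such as [strong], [/strong] or [url=...].
BBCODE_TAG = re.compile(r"\[/?[A-Za-z][^\]]*\]")


def strip_bbcode(text):
    """Remove BBCode tags and collapse any leftover runs of whitespace."""
    without_tags = BBCODE_TAG.sub("", text)
    return " ".join(without_tags.split())


def phrase_matches(phrase, title):
    """Case-insensitive phrase match against the tag-free title."""
    return phrase.lower() in strip_bbcode(title).lower()


title = "[strong]blue[/strong] smurf"
print(strip_bbcode(title))
print(phrase_matches("blue smurf", title))
```

The stored title keeps its tags for display, and only the text used for matching is cleaned.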

Optimizing Apachesolr for non-english languages

Posted by ducdebreme on October 15, 2009 at 11:42am

I have done a lot of research on optimizing Apachesolr for non-English languages, especially German. It turns out that search results can be dramatically improved by adjusting Solr's stemming and by breaking up compound words. Both can be achieved with slight changes to Apachesolr's schema.xml.

This post is about configuring stemming:
http://www.early-dance.de/news/9188-optimizing-apachesolr-non-english-la...

And this post is about compound word splitting, which is needed in languages like German that have long combined words like "Dampfschifffahrt":
http://www.early-dance.de/news/9189-apachesolr-issues-german-and-other-g...
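
To show what compound word splitting means, here is a minimal dictionary-based sketch in Python. It is an illustration under stated assumptions, not the approach from the linked posts: the function name `split_compound`, the tiny word list and the `min_len` parameter are all hypothetical.

```python
def split_compound(word, dictionary, min_len=3):
    """Split a compound into dictionary words, preferring longer parts.

    Returns the word unchanged (as a one-item list) if no full split exists.
    """
    word = word.lower()

    def helper(rest):
        if not rest:
            return []
        # Try the longest candidate prefix first, then shorter ones.
        for end in range(len(rest), min_len - 1, -1):
            head = rest[:end]
            if head in dictionary:
                tail = helper(rest[end:])
                if tail is not None:
                    return [head] + tail
        return None

    parts = helper(word)
    return parts if parts else [word]


# Toy dictionary, just large enough for the example word.
words = {"dampf", "schiff", "fahrt"}
print(split_compound("Dampfschifffahrt", words))
```

Indexing the parts alongside the full word lets a search for "Schiff" find a document that only contains "Dampfschifffahrt".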

Project Mercury Alpha 6: Now With Solr!

Posted by joshk on October 9, 2009 at 9:50pm

I'm happy to announce the 0.6 Alpha release of the Mercury AMI, now including ApacheSolr as the search backend! This is the last piece of major infrastructure we want to integrate into the stack for scalability purposes. You can now move from a single-server install based on Mercury to a best-practice vertically scaled architecture with separate hardware to run front-end cache, application, back-end cache, search and database!

The quickest way to find it is by searching Amazon EC2 for "Pantheon" or "Mercury". The manifest path for the latest release (in 32-bit and 64-bit flavors) is:

chapter3-storage/PANTHEON-pressflow-mercury-alpha-6.manifest.xml
chapter3-storage/PANTHEON-pressflow-mercury64-alpha-6.2.manifest.xml (back!)

If you'd like to "roll your own" we've updated the wiki instructions page with a new set of instructions for getting Solr up and running as part of the process. Feel free to improve that documentation, as it's definitely a community process.

This will likely be one of the last releases before we move the project into the Beta phase, at which point we'll focus on fine tuning and stability, as well as portability onto non-EC2 systems, more so than on new features. If you have ideas for additional things you'd like to see integrated in the stack, please chime in. We're also going to be documenting real-world "how to" use-cases (e.g. "how do I put my existing site on Mercury") in user-friendly detail, so stay tuned for that.

As always, let us know what you think of the release, what you'd like to see in future iterations, and how your experience is in using the stack. There's plenty more to come.

Architecture question re: huge indexes

Posted by Todd Young on September 30, 2009 at 6:08pm

I intend to have a Drupal 6 site with some custom node types indexed by ApacheSolr using the XML Schema provided. However, I'd like to add a lot of gigantic indexes to it that reside only in the Lucene/Solr system (ha) and not physically in my Drupal MySQL database. For example, I may have a thousand or even a hundred thousand nodes in Drupal, but I might have ten different "external" indexes with millions of records each, and I'd like to conduct faceted search against the whole lot of 'em.
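
One possible direction, offered only as a sketch: Solr's distributed search takes a `shards` request parameter listing several indexes to query together, and it can be combined with the usual `facet` and `facet.field` parameters. Whether it fits this case depends on the external indexes sharing a compatible schema and unique key, which is an assumption here. The host names, field names and the helper `build_distributed_query` below are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_distributed_query(base_url, shards, query, facet_fields):
    """Build a Solr select URL that fans a faceted query out to several shards."""
    params = [
        ("q", query),
        ("facet", "true"),
        # Shards are given as host:port/path, comma separated, without a scheme.
        ("shards", ",".join(shards)),
    ]
    params += [("facet.field", field) for field in facet_fields]
    return base_url + "/select?" + urlencode(params)


url = build_distributed_query(
    "http://drupal-solr.example.com:8983/solr",  # hypothetical Drupal index
    ["drupal-solr.example.com:8983/solr", "external1.example.com:8983/solr"],
    "smurf",
    ["type", "source"],  # hypothetical facet fields
)
print(url)
print(parse_qs(urlsplit(url).query)["shards"])
```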

Apachesolr and Stemming in other languages

Posted by ducdebreme on August 19, 2009 at 9:54am

When I search for "tools" and for "tool" with Apachesolr, I get the same results. As far as I know, the reason is Apachesolr's stemmer, which reduces the word "tools" to "tool" and uses that for the search within the index. As far as I know, the stemmer is only aware of English stemming rules.

So how does this work when using Apachesolr with other languages? For example, I tried the German words "Kunden" (plural, "clients") and "Kunde" (singular, "client") and I get different result sets. Obviously, the stemmer doesn't know the German stemming rules ... right?

How can the search results be improved for languages other than English? Are there German stemmers available to plug into Apachesolr?
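
The effect described above can be reproduced with a toy suffix stripper in Python. This is a deliberately crude sketch and not how Solr's stemmers work: the function `toy_stem` and both suffix lists are invented for illustration, and real stemmers apply far more careful rules.

```python
def toy_stem(word, suffixes, min_stem=3):
    """Strip the longest matching suffix, keeping at least min_stem letters."""
    word = word.lower()
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


ENGLISH_SUFFIXES = ("s",)           # toy rule: plural "s"
GERMAN_SUFFIXES = ("en", "n", "e")  # toy rules for a few German endings

# English rules unify "tools" and "tool" ...
print(toy_stem("tools", ENGLISH_SUFFIXES) == toy_stem("tool", ENGLISH_SUFFIXES))
# ... but leave "Kunden" and "Kunde" as two different index terms.
print(toy_stem("Kunden", ENGLISH_SUFFIXES) == toy_stem("Kunde", ENGLISH_SUFFIXES))
# Language-appropriate rules bring the German forms together.
print(toy_stem("Kunden", GERMAN_SUFFIXES) == toy_stem("Kunde", GERMAN_SUFFIXES))
```

Two word forms only return the same result set when the stemmer maps them to the same index term, which is why the rules have to match the language of the content.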

RDF for Solr: Possible improvements

Posted by drunken monkey on July 25, 2009 at 12:42am

The Apache Solr RDF module is now in a state where it can, in theory, already be used. However, there is much room for improvement, so I'd like to discuss some possible ways to achieve it.
