5/29/10 New Mercury 1.1 on Lucid Images (Beta)
We're pleased to announce new AWS AMIs for Mercury 1.1 on Ubuntu Lucid. These Beta AMIs were made using the instructions posted here: http://groups.drupal.org/node/70268. AMI IDs after the jump...
Read more
Quickstart 0.5 beta released
Hello,
I just created a torrent for Quickstart 0.5. New in this version:
- New features:
  - New Drupal: Aegir!
    - See ~/quickstart/aegir-install-0.4a7.sh.
    - Not installed by default, pending beta release...
  - New tools: git, bazaar
  - New IDE: eclipse (thanks DrUbuntu for inspiration and ppa)
  - New PHP: imap, uploadprogress
  - New Apache: ssl and solr on by default
- General cleanup
  - Downgrade php5.3 to 5.2 for Aegir and Drupal messages
Apache Solr Multilingual - Non-English and Multilingual Search
We just released the first alpha version of Apache Solr Multilingual, which supports language-specific stemming, synonyms and compound word splitting. There's still a lot to do, but any feedback at this early stage of the project will be helpful.
Read more
Solr in non-english
I am trying to run apachesolr on a site which for now has only French content.
I have attached both the schema.xml I use and the query results from solr for a query on the word "Vidéocassettes".
From what I have gathered so far, I assumed that the following filter
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
Read more
Telling Solr to ignore certain patterns in indexed fields when querying them
We need to implement html node titles with bbcode (as per http://drupal.org/node/28537) for a client site that's using ApacheSolr. The titles need to be displayed in search results with their html intact so the bbcode version has to get indexed. I need to tell Solr to ignore this code when trying to match queries. For example, if a user searches for "blue smurf" (with the quotation marks), and there's a node with the title "[strong]blue[/strong] smurf" in the index, Solr needs to recognise this as a match.
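The matching behaviour asked for above can be sketched outside Solr. This is only an illustration in Python, assuming simple bbcode tags of the form [tag], [/tag] or [tag=value]; the names `strip_bbcode` and `phrase_matches` are hypothetical and are not part of ApacheSolr or the bbcode module. In Solr itself the equivalent stripping would presumably happen in the field's analyzer chain, so the stored title keeps its markup while the indexed tokens do not.

```python
import re

# Assumption: tags look like [strong], [/strong] or [url=http://example.com].
BBCODE_TAG = re.compile(r"\[/?[A-Za-z][A-Za-z0-9]*(?:=[^\]]*)?\]")


def strip_bbcode(text):
    """Return text with bbcode tags removed and whitespace normalised."""
    without_tags = BBCODE_TAG.sub("", text)
    return " ".join(without_tags.split())


def phrase_matches(phrase, title):
    """Case-insensitive phrase match against the tag-free version of title."""
    return phrase.lower() in strip_bbcode(title).lower()


# The stored title keeps its markup for display; only matching ignores it.
stored_title = "[strong]blue[/strong] smurf"
print(strip_bbcode(stored_title))                  # blue smurf
print(phrase_matches("blue smurf", stored_title))  # True
```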
Read more
Optimizing Apachesolr for non-english languages
I have done a lot of research on optimizing Apachesolr for non-English languages, especially German. It turns out that search results can be dramatically improved by adjusting Solr's stemming and by breaking up compound words. This can be easily achieved with slight changes to Apachesolr's schema.xml.
This post is about configuring stemming:
http://www.early-dance.de/news/9188-optimizing-apachesolr-non-english-la...
And this post is about compound word splitting, which is needed in languages like German that have long combined words like "Dampfschifffahrt":
http://www.early-dance.de/news/9189-apachesolr-issues-german-and-other-g...
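To make the idea of compound word splitting concrete, here is a minimal dictionary-based sketch in Python. It is not the configuration from the linked posts (which are not reproduced here) and not Solr's own implementation; the function name `split_compound`, the tiny word list and the `min_len` parameter are all assumptions for illustration.

```python
def split_compound(word, dictionary, min_len=3):
    """Split word into known dictionary parts, or return None if impossible.

    Tries the longest possible head first and recurses on the remainder.
    """
    word = word.lower()
    if word in dictionary:
        return [word]
    # Both head and tail must be at least min_len characters long.
    for i in range(len(word) - min_len, min_len - 1, -1):
        head, tail = word[:i], word[i:]
        if head in dictionary:
            rest = split_compound(tail, dictionary, min_len)
            if rest is not None:
                return [head] + rest
    return None


# Hypothetical word list; a real setup would use a full German dictionary.
WORDS = {"dampf", "schiff", "fahrt"}

print(split_compound("Dampfschifffahrt", WORDS))  # ['dampf', 'schiff', 'fahrt']
```

Indexing the parts alongside the full word is what lets a search for "Schiff" find a document that only contains "Dampfschifffahrt".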
Project Mercury Alpha 6: Now With Solr!
I'm happy to announce the 0.6 Alpha release of the Mercury AMI, now including ApacheSolr as the search backend! This is the last piece of major infrastructure we want to integrate into the stack for scalability purposes. You can now move from a single-server install based on Mercury to a best-practice vertically scaled architecture with separate hardware to run front-end cache, application, back-end cache, search and database!
The quickest way to find it is by searching Amazon EC2 for "Pantheon" or "Mercury". The manifest paths for the latest release (in 32-bit and 64-bit flavors) are:
chapter3-storage/PANTHEON-pressflow-mercury-alpha-6.manifest.xml
chapter3-storage/PANTHEON-pressflow-mercury64-alpha-6.2.manifest.xml (back!)
If you'd like to "roll your own" we've updated the wiki instructions page with a new set of instructions for getting Solr up and running as part of the process. Feel free to improve that documentation, as it's definitely a community process.
This will likely be one of the last releases before we move the project into the Beta phase, at which point we'll be focusing on fine tuning and stability, as well as portability onto non-EC2 systems, more so than on new features. If you have ideas for additional things you'd like to see integrated in the stack, please chime in. We're also going to be documenting real-world "how to" use-cases (e.g. "how do I put my existing site on Mercury") in user-friendly detail, so stay tuned for that.
As always, let us know what you think of the release, what you'd like to see in future iterations, and how your experience is in using the stack. There's plenty more to come.
Read more
Architecture question re: huge indexes
I intend to have a Drupal 6 site with some custom node types indexed by ApacheSolr using the XML Schema provided. However, I'd like to add a lot of gigantic indexes to it that reside only in the Lucene/Solr system (ha) and not physically in my Drupal MySQL database. For example, I may have a thousand or even a hundred thousand nodes in Drupal, but I might have ten different "external" indexes with millions of records each, and I'd like to conduct faceted search against the whole lot of 'em.
Read more
Apachesolr and Stemming in other languages
When I perform searches for "tools" and for "tool" with Apachesolr, I get the same results. As far as I know, the reason is Apachesolr's stemmer, which reduces the word "tools" to "tool" and uses that for the search within the index. As far as I know, the stemmer is only aware of English stemming rules.
So how do things work when using Apachesolr with other languages? For example, I tried the German words "Kunden" (plural, "clients") and "Kunde" (singular, "client") and I get different result sets. Obviously, the stemmer doesn't know the German stemming rules ... right?
How can the search results be improved for languages other than English? Are there German stemmers available to plug into Apachesolr?
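The effect described in this question can be reproduced with a toy suffix stripper. The Python sketch below is deliberately naive and is not the stemmer Apachesolr uses; the suffix lists and the name `strip_suffix` are assumptions chosen only to show why English-only rules leave "Kunden" and "Kunde" as different index terms.

```python
def strip_suffix(word, suffixes, min_stem=3):
    """Remove the longest matching suffix, keeping at least min_stem letters."""
    word = word.lower()
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


# Toy rule sets (assumptions); real stemmers use far richer rules.
ENGLISH_SUFFIXES = ("s",)
GERMAN_SUFFIXES = ("en", "n", "e")

# English rules merge "tools" and "tool" ...
print(strip_suffix("tools", ENGLISH_SUFFIXES), strip_suffix("tool", ENGLISH_SUFFIXES))
# ... but leave the German forms apart.
print(strip_suffix("Kunden", ENGLISH_SUFFIXES), strip_suffix("Kunde", ENGLISH_SUFFIXES))
# German-style rules reduce both forms to the same stem.
print(strip_suffix("Kunden", GERMAN_SUFFIXES), strip_suffix("Kunde", GERMAN_SUFFIXES))
```

Two words only land in the same result set when the stemmer maps them to the same term, which is why the language of the stemming rules matters.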
Read more
RDF for Solr: Possible improvements
The Apache Solr RDF module is now in a state where it can, in theory, be used. However, there is much room for improvement, so I'd like to discuss some possible ways to do this.
Read more





