Apachesolr and Stemming in other languages

ducdebreme's picture

When I search for tools and for tool with Apachesolr, I get the same results. As far as I know, this is because Apachesolr's stemmer reduces the word tools to tool before searching the index, and that stemmer is only aware of English stemming rules.

So how does Apachesolr behave with other languages? For example, I tried the German words Kunden (plural, "clients") and Kunde (singular, "client") and got different result sets. Apparently the stemmer doesn't know the German stemming rules ... right?
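To make the suspected cause concrete, here is a minimal Python sketch. Both functions are hypothetical toy rules written for illustration only; they are not the stemmer Solr actually uses, and the names toy_english_stem and toy_german_stem are my own assumptions. The point is just that a rule that strips an English plural "s" leaves Kunden and Kunde as two different index terms, while a rule that knows German endings maps both to one term.

```python
def toy_english_stem(word: str) -> str:
    """Toy English-style rule (assumption, not Solr's algorithm):
    drop a single trailing 's' from words longer than three letters."""
    lower = word.lower()
    if len(lower) > 3 and lower.endswith("s") and not lower.endswith("ss"):
        return lower[:-1]
    return lower


def toy_german_stem(word: str) -> str:
    """Toy German-style rule (assumption, not a real German stemmer):
    strip the first matching ending among 'en', 'e', 'n',
    as long as at least three letters remain."""
    lower = word.lower()
    for suffix in ("en", "e", "n"):
        if lower.endswith(suffix) and len(lower) - len(suffix) >= 3:
            return lower[: -len(suffix)]
    return lower


if __name__ == "__main__":
    # English pair: both forms collapse to the same term.
    print(toy_english_stem("tools"), toy_english_stem("tool"))
    # German pair under the English rule: two different terms remain.
    print(toy_english_stem("Kunden"), toy_english_stem("Kunde"))
    # German pair under the German rule: both collapse to one term.
    print(toy_german_stem("Kunden"), toy_german_stem("Kunde"))
```

If the index and the query both go through the same language-aware rule, a search for either form should match documents containing the other.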

How can the search results be improved for languages other than English? Are there German stemmers available that can be plugged into Apachesolr?

Comments

janusman's picture

See http://drupal.org/node/463886 in the Issue queue... please contribute to the discussion!

ducdebreme's picture

I have some new solutions on this topic; see Optimizing Apachesolr for non-english languages.

Lucene, Nutch and Solr
