Apachesolr and Stemming in other languages

ducdebreme

When I search for tools and for tool with Apachesolr, I get the same results. As far as I know, the reason is Apachesolr's stemmer, which reduces the word tools to tool and uses that form to search the index. I also believe the stemmer only knows English stemming rules.

So what happens when Apachesolr is used with other languages? For example, I tried the German words Kunden (plural, "clients") and Kunde (singular, "client") and got different result sets. Apparently the stemmer doesn't know the German stemming rules, right?
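
To illustrate what I think is happening, here is a toy sketch in Python. It is not the algorithm Solr actually uses: the function toy_stem and both suffix lists are made up for this example, and real stemmers apply far more rules. It only shows why English-style suffix stripping merges tools and tool but keeps Kunden and Kunde apart, while a German-aware rule set would merge them.

```python
# Toy suffix-stripping stemmer. NOT Solr's real stemmer: the rule lists
# below are simplified assumptions chosen only to illustrate the effect.

ENGLISH_SUFFIXES = ("s",)            # assumed rule: strip a plural "s"
GERMAN_SUFFIXES = ("en", "n", "e")   # assumed rules: strip common German endings


def toy_stem(word, suffixes, min_stem_len=3):
    """Lowercase the word and strip the first matching suffix,
    keeping at least min_stem_len characters of stem."""
    lowered = word.lower()
    for suffix in suffixes:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_stem_len:
            return lowered[: -len(suffix)]
    return lowered


if __name__ == "__main__":
    for word in ("tool", "tools", "Kunde", "Kunden"):
        print(word,
              "| english rules:", toy_stem(word, ENGLISH_SUFFIXES),
              "| german rules:", toy_stem(word, GERMAN_SUFFIXES))
```

With the English rule, tools and tool end up as the same index term, but Kunden and Kunde stay different, which matches the differing result sets I see. With the German rule list, both German forms reduce to the same stem.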

How can search results be improved for languages other than English? Are there German stemmers that can be plugged into Apachesolr?


janusman - Wed, 2009-08-19 15:04

See http://drupal.org/node/463886 in the Issue queue... please contribute to the discussion!


ducdebreme - Sun, 2009-10-18 15:05

I have some new solutions on this topic: see "Optimizing Apachesolr for non-english languages".