I'm quite busy trying to get the latest (dev) Apache_Solr module to work with our complex multi-site setup, and it all went fine until we had to import a huge phpbb2 forum (28k topics, nearly 1 million comments; some topics have thousands of comments).
Now I'm puzzled: which route should I take? Has anyone successfully implemented something like this?
Off the top of my head, I'd treat each comment as a separate document to feed to SOLR, using a carefully written "indexer" script that extracts both the comments AND the topics, of course (first the topics, then the comments).
Everything else would then be handled in the template.php file and its associated tpls, where I'll get each result, test whether it is a comment or a topic, and, if it's a comment, load the relevant topic to show it alongside the comment itself...
Issues with this approach:
1. How would one treat a comment as a "document", with a bogus $nid?
2. How could I get the parent topic of a comment back directly from the SOLR schema?
Pondering... in the meantime, if anyone has an idea I'll be pleased to listen :)
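One possible way to look at the two issues above, as a rough sketch only: give every Solr document a unique string id built from its type and its database id, so a comment never needs a bogus $nid, and store the parent topic's id in a field on each comment so the parent can be found from the search result itself. The field names used here (`id`, `doc_type`, `topic_id`, `title`, `body`) are hypothetical and would have to be declared in schema.xml; this is not how the apachesolr module does it.

```python
def topic_doc(topic_id, title, body):
    """Build a Solr document (as a plain dict) for a forum topic."""
    return {
        "id": f"topic/{topic_id}",   # unique key, prefixed by type
        "doc_type": "topic",
        "topic_id": topic_id,
        "title": title,
        "body": body,
    }


def comment_doc(comment_id, topic_id, topic_title, body):
    """Build a Solr document for a single comment.

    The parent topic id and title are copied onto the comment, so a
    search result can point back to its topic without a second lookup.
    """
    return {
        "id": f"comment/{comment_id}",  # cannot collide with a topic id
        "doc_type": "comment",
        "topic_id": topic_id,
        "title": topic_title,
        "body": body,
    }


def parent_id(doc):
    """Return the Solr id of the topic a document belongs to."""
    return f"topic/{doc['topic_id']}"
```

With this layout the template only has to check `doc_type` and, for comments, use `topic_id` to load or link to the parent topic.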

Comments
I tested something similar,
I tested something similar, but have never (yet) run it live. I imported a forum that was somewhat bigger than yours: about 5.8 million comments, 600,000+ topics and some 200,000+ users.
I used http://drupal.org/project/apachesolr basically out of the box to see what would happen. It indexed every node and its comments as one document, so you ended up in the right topic when your search matched a comment.
Before I started I was, like you, thinking about indexing every comment as a separate document. But now I'm pretty sure that indexing nodes is better, because topics in which more comments match what you search for get a better score.
I didn't get much further than that. Testing with this amount of data takes time :) Indexing the forum took 3 days on my development machine.
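To illustrate the node-level approach described above, here is a minimal sketch of folding a topic and all of its comments into one document. The field names and the `crude_term_count` helper are made up for illustration; the count is only a stand-in for term frequency and is not Solr's actual scoring, which also takes things like field length into account.

```python
def node_document(nid, title, body, comments):
    """Fold a topic and all of its comments into a single document."""
    return {
        "id": f"node/{nid}",
        "title": title,
        "body": body,
        # All comment text ends up in one searchable field.
        "comments": "\n".join(comments),
    }


def crude_term_count(doc, term):
    """Count whole-word occurrences of a term across the document's text."""
    text = " ".join([doc["title"], doc["body"], doc["comments"]]).lower()
    return text.split().count(term.lower())
```

A topic whose comments keep mentioning the search term accumulates more occurrences in its single document, which is the intuition behind the better score for "busy" topics.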
Using apache_solr with a comment bias
I've built a system like this that is essentially a forum, where the topics are nodes with comments. I now want to give it a search facility. SOLR seems a great way to go, and apache_solr looks like it would help me get much of the way there. In an email discussion I had with Robert Douglass, in which I was considering implementing comments as nodes, he suggested that small tweaks could be made to index comments rather than nodes.
I like kajetan's point about using the relevancy one gets from clustering under a node. But most importantly, I want the results to be individual comments, both for scanning the results and for navigation: I'd like clicking on a result to navigate specifically to that comment, perhaps using an anchor, though I'm not sure how that works with paging.
Looking at the code, it appears fairly node-centric. I want to return specific comments rather than their nodes, and as such I'm considering building a completely new module based on this one. I'd hate to do this though, for the obvious drupally reasons. Before I dive in, I'm wondering if anyone has any thoughts on how best to extend/refactor the code to accommodate this scenario.
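On the anchor-with-paging question, one way it could work is to compute the page from the comment's position in the thread and append the anchor after the page parameter. This sketch assumes a flat (non-threaded) comment order, a zero-based `page` query parameter and a `#comment-<cid>` anchor; those conventions and the `per_page` default are assumptions to check against the actual site, and `comment_url` is a hypothetical helper.

```python
def comment_url(nid, cid, position, per_page=50):
    """Build a link that lands on a specific comment within a paged thread.

    position is the zero-based position of the comment in the thread's
    display order; per_page is the number of comments shown per page.
    """
    page = position // per_page
    # The first page needs no page parameter at all.
    query = f"?page={page}" if page > 0 else ""
    return f"/node/{nid}{query}#comment-{cid}"
```

For example, `comment_url(12, 345, 120)` gives `/node/12?page=2#comment-345`. The position would have to be stored at index time or looked up when rendering the result.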
apachesolr with comment bias
@kcoop, it sounds like you and I have both been investigating a similar revised forum system.
I have been debating the same question: implement comments as nodes, or keep the original comment system. I've read lots of conflicting opinions on the subject, so I would love it if this thread could decide which coding method provides the best search results.
Keep in mind that the solution will need to let the administrator choose which node types with comments return comment-biased results rather than the node, and there will also need to be a way for the user to choose between the comment bias and the original node topic. The ability for the search to anchor to the comment within the paging is a necessity (otherwise it wouldn't be a very helpful search bias). ApacheSolr is great, so we should definitely develop a solution that uses that module/platform.
I can't offer any code advice, because this is outside my area of expertise. However, I can offer to chip in some funds to help see this problem solved and the code implemented.
Some progress? Just stand
Some progress?
I've just run into the same issue: providing a smooth user experience for searching within forums and thread posts (as comments).
example solr xml working fine but my own xml files not working
I am trying to import an XML file into Solr. The import reports success, but no results show up when searching in Solr.
All the example XML files in the Solr home/example docs/ directory work fine, but when I create a new XML file and try to upload it to Solr, it doesn't work.
Can anyone please post the steps to import an XML file into Solr?
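For reference, Solr's XML update format wraps documents in `<add><doc>` with one `<field name="...">` element per value. A rough sketch of generating and posting such a file follows. The update URL is an assumption based on the default example setup and will differ per installation, and every field name must already be declared in schema.xml. A common reason for "imported but no results" is a missing commit, so the usage note sends one explicitly; a query against a field that is not searched by default is another thing worth checking.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Turn a list of dicts into a Solr <add> update message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_xml(xml_text, url="http://localhost:8983/solr/update"):
    """POST an update message to Solr (the URL is an assumed default)."""
    request = urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Usage would be `post_xml(build_add_xml(docs))` followed by `post_xml("<commit/>")`, since documents do not become searchable until a commit has happened.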
Limiting solr index areas
Can we limit the areas (directories) that get indexed on a server?
A re-index ran all day and I think it was indexing a lot of stuff we don't care about.
ONE to MANY relationship in solr
Hi,
I am using Solr to index SQL data. I import the data using data-import.xml, declare the required fields in schema.xml, and use DIH in solrconfig.xml. Up to now it's all well and good.
Now I want to index data from two tables that have a PK and FK relationship. The two tables are Account and purchase history, where Account-id is the PK in the Account table and a foreign key in the purchase history table. When I search using account_id, it has to display all the purchase transactions in the purchase history table (one to many).
I want the XML output in the format below: the parent table record should be displayed once, and all the corresponding child table records should be repeated in an array.
101
MAHA LTD
1012
MAHA LTD comapny
sri
xxxx
sriraj
xxxx
Where should I make changes to get the output above?
Please help me with this issue.
Thanks,
Srinivas
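As far as I know, the usual DIH route for this is to nest a child entity for the purchase history inside the account entity in the data-import configuration and to mark the child fields as multiValued in schema.xml. The resulting document shape can be sketched like this. All column and field names here are hypothetical, since the real ones are not given above.

```python
def group_purchases(rows):
    """Collapse joined (account, purchase) rows into one document per account.

    rows: iterable of (account_id, account_name, purchase_id, description)
    tuples, e.g. the result of joining the two tables on account_id.
    """
    accounts = {}
    for account_id, account_name, purchase_id, description in rows:
        doc = accounts.setdefault(account_id, {
            "account_id": account_id,
            "account_name": account_name,
            # Multi-valued fields: one entry per child row.
            "purchase_id": [],
            "purchase_description": [],
        })
        doc["purchase_id"].append(purchase_id)
        doc["purchase_description"].append(description)
    return list(accounts.values())
```

Each account then appears once, with its purchases repeated as arrays, which is the shape a multi-valued field produces in Solr's response.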
DB| SOLR|DRUPAL
Hi,
I have started working on using Drupal and Solr together.
I have indexed my MySQL database with Solr and I am able to run queries in the Solr admin panel.
Long story short, I want to create a search page in Drupal that searches the records I have indexed with Solr. Could you please share the steps, or a video tutorial or similar, that would help me get this running in Drupal?
Thanks,
Cagan
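Whatever module ends up being used, a search page boils down to the same HTTP request the admin panel makes against Solr's select handler. A rough sketch of that request, written in Python rather than Drupal code, is below. The base URL is an assumed default, and `build_select_url` and `search` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request


def build_select_url(query, base="http://localhost:8983/solr/select", rows=10):
    """Build a Solr select URL that asks for a JSON response."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base}?{params}"


def search(query):
    """Run the query and return the list of matching documents."""
    with urllib.request.urlopen(build_select_url(query)) as response:
        return json.load(response)["response"]["docs"]
```

A Drupal search page would send the equivalent request from a module and render the returned documents in a theme template.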
check this
check this out
http://drupal.org/project/apachesolr_comments