Last updated by drunken monkey on Sat, 2009-08-15 01:18
Project information
Project page on drupal.org: ApacheSolr RDF Support
Student: Thomas Seidl (drunken monkey on d.o)
Mentor: Robert Douglass (robertDouglass)
Co-mentor: Stephane Corlosquet (scor)
Local mentor: Wolfgang Ziegler (fago)
Current status: Adding features
Description
This project will improve the ApacheSolr module by enabling it to handle (i.e., index and search with a comfortable UI) any kind of RDF data. This will instantly make it possible to provide meaningful searches for all site content that isn't node-centric, as well as content from anywhere else on the web. Only an RDF class description and a way to access the data would have to be provided (apart from the normal Solr requirements) and the module would automatically do the rest of the work.
References
- Original discussion proposing this project
- Discussion of best implementation strategies
- Discussion of possible improvements for the module
- Issue Queue
Goal and Deliverables
The result of this project will be a module (probably added later as a contrib module for the apachesolr module) which adds RDF support to the ApacheSolr integration. It should at least enable the following use cases:
- Index and consequently search non-node-related Drupal data (e.g. users) with Solr by using an RDF representation
- (Maybe working together with the rdf module) Upload arbitrary RDF data and let the module provide a search for it
- Let the module index and provide a search for any RDF repository
- Let multiple Drupal sites all send RDF data for indexing to one Solr server and provide a multi-site search for this data
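To make the use cases above a little more concrete, here is a minimal sketch of how RDF triples might be flattened into Solr-style documents, one per subject. It is not the module's actual implementation (which would be written in PHP for Drupal); the function names, the `rdf_` field prefix and the example URIs are all hypothetical and only illustrate the general idea.

```python
def field_name(predicate_uri):
    """Turn a predicate URI into a hypothetical index field name.

    Assumption: the local part after the last '#' or '/' is descriptive
    enough; a real mapping would have to handle name clashes.
    """
    local = predicate_uri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]
    return "rdf_" + local


def triples_to_documents(triples):
    """Group (subject, predicate, object) triples into one document per subject."""
    docs = {}
    for subject, predicate, obj in triples:
        # The subject URI serves as the unique document ID.
        doc = docs.setdefault(subject, {"id": subject})
        # Every predicate becomes a multi-valued field.
        doc.setdefault(field_name(predicate), []).append(obj)
    return list(docs.values())


# Made-up example data describing two users.
triples = [
    ("http://example.com/user/1", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.com/user/1", "http://xmlns.com/foaf/0.1/mbox", "mailto:alice@example.com"),
    ("http://example.com/user/2", "http://xmlns.com/foaf/0.1/name", "Bob"),
]
documents = triples_to_documents(triples)
```

Each resulting dictionary could then be sent to a Solr server as one document, provided the schema defines matching (for instance dynamic) fields.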
Input
Leo Sauermann adds some input: we built an RDF index of crawled data in Lucene and Sesame with the "LuceneSail". Maybe this is interesting for you. It is related to Aperture, which is similar to Solr.
Project schedule
Since my semester doesn't end until the beginning of July, most of the actual coding for my project will have to be done in the last one and a half months. The first month will mostly be dedicated to planning, research and a bit of prototyping.
Also, since my project is mostly R&D, the concrete milestones could change a good deal in the course of the project.
- May 23: Community Bonding Period
- Fax my Student Foreign Certification and Proof of Enrollment to Google
- Create d.o project
- Refine scope and timeline
- Add other milestones
- May 31: Start Research
- June 7: Continued research
- June 14: Further research
- June 21: First Prototyping
(By this time, some of my courses will already have ended, so I should be able to begin investing more time in the project.)
- June 28: Further Prototyping
- July 5: Still prototyping
- July 12: Prototyping continuing
- July 19: Don't you wish your prototyping was fun like mine?
- July 26: It gets interesting…
- August 2: Oh, rather little time left!
- August 10: Waaah, only one week to go, I'm sooo dead!
- Finish work on the module
- Develop (a) sensible simpletest(s) for it
- August 17: Wow, how did I do that?
- Clean up and comment code