Adding RDF Support to the ApacheSolr module


Project information

Project page on ApacheSolr RDF Support
Student: Thomas Seidl (drunken monkey on d.o)
Mentor: Robert Douglass (robertDouglass)
Co-mentor: Stephane Corlosquet (scor)
Local mentor: Wolfgang Ziegler (fago)

Current status: Adding features


This project will improve the ApacheSolr module by enabling it to handle (i.e., index and search with a comfortable UI) any kind of RDF data. This will instantly make it possible to provide meaningful searches for all site content that isn't node-centric, as well as content from anywhere else on the web. Only an RDF class description and a way to access the data would have to be provided (apart from the normal Solr requirements) and the module would automatically do the rest of the work.


Goal and Deliverables

The result of this project will be a module (probably shipped as a contrib module of the apachesolr module) that adds RDF support to the ApacheSolr integration. It should enable at least the following use cases:

  • Index, and consequently search, non-node-related Drupal data (e.g. users) with Solr by using an RDF representation
  • Upload arbitrary RDF data (maybe working together with the rdf module) and let the module provide a search for it
  • Let the module index and provide a search for any RDF repository
  • Let multiple Drupal sites all send RDF data for indexing to one Solr server and provide a multi-site search for this data
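
To make the first use case more concrete, here is a rough Python sketch of how RDF triples describing users might be flattened into documents that a Solr server could index. This is only an illustration, not the module's actual design: the function names, the "rdf_" field prefix, and the example.com URIs are all made up for this example, and the real module would be written in PHP against the apachesolr API.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"


def local_name(uri):
    """Return the part of a URI after the last '#' or '/'."""
    return uri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]


def triples_to_documents(triples, rdf_class):
    """Group triples by subject and build one flat document per
    subject that is typed as rdf_class."""
    by_subject = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        by_subject[subject][predicate].append(obj)

    documents = []
    for subject, properties in by_subject.items():
        if rdf_class not in properties.get(RDF_TYPE, []):
            continue  # Skip resources of other classes.
        doc = {"id": subject}
        for predicate, values in properties.items():
            if predicate == RDF_TYPE:
                continue
            # Hypothetical naming scheme: "rdf_" + predicate local name.
            doc["rdf_" + local_name(predicate)] = values
        documents.append(doc)
    return documents


# Made-up sample data: one user and one unrelated resource.
triples = [
    ("http://example.com/user/1", RDF_TYPE, FOAF + "Person"),
    ("http://example.com/user/1", FOAF + "name", "Alice"),
    ("http://example.com/user/1", FOAF + "mbox", "mailto:alice@example.com"),
    ("http://example.com/node/7", RDF_TYPE, FOAF + "Document"),
    ("http://example.com/node/7", FOAF + "name", "Some page"),
]

docs = triples_to_documents(triples, FOAF + "Person")
```

Each resulting dictionary has an "id" plus one multi-valued field per predicate, which is roughly the shape a Solr document needs. Selecting subjects by their RDF class mirrors the idea that only a class description and a way to access the data have to be provided.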

Leo Sauermann adds some input: we built an RDF index of crawled data in Lucene and Sesame with the "LuceneSail". Maybe this is interesting for you. It is related to Aperture, which is something similar to Solr.

Project schedule

Since my semester doesn't end until the beginning of July, most of the actual coding for my project will have to be done in the last one and a half months. The first month will mostly be dedicated to planning, research and a bit of prototyping.
Also, since my project is mostly R&D, the concrete milestones could change a good deal in the course of the project.