Integration of a No-SQL Database

johnmurray's picture

About Me:

My name is John Murray. I have left introductions on other posts, and you can read about me there if you are interested.

Overview:

No-SQL databases are becoming more popular and gaining traction as cloud and distributed computing become more common in everyday web development. I would like to write an API that lets Drupal interact with and use Cassandra as a database. The short-term goal is an API for use in module development.

Description:

There are currently no non-relational databases used by Drupal (at least none implemented in core). Cassandra is a popular and, for its age, well-tested No-SQL database that is gaining quite a bit of traction. Big companies such as Facebook (obviously), Twitter, and Digg all use Cassandra. Cassandra uses the Thrift API for interfacing with the database, which can be easily implemented through a PHP interface. I would like to extend this API to be closely integrated with Drupal. The initial implementation would be a module that offers an API that can be used throughout the system.

The immediate benefits would go to module developers. They would have the option to store information in a separate database (some modules, such as the CiviCRM module, already give you that option), specifically a Cassandra database. This could be meant for real-time or large-scale systems that are distributed or experience very high volumes of traffic. Using a Cassandra database would lessen the load on the server simply because Cassandra is meant to scale in ways that relational databases cannot.

I would eventually like to integrate the API in such a way that Drupal could function using only a Cassandra database, but that is obviously much further in the future and depends greatly on the community's overall need for Cassandra support and on the future of Cassandra development itself. However, for a CMS that takes pride in being cutting-edge, I would love to see Drupal integrate tightly with a No-SQL database.

Mentors:

None as of right now, as this is a student proposal, but I am always looking for feedback/support/suggestions/etc. ! :-)

Contact Details:

  • Email: murrayj5@nku.edu
  • Phone: +1 502-442-6682

Difficulty:

Medium to Hard

Comments

chx's picture

First of all, while Drupal core does not have nosql integration, the Drupal community is certainly aware of it. The first result when googling drupal nosql contains a link to http://buytaert.net/nosql-and-sql and the third result is actually that video.

Also, there is a MongoDB integration project, and a MongoDB talk is scheduled at DrupalCon San Francisco.

Now, about Cassandra. David Strauss from Four Kitchens is working on a Cassandra API. I saw some code here: https://code.launchpad.net/~davidstrauss/pressflow/cassandra-votingapi. It troubles me that the second page of results when googling drupal cassandra has several pointers to David working on this and your proposal has no trace of it. I am wondering how much work you put into your proposal if you did not google drupal nosql or drupal cassandra. Instead of submitting three proposals really quickly, what about doing some research, asking around in the community and so on? It's early enough to fix your proposals before submitting them to the Google app, but you (and everyone else) must be aware that we can easily find out whether a proposal is sloppy (as shown here) and we will rank those accordingly.

Also, be more careful when throwing around words like "popular" and especially "easily implemented". To the contrary, Cassandra is extremely hard to work with. David is trying to cook up a PHP API that is indeed easy to work with, and the hard API combined with practically zero documentation totally stops it from being popular. Just because Digg, Twitter and Facebook use it does not mean that there are many companies that have the necessary manpower to use Cassandra.

Making Drupal 8 function without an SQL database is certainly something I want to pursue but it's an extremely tough nut to make that possible and keep it performant with the typical SQL installs.

johnmurray's picture

chx,

I did not say that Drupal did not support any No-SQL databases. I simply mentioned that they are not integrated into the core. And yes, it would be great to see this possibly in Drupal 8. I am aware that the Drupal community is aware of No-SQL. To assume otherwise would be idiotic as the developer base for Drupal is fairly large and diverse. I was not assuming, and did not plan for it to sound as if, the Drupal community was oblivious to the No-SQL movement.

As far as Cassandra goes, I did "Google" the terms cassandra drupal, drupal cassandra, drupal nosql, drupal no-sql, etc. I found some good alternatives for other databases. As far as Cassandra and David Strauss go, I found several links pointing to one old Twitter post that said "I'm very happy with my progress tonight on a friendly #Cassandra API for #Drupal and #Pressflow, but it's time for bed." I found this interesting and tried to do some research on it, but I really found nothing. I found a lot of entries that pointed to this link, which really had no information on it other than what I can assume from the name (a voting API that possibly uses Cassandra) and what I could try to discern from the code.

Now yes, maybe I should have taken a little more time and read through some of the (undocumented) code to try to discern what it does. If he is in fact working on a Cassandra API (although a search of the internet doesn't reveal a lot of clear information), then I guess I should get in contact with him to work on it, if he happens to be interested in a little help.

Thanks for your comments, I will get in contact with him.

I was also looking into CouchDB for Drupal. The only work that I have found is some old work on the CouchDB Integration Module (located here: http://drupal.org/project/couchdb). The module hasn't been touched for quite some time (about 8 months or so), and from the usage stats it looks as if about 3 people have downloaded it (me being one of them). However, if you know of some other work going on, I would be interested to hear about it. Otherwise, I'm trying to get in contact with the developer to possibly make the module production ready.

--
Thanks!
John Murray

If you knew about David's work

chx's picture

... then indeed the first step should have been to contact him. Given that his contact form is enabled on drupal.org, he is regularly on IRC and a phone number is given on the Four Kitchens web site, that should not be hard. Given his work, I am not sure whether we have a GSoC's worth of Cassandra work left; he will tell.

The MongoDB project is here: http://drupal.org/project/mongodb. CouchDB indeed saw very little interest -- some time ago both David and I thought it would be a good fit for Drupal, but Mongo and Cassandra are apparently a way, way better fit.

johnmurray's picture

Sounds good. Thank you. Do you think CouchDB "seemingly" failed in the Drupal world because it was not fully developed? I would be interested in working on a No-SQL type project and CouchDB looks as if it could use some work, but like you mentioned about Cassandra, perhaps not enough for a GSoC project.

--
Thanks!
John Murray

Google Summer of Code 2010
