Non-node data visualization in Drupal?

aendrew's picture

So, Drupal's great for visualizing nodes — you create a View, grok your data and boom, you're done. But what if you have a massive data source that would create a thousand — ten thousand, a hundred thousand, a million! — nodes if you imported it as-is?

I've used Data API previously and it's really cool — you tell it to pay attention to a random table and it creates a View of it and you can do a lot of different things. Similarly, Sheetnode allows the creation of Google Spreadsheet-like tabular datasets, which can then be fed into Views or whatever.

What other options exist? Would this be a good candidate for one of the NoSQL modules? Which are best for performance? What have you used personally?

Comments

Large datasets have always

sinasalek's picture

Large datasets have always been an issue.
One of the best solutions is not to show that much data at once, because even if you could show it, the human eye and brain can't process it. That is how most spreadsheet applications work: they display only what fits on the user's screen and load the rest on demand. However, they also keep an internal cache in case the data is needed for calculations as well.
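A minimal Python sketch of the load-on-demand idea described above, under some assumptions: an in-memory SQLite table stands in for the large data source, and the names `PAGE_SIZE`, `load_page` and `visible_rows` are hypothetical, not part of any Drupal module. It only fetches the pages that the visible window touches and keeps recently used pages in a small cache.

```python
import sqlite3
from functools import lru_cache

PAGE_SIZE = 50  # hypothetical: number of rows that fit on one screen


def make_demo_db(n_rows: int = 1000) -> sqlite3.Connection:
    """Build an in-memory table standing in for a large external dataset."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
    conn.executemany(
        "INSERT INTO readings (id, value) VALUES (?, ?)",
        ((i, i * 0.5) for i in range(n_rows)),
    )
    conn.commit()
    return conn


CONN = make_demo_db()


@lru_cache(maxsize=32)
def load_page(page: int) -> tuple:
    """Fetch one screenful of rows; repeated requests come from the cache."""
    cur = CONN.execute(
        "SELECT id, value FROM readings ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE),
    )
    return tuple(cur.fetchall())


def visible_rows(first_row: int, last_row: int) -> list:
    """Return rows first_row..last_row (inclusive), loading only the pages they touch."""
    first_page = first_row // PAGE_SIZE
    last_page = last_row // PAGE_SIZE
    rows = []
    for page in range(first_page, last_page + 1):
        rows.extend(load_page(page))
    start = first_row - first_page * PAGE_SIZE
    return rows[start:start + (last_row - first_row + 1)]
```

Scrolling back to a window that was already shown then costs no extra query, which is roughly the role of the internal cache mentioned above. For very deep offsets, `LIMIT`/`OFFSET` gets slow, so a keyed range query on `id` would probably be a better fit.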
Personally, I prefer two methods for data visualization:
- direct: the source doesn't really matter; we get the data from anything, even an XML file, and convert it to a format the visualization system understands
- indirect: the visualization system gets the data itself and renders it. An example is relying on Views. As you may know, the Data module is integrated with Views, which means that any table can be used as the data source
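To make the "direct" route a bit more concrete, here is a rough Python sketch that reads a small XML file and converts it to the kind of labels/values JSON that many charting libraries accept. The XML layout (`row`, `label`, `value`) and the function name `xml_to_series` are assumptions made up for this example; a real source would need its own mapping.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical input: the element names are invented for illustration.
SAMPLE_XML = """
<dataset>
  <row><label>2010</label><value>12</value></row>
  <row><label>2011</label><value>30</value></row>
  <row><label>2012</label><value>21</value></row>
</dataset>
"""


def xml_to_series(xml_text: str) -> dict:
    """Convert <row> elements into parallel label and value lists."""
    root = ET.fromstring(xml_text.strip())
    labels, values = [], []
    for row in root.iter("row"):
        labels.append(row.findtext("label"))
        values.append(float(row.findtext("value")))
    return {"labels": labels, "values": values}


series = xml_to_series(SAMPLE_XML)
chart_json = json.dumps(series)  # hand this string to the charting layer
```

The point is only that the conversion step is independent of where the data lives, so no nodes have to be created for it.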

Re: Indirect -- Data's really

aendrew's picture

Re: Indirect -- Data's really cool; I've done some small-scale viz stuff with it. It seems like one of two approaches for larger data sets at the moment: that, and using Sheetnode (which has similar Views functionality, IIRC). Possibly add the Data Visualization API to the same entity-specific category as Sheetnode.

Data's a cool idea for large visualizations because you have more direct access to the database schema -- however, wouldn't this make it less useful for systems where many users work on visualizations simultaneously? I.e., you have to create a new table in the database (which might have performance impacts with too many -- granted, that hasn't stopped Field API from creating new ones...).

I guess my question is, anyone have any idea which approach is better for performance?

--
ændrew rininsland
news, photos, data, code.
aendrew.com :: @aendrew

Store data in Solr & then visualize

batje's picture

We used Search_api to store node-related data (and other entities) in Solr, and then used facetapi to render the data in a way that allowed us to draw some simple graphs. All filtering is done by search_api, so it performs very well.
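As a rough illustration of why facets map so easily onto simple graphs, here is a Python sketch that turns a Solr-style facet result into (label, count) pairs. It is not how search_api or facetapi work internally. It assumes the flat value/count list that Solr returns for `facet_fields` by default, and the field name `district` and all numbers in the sample are made up.

```python
# Made-up sample in the shape of a Solr facet response (flat value/count list).
SAMPLE_RESPONSE = {
    "facet_counts": {
        "facet_fields": {
            "district": ["Kampala", 42, "Gulu", 17, "Lira", 0],
        }
    }
}


def facet_to_series(response: dict, field: str, min_count: int = 1) -> list:
    """Pair up the alternating values and counts, dropping sparse buckets."""
    flat = response["facet_counts"]["facet_fields"][field]
    pairs = zip(flat[0::2], flat[1::2])
    return [(label, count) for label, count in pairs if count >= min_count]


bars = facet_to_series(SAMPLE_RESPONSE, "district")
```

Each pair is already one bar of a bar chart, and the counting happens in Solr rather than in Drupal.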

You can see the result on Devtrac, a site which tracks development in Uganda.

(Taken from http://www.devtrac.ug/statistics, where you can find more questions that we visualize data for.)

We took up the maintenance of http://drupal.org/project/charts_graphs and its submodules with the aim of first building the graphs we currently have, and then extending it to provide more complex data structures to the submodules that visualize the data. That would let you do things like drill down on data using links inside the graph, have settings forms that are unique to each plugin, and work with multi-dimensional data. Building a simple d3js or other charts_graphs submodule should not be very hard and would be a nice place to start.

If you think that is cool stuff to work on, we have a few projects where we are using (or going to use) this stack, and we are hiring:
http://groups.drupal.org/node/295658
