JSONP SPARQL module on d.o

Posted by eiriksm on January 18, 2012 at 8:13pm

Hi.

Just wanted to let you know that I just put up a small and simple new module (a sandbox at the moment) that I would love to get some input on from the brave semantic soldiers in this group.

I still have some work planned on it to make it even more universal, but at the moment it supports most SPARQL endpoints out of the box. It is a module for doing content enrichment from SPARQL endpoints entirely client side (JSONP, i.e. JSON with a callback), so Drupal sites don't have to slow down while waiting for a response from external services.

To put it even simpler: With this you can enrich your nodes with data from DBPedia (or any other SPARQL endpoint).
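For readers new to the "JSON with callback" idea, here is a minimal sketch of how a client-side request to a SPARQL endpoint might be put together. It is not the module's code: `buildJsonpUrl` and `loadJsonp` are hypothetical names, and the sketch assumes the endpoint accepts `query`, `format` and `callback` parameters.

```javascript
// Hypothetical helper (not part of the module): builds the URL for a JSONP
// request. Assumes the endpoint understands `query`, `format` and `callback`.
function buildJsonpUrl(endpoint, query, callbackName) {
  const params = [
    'query=' + encodeURIComponent(query),
    'format=json',
    'callback=' + encodeURIComponent(callbackName)
  ];
  // Append to an existing query string if the endpoint URL already has one.
  const separator = endpoint.indexOf('?') === -1 ? '?' : '&';
  return endpoint + separator + params.join('&');
}

// Hypothetical loader, browser only: injects a <script> tag so that the
// response runs as a call to a temporary global callback function.
function loadJsonp(endpoint, query, onData) {
  const callbackName = 'sparqlCallback' + Date.now();
  const script = document.createElement('script');
  window[callbackName] = function (data) {
    // Clean up the temporary global and the script tag, then hand over the data.
    delete window[callbackName];
    script.parentNode.removeChild(script);
    onData(data);
  };
  script.src = buildJsonpUrl(endpoint, query, callbackName);
  document.head.appendChild(script);
}
```

The page keeps rendering while the browser fetches the script, which is why the Drupal site itself never waits on the external service. Note that a JSONP response is executed as script, so the endpoint has to be one you trust.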

Anyway, as I said, this post is just to let you know that your expert advice and testing would be greatly appreciated.

Have a nice day!

Link to sandbox: http://drupal.org/sandbox/eiriksm/1409172

Comments

Nice job

Posted by scor on January 19, 2012 at 12:31am

This is a nice addition to the suite of RDF/SPARQL modules! I've created a feature request to improve the block forms. Would love to see this module promoted to a full project at some point. Have you thought of using token and entity token? These might save you the trouble of using PHP to extract data from a node. Maybe you could even use entity token to replace the 'field to use for input'?

This is a very interesting

Posted by linclark on January 19, 2012 at 3:24pm

This is a very interesting approach, I hope that you can develop it into a full project.

I'm not as experienced with using JavaScript with Drupal as some people, but it seems to me that there is a potential security problem with the rendering code in ~~jsonp_sparql.inc~~ jsonp_sparql.js, ~line #70.

When you use Drupal's theme layer on the server, the theme functions should filter the output, using check_plain or another appropriate function. I'm very possibly wrong, but it doesn't seem that the module handles filtering for the incoming data. This could open up sites to XSS vulnerabilities if they are requesting data from an untrusted site, or if that site gets compromised.

If someone knows more about proper handling of filtering in JS, please chime in. If this module can reach full project status, I think it could be very useful for a number of use cases.

EDIT: Whoops, meant jsonp_sparql.js, not jsonp_sparql.inc

I think Lin is onto something

Posted by clayball on January 19, 2012 at 3:07pm

I think Lin is onto something. I haven't taken a closer look myself. However, the following pages should help.

http://drupal.org/writing-secure-code

The following is posted on http://drupal.org/node/172169

Preventing XSS

All output to the browser that has been provided by a user should be run through the Drupal.checkPlain() function first. This is similar to Drupal's PHP check_plain() and encodes special characters in a plain-text string for display as HTML.
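As a rough illustration of what that kind of encoding does, here is a stand-in written for this thread. It is not Drupal's implementation: `escapeHtml` is a hypothetical name, and the exact set of characters it escapes is an assumption. On a Drupal page you would call Drupal.checkPlain() itself.

```javascript
// Illustrative stand-in for Drupal.checkPlain(); the character set escaped
// here is an assumption, the real function may cover more.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must come first so later entities are not re-escaped
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}
```

After escaping, a value such as `<script>` is displayed as literal text instead of being parsed as markup by the browser.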

ps. I think the file is jsonp_sparql.admin.inc ??

This is true, and I am aware

Posted by eiriksm on January 20, 2012 at 8:39am

This is true, and I am aware of it. The problem is that this is intended behavior. I am not sure whether I should add an extra permission check, because it is, as you say, open to XSS (I have even tested it, and it could easily be manipulated). The output used to be run through a plain-text check, but that limits the flexibility. Let me explain.

The dilemma is as follows: each column should be easy to customize, so that you can wrap a column label in HTML if you want and make it bold or something. Or, theoretically, wrap the entire column in a div. BUT: I also wanted the extreme flexibility of being able to call a second query on each row. A use case: if you query an endpoint and expect one row back, and this is a DBPedia URI, you could style the column in question something like this:

Drupal.jsonSparqlInit('http://dbpedia.org/sparql?query=QUERY_STRING_HERE_WHERE_URI_EQUALS|column name|&format=json',1)

I know that this is pretty insecure, but maybe if only trusted roles got to do this? It's pretty cool, so I don't want to lose the functionality. Very flexible code equals insecure code in this case, I'm afraid. Ideas to keep it both flexible and secure are very welcome.
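To make the |column name| placeholder idea concrete, here is a minimal sketch of how a row value might be substituted into a second query URL. This is an assumption about the mechanism, not the module's actual code, and `expandPlaceholders` is a hypothetical name.

```javascript
// Hypothetical: replaces each |column name| placeholder in a template with
// the matching value from a result row, URL-encoding the value on the way in.
function expandPlaceholders(template, row) {
  return template.replace(/\|([^|]+)\|/g, function (match, column) {
    return Object.prototype.hasOwnProperty.call(row, column)
      ? encodeURIComponent(row[column])
      : match; // leave unknown placeholders untouched
  });
}
```

Encoding the value keeps a URI returned by the first query from breaking out of the query string of the second one.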

Anyway, thanks for testing and for the constructive feedback. I am planning on applying for full project status at some point, but this is not my day-job project ATM, so development will not be very rapid. On the roadmap is adding support for XML responses with the proxy function, so that would probably take care of compatibility with almost any endpoint. Also, feel free to add feature requests or bugs in the issue queue. Or if someone wants to contribute, that would also be awesome.

interesting point about

Posted by markwk on January 20, 2012 at 8:58am

Interesting point about flexibility vs security. If you end up applying for a full project, the reviewer will definitely jump on this point about XSS.

The XSS vulnerability here

Posted by linclark on January 20, 2012 at 1:06pm

The XSS vulnerability here seems to come from two places... one, a person with permissions to edit the wrapping text could add malicious code and two, the endpoint you are querying could have malicious code in its response. While adding role-based permissions would take care of the first, it wouldn't take care of the second.

The actual string that is returned as a binding in the SPARQL result only needs to be sanitized when it is output to the HTML. I haven't walked through the code to determine this, but I don't think this should affect the flexibility, as you would still have access to the unsanitized SPARQL result.
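A minimal sketch of that idea: escape the binding value only at the moment it is placed into the HTML, and leave both the admin-configured wrapper and the raw result untouched. The names `escapeHtml` and `renderCell` and the `|value|` placeholder are hypothetical; the only thing taken from a specification is that a binding in a SPARQL JSON result carries its string in a `value` property.

```javascript
// Illustrative stand-in for Drupal.checkPlain(); not Drupal's implementation.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical renderer: `wrapper` is markup configured by a trusted admin,
// `binding` is one variable binding from a SPARQL JSON result row.
function renderCell(wrapper, binding) {
  // split/join instead of replace(), so '$' sequences in the value are
  // never interpreted as replacement patterns.
  return wrapper.split('|value|').join(escapeHtml(binding.value));
}
```

Because `binding.value` itself is never modified, the unsanitized string is still available for anything that is not HTML output, such as building a follow-up query.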

good point. The actual column

Posted by eiriksm on January 20, 2012 at 1:27pm

Good point. The actual column value could just pass a check before being injected, and if I also add a stricter permission, there will be no effect on the flexibility.

I'll have a look at it next time I get the time.

Each column should be easily

Posted by scor on January 20, 2012 at 3:58pm

"Each column should be easily customized, so that you can wrap a column label in html code if you want, and make it bold or something. Or theoretically, wrap the entire column in a div if you want. BUT: I also wanted the extreme flexibility that you can actually call a second query on each row."

That's fine as long as you require a high permission for these settings. One could achieve the same kind of XSS with any misconfigured filter (e.g. full HTML). The security advisories process and permissions policy was recently updated to account for the Drupal 7 "restrict access" type of permission, so as long as the new permission for your module has "restrict access" set to TRUE, the security team won't bug you. Alternatively, maybe you could use text formats for the column markup, so that users with a lower permission level would not be able to inject XSS and would be limited to simple HTML markup for columns, while trusted users could use full HTML or equivalent for more flexibility.

i think i pushed a commit

Posted by eiriksm on January 20, 2012 at 9:12pm

I think I pushed a commit with "restrict access" set to TRUE like 10 minutes before you posted that :-).

And thanks for the security link. I guess that means I don't have to trade flexibility for security after all.

It is VERY important to be

Posted by linclark on January 20, 2012 at 9:50pm

It is VERY important to be clear here that the "restrict access" change only addresses the one security vulnerability, not the other.

Eirik, you may already plan to do this, but I just want to make sure it's clear... you will still need to follow the JavaScript coding standards posted by Clay, particularly the part about Preventing XSS, in order to make the module's output secure.

i meant to write it in the

Posted by eiriksm on January 20, 2012 at 10:50pm

I meant to write it in the same comment, but anyway, I also pushed a commit with a check plain on each column earlier today. And as you pointed out earlier, it does not affect flexibility.

Thanks again for the input. Two very small but nice fixes were made based on comments from this thread. I didn't expect this many replies to my post :-)

Hey. Just wanted to drop a

Posted by eiriksm on July 31, 2012 at 4:24pm

Hey. Just wanted to drop a line to say that this was made a full project a couple of months ago. Some changes were made after I started this thread, of course, among them token support as requested by scor (good idea!).

Feel free to use it when a client-side SPARQL need arises (and feel free to request features, report bugs, or submit patches too, for that matter).

Congrats

Posted by scor on July 31, 2012 at 4:42pm

Congrats on making JSONP SPARQL a full project, and on the first 1.0 release. Can't wait to try it out!