Posted by Anonymous on January 28, 2011 at 6:53pm
Finally scratched an itch and created a Node.js Integration module last night.
The code is pre-alpha, and will eat your lunch and kill your kittens - please do not use the code unless you are a developer looking to help beat this into shape.
Feedback and co-maintainers welcome!
Comments
Nice!!
Hey,
This sounds exciting. I have looked at node.js many times, though I have not used it in any of my projects. Which use cases are you planning to solve? How does it integrate with Drupal (at what level and for what use cases)? I will be trying this out, but some detail on your plans for the integration would give some background.
Cheers
Dipen Chaudhary
Founder, QED42 http://www.qed42.com Drupal development
Dipen you might also be interested in
... Nodepal - A Drupal integration layer for Node.js developers
This is a module that allows the integration of a Drupal installation with a custom Node.js app. It provides an API so that Node.js developers can directly read and write in Drupal's repository, using constructs like node, user, permission, etc., without having to worry about the underlying implementation and setup of the Drupal installation.
Use Cases:
As far as I know, LimitedList (http://limitedlist.com/) is using Drupal 6 + Nodepal + Node.js in production.
not the approach i want to take
Nodepal is cool, but it's not what I'm interested in pursuing.
Personally, I don't want something that is so tightly coupled to the innards of Drupal.
I have no interest in using node.js to directly query mysql.
I'm interested in building integration over http, where the node.js code is as dumb and as simple as possible.
If done well, the API this module exposes should be easy to use from Wordpress etc, because it's really just a realtime pipe.
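To make the "realtime pipe" idea concrete, here is a minimal TypeScript sketch of what a deliberately dumb node.js layer might keep in memory. It is not code from the module: the names `ChannelHub`, `subscribe`, and `publish` are made up for illustration, and the HTTP and socket plumbing is left out. The assumption is that the CMS POSTs a channel name plus an opaque payload, and the hub forwards it to whoever is connected.

```typescript
// Hypothetical sketch: the hub knows nothing about Drupal (or Wordpress).
// It only maps channel names to connected listeners and forwards payloads.
type Listener = (payload: string) => void;

class ChannelHub {
  private channels: { [name: string]: Listener[] | undefined } = {};

  // Register a connected client on a channel; returns an unsubscribe function.
  subscribe(channel: string, listener: Listener): () => void {
    const listeners = this.channels[channel] || [];
    listeners.push(listener);
    this.channels[channel] = listeners;
    return () => {
      const current = this.channels[channel] || [];
      this.channels[channel] = current.filter((l) => l !== listener);
    };
  }

  // Called when the backend pushes a message; returns how many clients got it.
  publish(channel: string, payload: string): number {
    const listeners = this.channels[channel] || [];
    listeners.forEach((listener) => listener(payload));
    return listeners.length;
  }
}

// Usage: the payload is an opaque string, so the hub never parses CMS data.
const hub = new ChannelHub();
const received: string[] = [];
const leave = hub.subscribe("chatroom-1", (msg) => {
  received.push(msg);
});
hub.publish("chatroom-1", JSON.stringify({ text: "hello" }));
leave();
```

Because the hub treats payloads as opaque strings, new modules or altered data structures on the CMS side would not require changes here, which is the maintainability argument made above.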
What about performance?
From the FAQ:
It would be like getting an F1 formula car (Node.js) and tying a horse (PHP stack) on its rear bumper. The two of them will only go as fast as the donkey :-)
i'm happy that nodepal works for you, and it is fast.
however, i'm not interested in something that is unable to cope with important features of a typical drupal site (like, say, new modules or configuration that alter data structures, or, like, say, not using mysql, or, i don't know, say, caching). i don't want half-baked support for my site, i want complete support, which means keeping the node.js layer dumb, and allowing modules to use it for realtime communication. that's the primary design idea for the nodejs module - allowing scalable realtime features in drupal without eating your server.
if you were to add full support for drupal's power to nodepal, it will get bigger, and bigger, and bigger. and will need to track changes in everything that it supports. no thanks.
as for performance, it's kind of hard to reply constructively to such contextless performance claims.
It's true that one size does not fit all, and I just wanted to make clear in which cases Nodepal should be considered.
As for performance, isn't that the reason you care about Node.js at all?
"As for performance..." no.
"As for performance..."
no. is there anything, anywhere, where i said performance was the key reason for the nodejs module? please, read, consider, then post.
the nodejs module is intended to provide a scalable way to handle realtime communications from drupal. the standard LAMP stack doesn't cope with persistent connections well at all, whereas node.js does.
but just as importantly, i want nodejs integration to be maintainable and as rich and powerful as the modules on the site it integrates with. i have zero interest in writing large amounts of new js every time something new is added to my drupal site.
s/performance/scalability/g
Different use cases, you're writing something for PHP developers. That's great.
I hate PHP. It's grimy and annoying.
Using your integration I would not be able to avoid PHP and would not be able to develop functionality in JS.
So it's just a different approach.
There's no right or wrong, just different stuff for different people.
Would this be accurate:
nodepal = write nodejs apps that use functionality from drupal
nodejs integration = write drupal modules that use functionality from nodejs
this is the most intelligent comment i've seen so far. thanks, Charuru, you got it in one. nodepal completely misses big parts of drupal's functionality, like, say, the theme system, alter hooks, etc, etc, which makes it a total non-starter for those who want the power of drupal with real-time functionality.
the functionality i want to use from node.js is the ability to hold open a socket to many thousands of clients concurrently without tying up an apache child process. node.js is very, very good at this, and is in my opinion easier to work with than twisted, which was the other candidate i considered.
If you want to avoid PHP, avoid Drupal. Drupal's schema is built for Drupal's PHP code and if you're not using Drupal's code you probably don't want Drupal's schema.
As stated above, Nodepal speaks directly to MySQL bypassing Drupal and all of its hooks. For Drupal, that is really broken.
From the code:
So to be able to handle CCK data (critical data for virtually every Drupal site) we would need to re-implement CCK's hybrid storage model and try to parse the serialized php in cache_content to figure out what to load from where? Or hard code it on a per object basis? (I did see exports.getField, which takes a first crack at loading cck fields on demand, but this is extremely brittle and not a real solution, because the storage model of a field can change at any time just by changing the setting on the field or adding the same field to another content type, at which point the method used is completely broken.)
What about input formats? Drupal content is created with the presumption that input formatters will be processed for display ([#somenumber] turned into a title and link to a node on drupal.org, or embedded media being inserted inline, not to mention the security concerns with circumventing the HTML escaping).
I'm not trying to be rude, but while nodepal is a cool proof of concept I'm not sure I can think of a use case where it would be helpful.
Personally, I want to move the chatroom module to use node.js as a backend when I port it to D7.
Polling sucks for realtime, so I want this module to provide a backend for other modules to do realtime stuff without it.
a polling option for chatroom is important, as lots of drupal users rely on shared hosting accounts, which won't have node.js and the other modules installed
regards
Feijó
The reason nodejs exists IS SCALABILITY.
Honestly, I see no point in tying it with Drupal.
Justin, I don't know if you care about performance with chatroom:
But if you do, a "phpless" connection to nodejs is the way to get scalability. Although you did good work with chatroom, scalability is the only thing preventing me and lots of others from using chatroom ATM.
sigh. please, stop with the scalability FUD. a drupal setup that offloads the realtime push functionality to node.js will be able to scale the number of open sockets very well - node.js is very, very good at that.
but the hardest part to scale with LAMP in general, and particularly drupal, is the database layer. nodepal does nothing to make scaling this layer easier. worse, it calls directly into the database, completely bypassing any caching layer.
so, you'd need to build your own caching layer in JS. and choose between completely ignoring drupal's cache system (yay, let's code that up twice just for fun), or try to hook into drupal's cache system (yay, we're now tightly coupled to code in another system). win-win.
what node.js is very, very good at is scaling out IO, particularly network connections. which is awesome, if that's your pain-point. but on a well tuned drupal setup, the PHP heads are compute bound, because something like varnish handles slow client connections and caching static files, and caching via memcached/APC stops most database hits. so, the requests that hit PHP have very low network latency and require immediate cpu-intensive activity to serve.
in fact, often response time is the key, not number-of-concurrent-requests-per-box. in that case, you often want to scale back the number of concurrent connections that hit your php heads.
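As an illustration of "scaling back" concurrency in front of compute-bound PHP heads, here is a small TypeScript sketch of a limiter that lets only a fixed number of tasks run at once and queues the rest. It is a generic pattern, not anything from Drupal or the nodejs module; `ConcurrencyLimiter`, `run`, and the `release` callback are hypothetical names, and each task stands in for one request forwarded to PHP.

```typescript
// Hypothetical sketch: cap how many requests are forwarded at the same time.
// A task receives a release() callback and must call it when it has finished.
type Task = (release: () => void) => void;

class ConcurrencyLimiter {
  private active = 0;
  private queue: Task[] = [];
  private limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Start the task now if a slot is free, otherwise queue it in FIFO order.
  run(task: Task): void {
    if (this.active >= this.limit) {
      this.queue.push(task);
      return;
    }
    this.start(task);
  }

  inFlight(): number {
    return this.active;
  }

  pending(): number {
    return this.queue.length;
  }

  private start(task: Task): void {
    this.active += 1;
    let released = false;
    task(() => {
      if (released) {
        return; // ignore a second release from the same task
      }
      released = true;
      this.active -= 1;
      const next = this.queue.shift();
      if (next !== undefined) {
        this.start(next);
      }
    });
  }
}
```

With a low limit, each request that does reach the backend competes with fewer others for CPU, trading queueing time for per-request response time.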
Both approaches have their use cases
Your push approach makes sense... reminds me of the Juggernaut project (pushing from Ruby, instead of Drupal, to node.js). I can see nodepal being useful for some things as well (using Drupal for initial authentication, for example).
Use case
Hi,
I like the idea of using a REST interface (JSON over HTTP) to access Drupal's data. It allows loose coupling between the Node.js application and Drupal. The Services module could be used to easily expose the needed resources to the Node.js application.
Here is what I think is a use case that can benefit from Node.js integration for performance and scalability. In this scenario, the cost of the full Drupal stack is lowered by sharing a single HTTP request to Drupal's Apache server across multiple client requests.
I implemented a similar solution some months ago. The messages were in-game notifications in a JavaScript poker client application embedded on a Drupal website. The notifications were triggered by events from both the poker backend (a Python/Twisted server) and the Drupal site (friend login/logout, private messages, etc.). The client was initially implemented for direct communication with the Python poker server using HTTP polling. It didn't scale well on Drupal/Apache2 and we had to implement a solution much like this one. Because it was a very specific solution, we didn't use Node.js but a custom micro-HTTP server implemented using phpsocketdaemon doing DB queries (using PDO). For a generic and re-usable solution, I would have pushed for loose coupling between the message server and Drupal using a REST-like interface.
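The "one backend request shared by many client requests" idea can be sketched in TypeScript roughly as follows. This is an illustration under assumptions, not the poker implementation described above: `Coalescer` and `Fetcher` are hypothetical names, and `Fetcher` stands in for whatever performs the single HTTP request to Drupal.

```typescript
// Hypothetical sketch: many clients ask for the same key, but only one
// backend request is in flight per key; every waiter gets the same result.
type Callback = (result: string) => void;
type Fetcher = (key: string, done: Callback) => void;

class Coalescer {
  private waiting: { [key: string]: Callback[] | undefined } = {};
  private fetcher: Fetcher;

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  get(key: string, callback: Callback): void {
    const queue = this.waiting[key];
    if (queue !== undefined) {
      // A request for this key is already in flight; just join the queue.
      queue.push(callback);
      return;
    }
    this.waiting[key] = [callback];
    this.fetcher(key, (result) => {
      const callbacks = this.waiting[key] || [];
      // Clear the entry first so a later get() triggers a fresh request.
      delete this.waiting[key];
      callbacks.forEach((cb) => cb(result));
    });
  }
}
```

The sketch does no caching: once a result has been delivered, the next request for the same key goes to the backend again. Whether to add a short-lived cache on top is a separate design decision.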
Great
So Great!!
Learn everyday
justinrandell: This diagram describes what I think you explained to me at DrupalDownunder Brisbane about how you planned to architect the node.js layer for chatroom. Is that correct?
Which issue nodes on drupal.org are the best place to follow progress on this and possibly test code out?
Bevan/
Bevan - yep, using http requests between node.js and drupal, and putting the 'always on' burden on node.js and the rest on drupal is how the experimental code I've written so far works.
I'm on holiday right now, so i haven't gone much beyond proof of concept yet.
There aren't issues yet, but I'll be creating them when I get back to Sydney next week.
The code doesn't do too much yet, but if you want to play, check out the nodejs module code from CVS and give it a whirl.
Seems like a clever way to push out updates from server to clients...but how would you handle many concurrent PUTs? Seems like inserting or updating the database would be a bottleneck. Or am I missing something?
Nope, you're not missing anything.
Scaling many concurrent database writes wouldn't be solved by using Node.js.
Not entirely sure what your point is though? Any high-write traffic site, with or without node.js has to solve this problem. Are you looking for a (largely irrelevant to node.js integration) discussion about scaling the database layer?
Not really trying to make a point, just trying to understand things more. The main advantage I see in using node.js is that it offloads the weight of concurrent open connections between server-client reducing memory usage, correct? With other methods like polling, lots of users, regardless of how inactive they are, might exceed the server memory limit or cause a "too many connections" type error.
But I see now that latency is a separate issue, largely dependent on how non-blocking the PHP code and the database layer are, and making them non-blocking would require huge amounts of rewriting, not to mention all the libraries they use. Even without node.js, reducing latency would require optimizing lots of steps and caching.
Anyway I think I understand your goals now, and it seems like a useful alternative to long polling / persistent connections.
Subscribing
This is exactly the use case I am trying to solve, avoiding the too many connections associated with long polling and/or the cpu limitations on short polling. Sounds like a positive path forward.