WebSockets

cboden's picture

Last week I "bumped into" @Crell at a conference and we had a nice discussion about WebSockets and their possible inclusion in Drupal. At @Crell's request I'm doing a follow-up here in the WSCCI group about WebSockets, their usage, and their current state and future.

About WebSockets:

WebSockets are an HTML5 component that allows stateful, full-duplex communication between a server and client (web browser) through a low level TCP socket. This is not your grandfather's internet.
"Stateful" and "full-duplex" imply a few big things.
First, this feature will allow developers to create real-time, event driven applications with small network footprints.
Second, despite being touted as an HTML5 feature, application development is as much (if not more) server-side development as front end.
The browsers are already ready. Chrome, Firefox, Safari (including iOS), Opera, and IE10/HTML5Labs support WebSockets. For older browsers there is a wonderful polyfill library out there (https://github.com/gimite/web-socket-js) that implements the WebSocket protocol over Flash Sockets.
That leaves the server side, which isn't as on the ball as the browser makers.

A Shameless Plug:

Introducing Ratchet (https://github.com/cboden/Ratchet). Ratchet has been my pet project for the past couple of months. It is a loosely coupled component library for building socket-based applications on PHP5.3 (PSR-0 compliant). Its core architecture utilizes a Decorator pattern, with a Server and WebSocket component implementation. At this time your WebSocket application would have to be run from a shell. Once Apache/Nginx/etc catch up, the idea is to drop the Server component and hopefully the WebSocket component, leaving your application stack still working. As @Crell said, it's a "Node replacement" for PHP. Ratchet is in a pre-alpha stage right now, but working* in all browsers. The WebSocket implementation works with the browser implementations but needs more work to meet the IETF standard.

Drupal Integration:

Implementation ideas could do amazing things. Upon saving a node a real-time update could be applied on other viewers' browsers. WebSockets make this fast and easy (programming is different, event-driven, vs traditional long-polling and mucky database querying). The question is how? This is what the discussion needs to be. Currently, my thoughts were running a simultaneous Ratchet stack adding Drupal components in at runtime. With Drupal moving to a loosely coupled, PSR-0 model, a user plucking components and using elsewhere would be easy. One other thing to consider: by nature WebSockets are RPC based, which goes against the REST movement. However, my next task is implementing a WebSocket Application Messaging Protocol (WAMP) component. It's a JSON-RPC sub-protocol that I believe can wrap REST structured requests, making the Drupal routing integration easier.
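To make the idea of an RPC message wrapping a REST-structured request a little more concrete, here is a minimal Python sketch. The envelope layout (`id`, `method`, `params` with `verb`, `path` and `body`), the `ROUTES` table and the `get_node` handler are all hypothetical; this is not the WAMP wire format, a Ratchet API or Drupal's router, only an illustration of the shape such a bridge might take.

```python
import json

# Hypothetical handler and route table: (HTTP verb, path) -> handler.
def get_node(body):
    return {"nid": 1, "title": "Hello"}

ROUTES = {("GET", "/node/1"): get_node}

def handle_rpc(raw_message):
    """Unwrap a JSON-RPC style call and dispatch it like a REST request."""
    call = json.loads(raw_message)
    params = call.get("params", {})
    handler = ROUTES.get((params.get("verb"), params.get("path")))
    if handler is None:
        reply = {"id": call.get("id"), "error": {"code": 404, "message": "Not found"}}
    else:
        reply = {"id": call.get("id"), "result": handler(params.get("body"))}
    return json.dumps(reply)

# One message as it might arrive over an open socket (assumed layout).
request = json.dumps({"id": 7, "method": "rest", "params": {"verb": "GET", "path": "/node/1"}})
print(handle_rpc(request))
```

The `id` field lets the client match a reply to its call, which matters because several calls can be in flight on the same connection at once.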

What Now?

This has been an incredibly un-technical briefing. If you have any technical WebSocket or Drupal integration questions, I'll try to answer any of those in this thread. If you're interested in trying out WebSockets in PHP please check out Ratchet; I would love a new perspective, input, feedback, and of course contributors. If you have any issues/questions about using Ratchet, hit me up in Google Groups (https://groups.google.com/group/ratchet-php).

Cheers.

Comments

Approach moving forward

Crell's picture

Thanks, Chris!

I think the key question is how we want to approach web sockets in Drupal. One option is to try and support them from within PHP, like Ratchet. The benefit there is that we have access to all of the usual Drupal code libraries (at least once those are more cleanly factored), and everything is in only one language which makes life easier for developers.

The disadvantage is that, really, PHP is not designed for that sort of thing. The entire runtime is built around the expectation of short-lived request/response cycles. When I try to use PHP for CLI work even other PHP devs yell at me. ;-) Effectively daemonizing it could run into all sorts of scalability questions in the engine itself. In that case, it may be better to standardize on, say, really beefing up our Node.js integration and making that our answer to Drupal-and-websockets.

If we decide to go with the first option, then from what I have seen Ratchet looks like a good foundation. It's definitely still pre-alpha but the approach looks sound. If we decide to go with the second option, we'd need to decide what to hitch our cart to, so to speak, and how we would want to communicate with it.

Parallel Implementations

mikeytown2's picture

I would +1 for something like Ratchet. I believe you said one needs 3 implementations of an API; others agree with that idea. Having the resources to pull it off is another matter. If one approach is more "stable" I would pick that as the reference and work on both in parallel. Having a good starting point with both Ratchet & Node.js makes parallel development work possible. Creating our own PHP websocket library is probably a bad idea, but improving one makes this seem less crazy.

The more I learn...

cboden's picture

"The more I learn, the less I know". I've been reading up on The C10K Problem and how Apache, Nginx, and Node handle requests and how they deal with scaling.

I understand what you were telling me at DIG about a messaging/queueing system now. I'm going to look into some implementations (ZF2's Queue lib looks good; it implements DBs, STOMP, MemcacheQ) or develop my own interface and create 3+ adapter classes (looking at RabbitMQ, Apache ActiveMQ and a common in-memory one). This will allow the application to scale horizontally. Has there been any progress on refactoring Drupal 8's Queue?

I recently ran a load test and Ratchet (surprisingly) ran admirably. On a 2GHz dual core x86 with 4GB of RAM - 8 of my co-workers opened a total of 400+ browser tabs as well as 60+ shell scripts that each sent 512 - 10000 (random) bytes per second. The server app (Ratchet's demo Chat app) decoded and re-encoded all messages and sent it to all open connections. So I calculated the server received 342kb per second and sent 150mb back per second (local network). This ran for a couple hours with the single php application consuming ~85% of one processor and 4.2% of the RAM. Between all the spamming everyone was able to use the chat to actually have a real-time conversation in the browser. One thread, memory storage. Not bad?
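As a rough sanity check on those figures, the arithmetic of a broadcast chat can be sketched in Python. The numbers below are assumptions read off the description above: 400 tabs plus 60 scripts taken as exact connection counts (the post says "400+" and "60+"), and "kb"/"mb" taken to mean kilobytes and megabytes.

```python
# Assumed figures from the load test description.
connections = 400 + 60
inbound_kb_per_s = 342

# A chat server re-sends every received message to every open connection,
# so outbound traffic scales with the number of connections.
outbound_kb_per_s = inbound_kb_per_s * connections
outbound_mb_per_s = outbound_kb_per_s / 1000

print(f"{outbound_mb_per_s:.1f} MB/s outbound")  # prints "157.3 MB/s outbound"
```

Under these assumptions the estimate lands in the same ballpark as the reported 150mb per second, which suggests the outbound figure is mostly fan-out rather than protocol overhead.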

To increase performance I'm looking into a couple of things. First, PHP's libevent extension: I will create a new implementation of my SocketInterface that uses streams, which gives me the ability to use libevent instead of socket_select. This is similar to the event loop Node uses and part of what they claim makes it so fast. As well, I'll see about forking the server per processor (maybe driven by the end-user developer). With an AMQ they would be able to communicate nicely.
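For readers who have not seen this style of server, the following is a minimal Python sketch of a single-threaded event loop that relays each incoming message to every connection. It uses Python's standard `selectors` module as a stand-in for libevent; it is not Ratchet's SocketInterface, and `broadcast_loop` and its parameters are names made up for the illustration.

```python
import selectors
import socket

def broadcast_loop(server_ends, max_messages):
    """Wait for readable sockets and relay each message to all connections."""
    sel = selectors.DefaultSelector()  # uses epoll/kqueue/select, whichever exists
    for sock in server_ends:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)
    handled = 0
    while handled < max_messages:
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing arrived within the timeout; stop the demo
        for key, _mask in events:
            data = key.fileobj.recv(4096)
            if not data:
                sel.unregister(key.fileobj)  # peer closed the connection
                continue
            for peer in server_ends:
                peer.send(data)  # a real server would buffer partial writes
            handled += 1
    sel.close()
    return handled

# Two in-process "clients", each tied to the loop by a socket pair.
client_a, server_a = socket.socketpair()
client_b, server_b = socket.socketpair()
client_a.sendall(b"hello")
count = broadcast_loop([server_a, server_b], max_messages=1)
got_a = client_a.recv(4096)
got_b = client_b.recv(4096)
print(count, got_a, got_b)
```

The point of the design is that one process sleeps until the operating system reports activity, instead of polling each connection in turn.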

Another performance option I'm looking at is PHP's Gearman extension to run tasks asynchronously. I'm not sure if I'll incorporate this into Ratchet or recommend it to users with process intensive tasks through WebSockets. I need to do more research first.

Lastly, I'm looking into adding a client/server messaging protocol with an RPC and pub/sub component. This is something Drupal might be interested in.

My company is facing the same dilemma: do we look to Node for I/O, learn a new language, support two languages in 1 project and figure out how to make them talk, or can we make PHP work and re-use all our code? With my latest load test we're leaning back to all PHP. PHP's CLI has come a long way. As I said, I hope the Server component of Ratchet is replaced by a more robust library in the future (written in a better language). For now, it seems to do the job for small to medium sized sites. Although the traditional nature of PHP is short-lived scripts, the nature of WebSockets is long-lived applications. I don't see a way around this (I could be wrong).

What about Drupal? Is the community/project willing to adopt/support two languages for the feature of WebSockets? What kind of complications are acceptable to put into Drupal? As you said an AMQ (type) layer is required. Is SQL an option for small-scale? By 2013 will most/some hosting companies support WebSockets to some capacity? These technical questions aside, I think the first question should be "what purpose should WebSockets (real-time communication) serve in Drupal 8 and/or beyond?".

I don't see Ratchet's Server component as the answer, but perhaps a temporary development solution. As @mikeytown2 said, I agree with the supporting 3 implementations philosophy. With Ratchet I don't want to re-invent the wheel, but bring a standard, easy solution to PHP developers.

NodeJS

fenda's picture

Supporting WebSockets in PHP would indeed not be worth the effort.

NodeJS with websockets is a great combo. Thoughts?

Evidence?

Crell's picture

We need empirical evidence either way. Just saying "X would be great" is not really useful. Why would PHP websockets not be worth the effort? Chris' post above suggests that it might scale better than we expect, and means we can still share code directly between requests and socket-mode communication.

DNode

Crell's picture

This is also an interesting thing to watch:

http://bergie.iki.fi/blog/dnode-make_php_and_node-js_talk_to_each_other/

Would that make bridging Node.js-based websockets to PHP-based Drupal easier?

Started on a dnode-drupal integration module

frega's picture

hi,

i started on a small dnode integration sandbox a little bit back - there are actually two sandboxes that are proof-of-concept quality.

dnode - http://drupal.org/sandbox/frega/1321342 - basic dnode-drupal integration (incl. rudimentary rules integration)
dnode_faye - http://drupal.org/sandbox/frega/1357240 - integrating drupal with the lovely cometd/pubsub server faye[1] (uses i.a. websockets).

dnode is certainly an easy way into the world of asynchronous rpc :)

i've tried to document how to set it up in the INSTALL.txt etc; it can be a bit complicated. if you have questions, don't hesitate to ask here or in IRC :)

websockets, imho, are probably just a "transport" - the most important aspect of all this is that drupal can play "nice" in a distributed, polyglot environment (aka the internet) ... there are obviously a lot of interesting ways / challenges how to integrate dnode/async rpc with the services module and of course the d8 stuff.

[1] http://faye.jcoglan.com/

much as I'd like to play with

kucerar's picture

much as I'd like to play with eventmachine and node, I would try to do the non-blocking IO layer in PHP. Maybe I'd learn something (libevent, C10K etc).

Found these two today while researching libevent:

http://nanoserv.si.kz/

http://toys.lerdorf.com/archives/57-ZeroMQ-%20-libevent-in-PHP.html
http://www.zeromq.org/bindings:php

(0mq worth noting but it is another kettle of fish, people talk about routing 0mq protocol over websockets, conversely nanoserv may still be a good focal point, just want a simple nio layer).

Not sure how much effort it would be, but it doesn't seem too bad in concept. Would be nice if Nginx handled the nio layer entirely, but I've only seen example of it proxying back to a special daemon (nio server in whatever language).

Nginx IO

cboden's picture

Thanks for the links @kucerar! The 0mq + libevent looks especially interesting.

On the note of Nginx handling IO; I've been poking around something very intriguing lately. I've come across a library called AppServer in PHP (AiP).

AiP works by running as a PHP daemon beside Nginx. Your application (such as Drupal) runs once as a long-lived application on top of AiP (as opposed to bootstrapping on every new connection). A connection comes in to Nginx and is passed to the running AiP, handled by your PHP application, and the result is returned to Nginx. To do this frameworks need slight adjustments to work on AiP. At a hackathon someone altered TYPO3 to run on AiP and it reportedly ran 3-4 times faster. With the level of abstraction Symfony2 has, someone made a bundle that makes all of Symfony2 run on AiP.

If I get this working I think this would fix my issue of needing a second server to run Ratchet. A PHP website and Ratchet WebSocket app could run on the same server simultaneously.

I'm currently working on implementing AiP + Symfony2\HTTPFoundation + MidgardAppServerBundle to get Sessions working between any PHP web app and Ratchet.

Looks like there already work

ckng's picture

Looks like there's already work started on this
https://github.com/freudenberg/drupal-on-aip

CK Ng | myFineJob.com

hm, is that a fork of drupal

kucerar's picture

hm, is that a fork of drupal to make it run on AIP?

Not a fork AFAICT. Additional

ckng's picture

Not a fork AFAICT.

Additional files are:
Drupal.class.php
aip.yaml
sites/all/modules/custom/aip

CK Ng | myFineJob.com

not a fork yet

Stefan Freudenberg's picture

This is an experiment in running Drupal on AiP. Actually it would require forking Drupal; most importantly it needs a Session middleware to replace Drupal's session handling. Parts of the bootstrap process should be refactored to take full advantage of running on an application server. In the current bootstrap there's no clear distinction between things that are actually bootstrap and things that must be handled per request (like language detection). The reason is obvious.

During development AiP can handle requests without staying behind a more potent web server like Apache or Nginx. So it would be more like developing rails, django, etc applications. Also something to consider.

your welcome :-) but lookee

kucerar's picture

you're welcome :-)

but lookee here:
https://github.com/kakserpom/phpdaemon

this one just saved my interest in PHP... wow! who needs ruby and the rest--if you can't escape php you can't escape php, that's just the way it is...

Russian-built looks like.

So, if you're running the built-in FPM, looks like phpdaemon could replace that as well (it's also a fastcgi process manager). I think I prefer the phpdaemon over aip, for the moment. Will have to delve into it.

Should we wait for hybi

vivekkhurana's picture

Should we wait for the HyBi standard, which will replace the current WebSocket protocol? https://datatracker.ietf.org/wg/hybi/charter/

WebSocket Versions

cboden's picture

I would recommend so*.

WebSockets have gone through several iterations, but you can think of it like this:

Alpha - Hixie
Beta - HyBi
Final - RFC6455

There's very little difference between HyBi (10+) and RFC. Hixie however, is very different. It's not as verbose, has fewer features and has proven to be vulnerable.
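As one concrete piece of the final version, the RFC 6455 opening handshake has the server prove it understood the request by hashing the client's `Sec-WebSocket-Key` header together with a fixed GUID. A minimal Python sketch of that calculation follows; `accept_key` is a name chosen for the example, and the Hixie handshake, which works differently, is not shown.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# Sample key from the RFC's own handshake example.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```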

Currently only Safari still supports Hixie. IE10 has support for WebSockets (HyBi-10) in beta. For now, the best solution is to use a Flash polyfill that implements the WebSocket protocol (RFC) over Flash Sockets.

Ratchet supports Hixie, HyBi and RFC6455 but can be configured at runtime to disable support for any specific version.

By the time Drupal 8 launches, every browser will implement RFC6455.

HTML5 Server-sent events

jherencia's picture

I think it could be useful to have this in mind too:
http://dev.w3.org/html5/eventsource/
http://www.html5rocks.com/en/tutorials/eventsource/basics/
http://en.wikipedia.org/wiki/Server-sent_events

It seems, correct me if I'm wrong, that there could be some situations in which it would be better to use EventSource rather than WebSockets, especially when the client doesn't have to send messages, and current web servers support this approach.

I just wanted to point this out so we can take a look at it. I know WebSockets would be much better, but in case they can't make it until D9, this looks easier to achieve.

EventSource and WebSockets

cboden's picture

Nice find. EventSource has been getting some attention by browser makers recently.

After much work and research, I don't think WebSockets can have a place in Drupal 8, unfortunately. Every solution I've seen requires at least command line access and a lot of implementations are requiring something like ZeroMQ (which is awesome, btw). Since Drupal's minimum requirements (paraphrasing here) are that it runs on a LAMP (like) stack that's on shared hosting via FTP, WebSockets can't fit in there (right now, or for the foreseeable future).

EventSource might be able to though. Since it doesn't require its own protocol it might fit in nicely. I don't have any experience with them though, so I can't comment for sure.

Not a deal killer

Crell's picture

To be clear, I am completely OK with "WebSockets only work if you have shell access" as a requirement. It's an advanced, cutting-edge feature and it makes total sense that it would only work if you have admin access to your server. That just means we cannot leverage it for any Drupal-core critical functionality.

That said, I'm also OK with WebSockets being something that live in contrib if that's a better place to let it develop. We just need to be mindful of that and ensure that Drupal core makes it possible to write WebSocket support. My gut feeling is that requires the same sort of "pull things apart and inject dependencies" work that we're trying to push elsewhere, but a clearer picture of that would be good.

I am also fine with exploring EventSource, too. My main interest in both areas as far as Drupal 8 is concerned is "what do we need to do to Core to make this easy to do", not necessarily doing it in core this version.

That's good to hear. I'm

cboden's picture

That's good to hear. I'm glad to hear the spirit of "what do we need to do to Core to make this easy to do".

The demo I gave you, Larry, had an application run in parallel on Ratchet with that of a website and I said the best way to integrate was to decouple an app and re-use code on both stacks. While this is still true, I have plans to loosen this constraint. While I believe Ratchet will always have to be run in parallel to a traditional web stack, I'll be writing a component giving developers the option to remove logic from Ratchet, keeping it in their website and notifying Ratchet of changes, in turn notifying clients.

I share your ideals though, in that my first goal with Ratchet is it should be easy for developers to use.

EventSource

Crell's picture

So to continue my previous comment... what would it take to make EventSource support in Drupal "really easy" (regardless of whether it lives in core or not for now)?

The PHP and Node.js examples in the second link seem to suggest that the server side is still treating it like a poll; viz., it connects, sends a response, and then the script exits. Is that just a dumb example, or is that really how you're supposed to code it? Because that makes no sense to me at all. :-) I'd imagine you'd want some sort of streaming response from the server.

Assuming for the moment that you actually do want a streaming response, how would that work in practice, and what would we have to do to make that clean and easy to implement? Chris, any thoughts?

Adding EventSource

cboden's picture

I do believe a streamed response would make the most sense. There is a StreamedResponse class in Symfony's HttpFoundation. For the most part I think this would be easy to code on both sides (PHP/Javascript).
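To show what the server would actually write to such a stream, here is a minimal Python sketch of the Server-Sent Events text format (`event:`, `id:` and `data:` fields, with a blank line ending each message). The helper names `sse_frame` and `stream_comments` and the comment payload are assumptions for the example; in PHP the generator's role would be played by whatever callback feeds the streamed response.

```python
import json

def sse_frame(data, event=None, event_id=None):
    """Format one Server-Sent Events message (text/event-stream)."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # Each line of the payload gets its own "data:" field.
    for part in str(data).split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"  # blank line terminates the message

def stream_comments(comments):
    """Yield one frame per comment; a streaming response would iterate this."""
    for number, comment in enumerate(comments, start=1):
        yield sse_frame(json.dumps(comment), event="comment", event_id=number)

frames = list(stream_comments([{"author": "alice", "body": "First!"}]))
print(frames[0], end="")
```

Because the connection stays open, the server can keep yielding frames as events occur rather than exiting after the first response.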

I think the challenge will be for core developers. With an open connection, resources will have to be carefully managed, as will triggers. If there's an open connection to Drupal, how much memory will be held while that connection is open? My second concern is how the streamed connection will receive an update.

For example, let's say this node has an EventStream for comments. As a user while I'm on this page the EventStream would automatically add new comments to the page without refreshing. On the server streaming side, how does my open connection know when you add a new comment? I would think a listener type interface would sit until it receives a message. Several implementations could exist in Drupal to inform the listener. These implementations could make use of engines like ZeroMQ, Memcache, Shmop, or SQL.
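The listener idea can be sketched in Python as follows. `InMemoryNotifier` and `wait_for_comments` are hypothetical names, and the in-memory queue only works inside a single process; it stands in for the ZeroMQ, Memcache, Shmop or SQL implementations mentioned above, which would be needed for the web request and the open stream to talk across processes.

```python
import queue
import threading

class InMemoryNotifier:
    """Hypothetical backend; other engines could offer the same two methods."""

    def __init__(self):
        self._queues = []
        self._lock = threading.Lock()

    def listen(self):
        """Register a new listener and return the queue it should wait on."""
        subscription = queue.Queue()
        with self._lock:
            self._queues.append(subscription)
        return subscription

    def publish(self, message):
        """Deliver a message to every registered listener."""
        with self._lock:
            for subscription in self._queues:
                subscription.put(message)

def wait_for_comments(subscription, limit, timeout=1.0):
    """Block until `limit` messages arrive or one wait times out."""
    received = []
    while len(received) < limit:
        try:
            received.append(subscription.get(timeout=timeout))
        except queue.Empty:
            break
    return received

notifier = InMemoryNotifier()
subscription = notifier.listen()
# Another thread plays the part of the request that saves the new comment.
threading.Thread(target=notifier.publish, args=("new comment on node 1",)).start()
received = wait_for_comments(subscription, limit=1)
print(received)
```

The open connection sits in `wait_for_comments` using almost no CPU, and whichever code path saves a comment only needs to call `publish`.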

A large benefit of using EventSource over WebSockets is that it can entirely fit within an ecosystem, rather than requiring a parallel process.

Ratchet

jackbravo's picture

Ratchet github page was removed because of IP problems. Is there a new name for the project? Or is the project on a dead end?

A company, who had hired the

cboden's picture

A company, who had hired the company I work for, has tried to claim IP ownership of Ratchet. The issue is currently being handled by the lawyers. While this is happening, I was advised to take Ratchet off github.

I'm told, given the nature of Ratchet (a transport library following an open specification) - we should be able to re-release it, as it was.

My apologies for the inconvenience.

Back

cboden's picture

Ratchet is back on GitHub! Now with two new development branches including functional Symfony2 Session integration and libevent for async I/O

Good luck!

R.J. Steinert's picture

Good luck!

NodeJS Module

ethanw's picture

On the other side of the equation, the Node JS module (http://drupal.org/project/nodejs) implements a Node-backed WebSocket implementation in Drupal. As I understand it it provides a Drupal API implementation to queue messages in a Node server app, and the Node app then dispatches those messages to subscribed clients.

Will nodejs add complexity to infrastructure ?

cloudbull's picture

Drupal is moving into the enterprise, and a new server app like node.js will add complexity to the infrastructure. E.g., Acquia Cloud doesn't have server-side node.js support.
Just wondering whether nodejs is a good direction to go...

I think html5 may be a better option.

Client vs. Server

ethanw's picture

I don't think they are exclusive. WebSockets are an HTML5 standard on the client, and I think the main question is what technology to use to provide WebSockets on the server. While there are PHP options, as Crell has noted PHP just isn't designed for that kind of behavior. NodeJS is a powerful, well-suited option, which feels like a better fit than other languages since JavaScript is already part of the Drupal stack, if not on the server side.

The bottom line is that hosted platforms like Acquia Cloud will need to offer server-side support for WebSockets one way or another, and it likely will need to be at the webserver level, not just Drupal configuration/module installation.

I've got the feeling that if we try to go with a PHP solution we'll be in for a world of pain and poor performance that hamstrings real-time Drupal app development. But that's just a gut reaction, I'd be interested in seeing some numbers about PHP WebSocket performance and hear from some who have tried to use it.

PHP has evolved

cboden's picture

I believe PHP is a viable option. Not all shops, such as the one I work for, want to maintain/train for polyglot systems.

PHP4 could not do this. It's come a long way with process handling, garbage collection and better overall performance. On PHP5.2 I ran daemons for weeks without incident for a high traffic financial corporation. I've read people say they've run daemons for months on PHP5.3.

I/O is the real concern. Node boasts performance gains because of asynchronous I/O. PHP, by default, is not asynchronous, as its original nature was request/response. However, this too has changed. What makes Node capable of async I/O? Nothing that special; it uses an external library to do the work. It's called libev. There is a libev extension for PHP but it's in early alpha. There is another, called libevent, which is ever so slightly slower than libev but 10x faster than traditional socket polling and does asynchronous I/O. I've read people have comparable results with PHP/libevent to that of Node/libev (sorry, can't find link).

I've implemented libevent in my PHP library rather easily and I will be doing performance tests when I get around to it. At this stage the benchmarks I provided in an earlier comment make me happy enough; I feel my time is better spent on features at the moment.

All that said, I also believe when Nginx finally supports HTTP/1.1 there will be no need for PHP to manage I/O directly with clients, it will do so locally to Nginx alleviating these kinds of concerns.

phpDaemon

ethanw's picture

phpDaemon looks like an interesting option: https://github.com/kakserpom/phpdaemon

If there is a mature, well performing PHP option for WebSockets, I'd personally feel okay deploying it. That said, I don't see a lot of case studies or examples online.

I wonder if it would make sense to follow the NodeJS module's architecture somewhat and decouple the WebSocket message server from the Drupal stack, allowing for pluggable WebSocket backends. That would allow sites to stay PHP only or go the NodeJS route as fit their needs and capabilities.

Modular

Crell's picture

My feeling at this point is that our best bet to make Drupal 8 more Websocket friendly is to work toward the same sort of decoupling that you mention; more specifically, we want to make more Drupal systems free-standing and re-entrant; that is, no dependency on HTTP whatsoever, and no globals. All dependency-injected objects.

That way, whatever PHP-based daemonizer or websocket tool we end up recommending we can still load and save nodes, set user settings, etc. without worrying about memory leaks, global pollution, etc.

Yesterday @lsmith retwetted

jherencia's picture

Yesterday @lsmith retweeted this https://github.com/igorw/SocketServer

Maybe it's good to take a look at it.

FYI, this project has moved

igorw's picture

FYI, this project has moved to https://github.com/react-php/react and has gained a bit of a wider scope. In fact, Ratchet has been rewritten to build on top of it.

Always up to date with our community

enfusion's picture

I really like the community to keep abreast of new technologies and their proper use.
http://www.enfusion.es/diseno-web-a-medida/

enfusion This project has moved

Alyen's picture

This project has moved to looks like an interesting option: http://tecnux.net

Beginnings of Drupal 8 + PHP-PM + ReactPHP

kentr's picture

https://github.com/kentr/php-pm-httpkernel/tree/kentr-drupal-bootstrap

There are some issues, but a basic integration is there.

PHP-PM daemonizes PHP and starts up a ReactPHP loop. This is the setup described in "Bring High Performance Into Your PHP App (with ReactPHP)", which looks to be getting good support from the Symfony / Laravel end.

WebSockets component

shopdoccases.com's picture

I just met the WebSockets component for html5 and looks great to implement in future web thanks.