HTTPRL Module: Callbacks & Background Callbacks Now Possible

mikeytown2's picture

I'm pretty excited about some recent work I've done on the HTTP Parallel Request Library (HTTPRL). The thread with all the dev info: http://drupal.org/node/1427958

You can now issue a callback in the event loop (just like node.js), so while your callback function is running, I/O (currently HTTP requests) is still going on in the background, filling up buffers that wait to be read once the callback is done executing.
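To give a rough feel for the event-loop idea, here is a minimal sketch that uses only stock PHP stream functions. It is not HTTPRL's code: `sketch_event_loop` and `sketch_on_complete` are made-up names, and local socket pairs stand in for real HTTP connections (`STREAM_PF_UNIX` assumes a Unix-like host). While a callback runs, data for the other streams keeps collecting in the OS socket buffers and gets read on the next pass of the loop.

```php
<?php
// Hypothetical helper (not part of HTTPRL): read several streams at once and
// fire a callback as soon as each stream has been read to the end.
function sketch_event_loop(array $streams, $callback) {
  $buffers = array();
  foreach ($streams as $id => $stream) {
    stream_set_blocking($stream, FALSE);
    $buffers[$id] = '';
  }
  $results = array();
  while (!empty($streams)) {
    $read = array_values($streams);
    $write = NULL;
    $except = NULL;
    // Wait up to one second for any stream to become readable.
    if (stream_select($read, $write, $except, 1) === FALSE) {
      break;
    }
    foreach ($read as $stream) {
      $id = array_search($stream, $streams, TRUE);
      if ($id === FALSE) {
        continue;
      }
      $chunk = fread($stream, 8192);
      if ($chunk !== FALSE && $chunk !== '') {
        $buffers[$id] .= $chunk;
      }
      if (feof($stream)) {
        fclose($stream);
        unset($streams[$id]);
        // The callback runs inside the loop; unread data for the other
        // streams stays in the OS buffers until the next pass.
        $results[$id] = call_user_func($callback, $id, $buffers[$id]);
      }
    }
  }
  return $results;
}

// Hypothetical callback: just report the body length.
function sketch_on_complete($id, $body) {
  return strlen($body);
}

// Demo: socket pairs stand in for HTTP connections.
$streams = array();
foreach (array('a' => 'first body', 'b' => 'second body') as $id => $body) {
  list($reader, $writer) = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
  fwrite($writer, $body);
  fclose($writer);
  $streams[$id] = $reader;
}
$lengths = sketch_event_loop($streams, 'sketch_on_complete');
```

A named function is used for the callback instead of a closure so the sketch stays within what PHP 5.2 offers.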

The other thing you can do is use what I'm calling a background callback. It allows one to run a function in another HTTP process, either waiting for the return or not. If you wait for the return, it's a way to do a bunch of things in parallel and assemble them back in the parent request. Passing variables by reference does work; just be aware that the children can't coordinate with each other, so this is best used when they manipulate different things. If you do not wait for the return, it's a great way to do a lot of things in the background that don't directly affect the current request.
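One way pass by reference can appear to work across processes is to ship the arguments to the child, let it run the function, and copy the changed arguments back into the parent's variables. The sketch below shows just that round trip, as an assumption about the general technique and not as HTTPRL's actual implementation. All `sketch_*` names are invented, and the "child" is called directly here where a real setup would send the payload in a separate HTTP request.

```php
<?php
// Hypothetical child side: in a real setup this would run in another
// HTTP request and receive $payload in the request body.
function sketch_child_run($payload) {
  $job = unserialize($payload);
  $args = $job['args'];
  // Build an array of references so by-reference parameters can be changed.
  $refs = array();
  foreach ($args as $key => $value) {
    $refs[$key] = &$args[$key];
  }
  $return = call_user_func_array($job['function'], $refs);
  // Send back the return value plus the (possibly modified) arguments.
  return serialize(array('return' => $return, 'args' => $args));
}

// Hypothetical parent side: $args may hold references, e.g. array(&$b, $c).
function sketch_background_call($function, array $args) {
  $payload = serialize(array('function' => $function, 'args' => $args));
  // Stand-in for the HTTP round trip to the child process.
  $result = unserialize(sketch_child_run($payload));
  foreach ($result['args'] as $key => $value) {
    // Writes through any reference stored in $args; plain values only
    // change this local copy.
    $args[$key] = $value;
  }
  return $result['return'];
}

// Demo function with two by-reference parameters.
function sketch_append(&$b, $c, &$d) {
  $b .= ' changed';
  $c .= ' changed';
  $d .= ' changed';
  return 'done';
}

$b = 'b';
$c = 'c';
$d = 'd';
$ret = sketch_background_call('sketch_append', array(&$b, $c, &$d));
echo $b . ' | ' . $c . ' | ' . $d . ' | ' . $ret . "\n";
```

Because each child only sees its own copy of the arguments, two children changing the same variable would overwrite each other when the results are copied back, which fits the advice to have them work on different things.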

What I'll be working on (hopefully in the near future) is another top-level API that allows one to pass in a list of functions and their arguments, set timeouts for each of them, and set other options as well, running all of them in multiple background processes. This opens the door for lots of cool things to happen, and it is really close to becoming reality; I just need to extend the code I already have. What I have right now is SO close to generic multi-process PHP (http://groups.drupal.org/node/119109 http://groups.drupal.org/node/209353). Tie it into the Drupal queue and batch systems and we've got ourselves a really good use case. The cool thing is that this doesn't require anything fancy from PHP; it just works with the standard, already included PHP 5.2 functions.

Something to be aware of: The APIs for these two new features might change in the near future as the result of more testing and development.

Comments

Mark Theunissen's picture

Looks very cool, I wasn't aware such a project existed.

Example code

mikeytown2's picture

The test_callback function needs to be available everywhere so it's best to define it in a .module file. This is a simple demo of what can currently be done.

<?php
  $b = 'b';
  $c = 'c';
  $d = 'd';
  echo $b . ' ' . $c . ' ' . $d . "<br />\n";

  // Setup the options.
  $options = array(
    'method' => 'HEAD',
    'background_callback' => array(TRUE, 'test_callback', &$b, $c, &$d),
  );

  // Queue up the requests.
  httprl_request('http://www.drupal.org/', $options);
  // Execute requests.
  $responses = httprl_send_request();
  echo $b . ' ' . $c . ' ' . $d . "<br />\n";
  echo $responses['http://www.drupal.org/']->background_function_return_value;

function test_callback($a, &$b, $c, &$d) {
  $b .= ' pass by reference test';
  $c .= ' pass by reference test';
  $d .= ' pass by reference test';
  return 'x-cache-hits: ' . $a->headers['x-cache-hits'];
}
?>

Output from the above code:
b c d
b pass by reference test c d pass by reference test
x-cache-hits: 46

As you can see, $b and $d were passed by reference and were thus changed in the parent request, while $c was passed by value and stayed the same. Right now this is tied to an HTTP request, but generalizing it so that any function can be run this way would make it very powerful.

pribeh's picture

@mikeytown2, without understanding the totality of what you're doing, I have a query: have you done any performance tests to see how many concurrent requests you can do via this method (either blocking or non-blocking)?

Testing

mikeytown2's picture

Non-blocking mode can send off 1k requests in about 100ms; whether the target can handle that many requests is another question.

In blocking mode I've limited it to 8 connections per domain and a total of 128 open connections. These limits are variables, so they can easily be changed. How many connections it can handle seems to depend on the server's maximum number of file descriptors.
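For anyone wondering how two limits like that can interact, here is a small sketch of picking which queued requests may start. This is only an illustration under my own assumptions, not the module's code: `sketch_pick_startable` and its parameter names are made up, with the defaults set to the 8 and 128 mentioned above.

```php
<?php
// Hypothetical scheduler step (not HTTPRL's code): given the queued URLs and
// the URLs that currently have open connections, return the queued URLs
// that may start without breaking either limit.
function sketch_pick_startable(array $queued_urls, array $active_urls, $domain_limit = 8, $global_limit = 128) {
  // Count the open connections per host.
  $per_domain = array();
  foreach ($active_urls as $url) {
    $host = parse_url($url, PHP_URL_HOST);
    $per_domain[$host] = isset($per_domain[$host]) ? $per_domain[$host] + 1 : 1;
  }
  $total = count($active_urls);
  $start = array();
  foreach ($queued_urls as $url) {
    if ($total >= $global_limit) {
      // The global cap is reached; nothing else can start right now.
      break;
    }
    $host = parse_url($url, PHP_URL_HOST);
    $count = isset($per_domain[$host]) ? $per_domain[$host] : 0;
    if ($count >= $domain_limit) {
      // This host is saturated; leave the URL queued and try the next one.
      continue;
    }
    $per_domain[$host] = $count + 1;
    $total++;
    $start[] = $url;
  }
  return $start;
}

// Demo: ten requests to one host and one request to another, nothing active.
$queued = array();
for ($i = 0; $i < 10; $i++) {
  $queued[] = 'http://example.com/page/' . $i;
}
$queued[] = 'http://example.org/';
$start = sketch_pick_startable($queued, array());
// Prints 9: eight for example.com plus one for example.org.
echo count($start) . "\n";
```

The per-domain check skips a saturated host without stopping the scan, so one busy domain does not hold up requests to other domains.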

In short it can do a lot of requests in parallel.

Edit:
Also, I've generalized this now, so you can run any function in blocking or non-blocking mode.