D7: Static HTML, do I have to hack core? Need help

haojiang's picture

Need help:

I just received an order to build a website using Drupal 7, and the most difficult part is generating static HTML pages for nodes/entities.

1. Boost does not support Drupal 7, so it seems that I have to write a custom module.

2. I found that I have to hack core to accomplish this:
in includes/common.inc, function drupal_deliver_html_page($page_callback_result);
in this function, change the line "print drupal_render_page($page_callback_result);" to the following two lines, and everything is fine:
$s = drupal_render_page($page_callback_result);
generate($s); // custom function, mainly fwrite

3. But the question is: is there any way (a hook, for example) to avoid hacking core?

Comments

ISPTraderChris's picture

Is there a reason they couldn't simply use Varnish (or another reverse-proxy cache) to deliver cached static versions of the HTML pages?

Simple with D7: use hook_exit

mikeytown2's picture

Create a module called "mymodule":

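<?php
/**
 * Implements hook_exit().
 */
function mymodule_exit() {
  // Get the important data: the requested path and the buffered page output.
  $path = request_uri();
  $data = ob_get_contents();

  // Get filename info. The file must live inside the cache directory too.
  $directory = 'cache/' . dirname($path);
  $filename = $directory . '/' . basename($path) . '_.html';

  // Write the page to disk if it has not been cached yet.
  if (!is_file($filename)) {
    if (!is_dir($directory)) {
      mkdir($directory, 0777, TRUE);
    }
    file_put_contents($filename, $data, LOCK_EX);
  }
}
?>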

This code is not production safe. Do not put it on a web-facing server. It's also untested, but it should work. It requires a directory called cache, with 777 permissions, next to your index.php file.
Notes:
http://api.drupal.org/api/function/drupal_page_footer/7
http://api.drupal.org/api/function/drupal_page_set_cache/7
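
One reason the snippet above is not production safe is that the request URI goes straight into a file path. A minimal sketch of a stricter mapping follows, assuming the same cache directory; the helper name mymodule_cache_filename() is hypothetical and is not part of Drupal or Boost.

```php
<?php
/**
 * Hypothetical helper: maps a request URI to a file path under $base.
 * Returns NULL when the URI should not be cached.
 */
function mymodule_cache_filename($uri, $base = 'cache') {
  // Skip URIs with a query string; this sketch only caches plain paths.
  if (strpos($uri, '?') !== FALSE) {
    return NULL;
  }
  $segments = array();
  foreach (explode('/', $uri) as $segment) {
    // Ignore empty segments produced by leading or doubled slashes.
    if ($segment === '') {
      continue;
    }
    // Refuse traversal attempts and anything outside a conservative whitelist.
    if ($segment === '.' || $segment === '..' || !preg_match('/\A[A-Za-z0-9._-]+\z/', $segment)) {
      return NULL;
    }
    $segments[] = $segment;
  }
  // The front page has no segments; store it under a fixed name.
  $relative = $segments ? implode('/', $segments) : 'index';
  return $base . '/' . $relative . '_.html';
}
```

With this helper, mymodule_cache_filename('/node/1') gives 'cache/node/1_.html', while a URI such as '/../settings.php' is rejected and nothing is written.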

Wow, thanks a lot

haojiang's picture

Wow, thanks a lot.
Changing my code now.

Best solution: help mikeytown2

404's picture

Best solution:

Help mikeytown2 port Boost to Drupal 7. :) The best way to go! Give it a try, trackself :)

My half-baked solution

I haven't figured it out for Drupal 7 yet, but the outline is:

  1. Instruct Drupal 7 not to set a cookie for anonymous users.

  2. Use nginx as a reverse proxy.

  3. Use a conf like this:

    server {
        listen       80;
        server_name  example.com;
        #access_log  logs/host.access.log  main;
        access_log   off;
        root   /var/www/pf;
        index  index.html index.php;

        # The add_header directives for cache control must come before
        # 'location ~ .php', which seems strange.
        location ~* .+\.(ico|jpg|gif|jpeg|css|js|flv|png|swf)$ {
            expires max;

            # Ask nginx not to process the Last-Modified header; saves CPU.
            if_modified_since off;

            # Set Last-Modified and ETag to blank so the browser won't bother
            # sending them. I am not sure about this; it doesn't seem to work.
            # A better way is to use the nginx headers_more module to hide
            # these two headers.
            add_header Last-Modified "";
            add_header Etag "";  # nginx doesn't process etag anyway

            proxy_cache cache;
            proxy_cache_key $host$uri$is_args$args;
            # Cache pages with HTTP status code 200 and 304 for 12 days.
            # You can't set it to forever, but you can set it to 10y (10 years).
            proxy_cache_valid 200 304 12d;
            proxy_cache_valid 302 301 12d;
            proxy_cache_valid any 1m;
        }

        location / {
            proxy_pass        http://apache;
            proxy_set_header  X-Real-IP  $remote_addr;
            proxy_set_header  Host $http_host;

            if_modified_since off;
            add_header Last-Modified "";
            add_header Etag "";

            proxy_cache cache;
            # We use Pressflow and brianmercer's nginx_header module.
            # The module is less than 10 lines of code! Sweet.
            # ref: http://groups.drupal.org/node/79714#comment-247474
            proxy_cache_key $host$request_uri$cookie_NO_CACHE;
            proxy_cache_valid 200 304 12d;
            proxy_cache_valid 302 301 12d;
            proxy_cache_valid any 1m;
            proxy_ignore_headers Cache-Control Expires;
            proxy_pass_header Set-Cookie;
        }
    }
    

brianmercer wrote a module for Pressflow called nginx_header; use it as a reference (http://groups.drupal.org/node/79714#comment-247474). He is really helpful, so maybe post your questions about nginx proxy_cache + Drupal 7 in the nginx group on Drupal Groups. I think nginx has a directive to cache pages even when a cookie is present.

If you don't want to deal with cookies, force the reverse proxy to cache everything and log in as admin through a different port.

APACHE TWEAKING

Before you figure out how to cache pages in Drupal 7, it might be good to apply some of the general tweaks recommended by YSlow.

For example, change the default Drupal cache-control settings in .htaccess:

  <IfModule mod_expires.c>
    # Enable expirations.
    ExpiresActive On
    # Cache all files for 2 weeks after access (A).
    # Two weeks is not long enough, so use a much larger value.
    ExpiresDefault A3600000000000000
    # Do not cache dynamically generated pages.
    # Here you can instruct the browser to cache dynamically generated
    # Drupal pages: change A1 to A10000000000.
    ExpiresByType text/html A1
  </IfModule>

  # Unset the ETag and Last-Modified headers globally.
  Header unset Pragma
  FileETag None
  Header unset ETag
  Header unset Last-Modified
  # http://httpd.apache.org/docs/2.0/mod/mod_headers.html#header
  <FilesMatch "\.(js|css|ico|pdf|flv|jpg|jpeg|png|gif|mp3|mp4)$">
    Header unset Pragma
    FileETag None
    Header unset ETag
    Header unset Last-Modified
    Header set Cache-Control "public, no-transform"
    Header set Expires "Mon, 15 Apr 2013 20:00:00 GMT"
  </FilesMatch>

haojiang's picture

You're porting Boost to Drupal 7?
Amazing.
I hope we will have Boost for Drupal 7 later.

finally done

haojiang's picture

Thanks, mikeytown2. Using the method you provided, I now turn all nodes into static HTML.
On this website (http://joke.trackself.com), I only need to make the nodes into static pages.

The following is my trial code:

<?php

/**
 * Implements hook_exit().
 */
function staticnode_exit() {
  staticnode_statichtml();
}

function staticnode_statichtml($end = ".html") {
  global $user;

  // Only cache pages for anonymous users.
  if ($user->uid == 0) {
    // Only cache plain node pages (node/NID) without a query string.
    if (arg(0) == "node" && is_numeric(arg(1)) && is_null(arg(2)) && empty($_SERVER['QUERY_STRING'])) {
      $basicfolder = "cache/" . $_SERVER['SERVER_NAME'] . "/node";
      staticnode_createFolder($basicfolder);
      $file = $basicfolder . "/" . arg(1) . $end;
      if (!file_exists($file)) {
        // The cache file does not exist yet, so create it.
        $s = ob_get_contents();
        $s = $s . "<!--" . date("Y-m-d H", time()) . "-->";
        $f = fopen($file, "w");
        fwrite($f, $s);
        fclose($f);
        $s = NULL;
      }
      else {
        // The cache file already exists; do whatever you like here.
        // On this site, nothing is done.
      }

      return TRUE;
    }
    else {
      // Any other path (an alias, for example): do whatever you like here.
      // On this site, nothing is done.
    }
  }

  return FALSE;
}

function staticnode_createFolder($path) {
  if (!file_exists($path)) {
    // Create the parent directory first, then this one.
    staticnode_createFolder(dirname($path));
    mkdir($path, 0777); // On Windows, the 0777 mode seems to have no effect.
  }
}
?>

php5

mikeytown2's picture

staticnode_createFolder can be nuked; in PHP 5, mkdir() can create the directories recursively:

<?php
$directory = dirname($path);
if (!is_dir($directory)) {
  mkdir($directory, 0777, TRUE);
}
?>

http://php.net/mkdir

Your code looks safe because you're only capturing nodes.
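
One thing haojiang's staticnode code above leaves open is that an existing cache file is never rewritten, so an edited node would keep serving its old HTML. A minimal sketch of one way to handle that follows, assuming the same cache/SERVER_NAME/node/NID.html layout. The helper names staticnode_cache_file() and staticnode_purge() are hypothetical; hook_node_update() and hook_node_delete() are the standard Drupal 7 hooks.

```php
<?php
/**
 * Hypothetical helper: path of the static file for a node ID.
 */
function staticnode_cache_file($nid, $server_name, $end = '.html') {
  return 'cache/' . $server_name . '/node/' . (int) $nid . $end;
}

/**
 * Hypothetical helper: deletes the static file so that the next anonymous
 * request regenerates it. Returns TRUE if a file was removed.
 */
function staticnode_purge($nid, $server_name) {
  $file = staticnode_cache_file($nid, $server_name);
  return is_file($file) && unlink($file);
}

/**
 * Implements hook_node_update().
 */
function staticnode_node_update($node) {
  staticnode_purge($node->nid, $_SERVER['SERVER_NAME']);
}

/**
 * Implements hook_node_delete().
 */
function staticnode_node_delete($node) {
  staticnode_purge($node->nid, $_SERVER['SERVER_NAME']);
}
```

Deleting the file rather than rewriting it keeps the hooks cheap: the page is rebuilt by the existing hook_exit() code the next time an anonymous visitor requests it.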

You're right, thanks again

haojiang's picture

You're right, thanks again.

prototype boost-7.x
