Posted by abendy on February 6, 2011 at 12:35am
I have recently migrated a bunch of sites to a new server and set it up as a multisite config.
I'm interested to know what techniques or methods people use when starting a dev site in their multisite configuration.
I'm particularly interested in how you block these dev sites from search spiders.
Comments
If it's available to the web
If it's available to the web at large, someone will index it. You can, of course, turn on site maintenance for that site, leaving yourself logged in to poke around.
Generally, however, it seems like a good idea to have a local version to hack on, and then update the staging/production site when you're ready.
Indeed. I was hoping there'd
Indeed. I was hoping there'd be a way to password protect or return a 503 error for a specific site. I've searched and so far nothing.
As for developing it locally: a good idea, but for this project I need the client to contribute content at various stages, starting fairly early on.
Cheers
You can use .htaccess and
You can use .htaccess and .htpasswd I guess :)
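One way this could look for a single site in a multisite setup, as a rough sketch only. It assumes Apache 2.4 or later (for the <If> directive) and that authentication overrides are allowed in .htaccess. The hostname dev.example.com and the password file path are placeholders, not anything from this thread.

```apache
# Ask for a password only when the request is for the dev hostname.
# dev.example.com and the AuthUserFile path are hypothetical placeholders.
<If "%{HTTP_HOST} == 'dev.example.com'">
    AuthType Basic
    AuthName "Development site"
    AuthUserFile "/path/to/.htpasswd"
    Require valid-user
</If>
```

The password file itself can be created with Apache's htpasswd utility, for example: htpasswd -c /path/to/.htpasswd someuser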
Are you looking for something
Are you looking for something like robots.txt? Drupal ships with one, so you'd have to hack it to block all search bots and then change it back when the site "goes live."
At my previous job, we kept pre-live client sites on a subdomain of our main server's domain name, like joesflowers.example.com, then switched them over to a proper domain name when they went live. I configured the server to send a "disallow everything" robots.txt file for all requests through *.example.com. This stopped pre-live client sites from occasionally showing up in search results - or, even worse, from sticking around in search results after the site goes live. If you're handy with your web server daemon's configuration options, you may be able to do something similar.
The Boise Drupal Guy!
All the sites are handled by
All the sites are handled by a single robots.txt file, so I'm not sure it's possible to isolate one specific domain.
In my previous job we also did it the same way but they were still all contained in the main 'sites' directory. It wasn't the best solution either because again they were all using the same robots.txt file. Did you have your 'pre-live' sites in the main sites folder?
Solution:
Solution: http://groups.drupal.org/node/21622
There's a module that will implement a unique robots.txt file for each site, or it can be achieved manually with mod_rewrite / .htaccess.
The second comment in that
The second comment in that thread is similar to the approach that I was speaking about above, but I say give the module a look since it will likely be easier to use. Please report back in this thread and let us know how it worked for you.
The Boise Drupal Guy!
The solution in my previous
The solution in my previous link is great if you need unique robots.txt files for every site. I have not tested theirs as I don't need such a solution.
My solution: in the .htaccess file I added the following:
RewriteCond %{HTTP_HOST} domain_under_development\.com$ [NC]
RewriteRule ^robots\.txt$ robots_dev.txt [L]
So, for bots visiting 'domain_under_development.com' the server will be issuing the new robots_dev.txt and its rules. The contents of this new file are:
User-agent: *
Disallow: /
Pretty simple, and I have tested it in Google Webmaster Tools. So if I'm developing a new site I just need to duplicate those 2 lines of code in .htaccess with the new domain, remembering, of course, to remove them once the site is ready to be published.
You can use an alias for each
You can use an alias for each vhost as well for the robots.txt.
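A minimal sketch of the alias idea, assuming mod_alias is enabled and you can edit the vhost definitions. The paths are hypothetical, and the hostname is borrowed from the example earlier in the thread.

```apache
# Dev vhost: serve a "disallow everything" file whenever /robots.txt is requested.
# DocumentRoot and the robots_dev.txt location are assumed paths.
<VirtualHost *:80>
    ServerName joesflowers.example.com
    DocumentRoot "/var/www/drupal"

    # Map the URL /robots.txt to the dev file for this vhost only.
    Alias "/robots.txt" "/var/www/drupal/robots_dev.txt"
</VirtualHost>
```

Live vhosts simply omit the Alias line and keep serving the stock robots.txt from the shared Drupal root.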
The type of setup you're looking for can be done with symlinks and mod_rewrite directives, so that only certain IP addresses can access the site. Drupal 7 also has a new template, so you don't even have to make a new folder.
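The IP restriction part might look something like the following in .htaccess. This is only a sketch: the hostname and the address 203.0.113.10 are placeholders, and it assumes the lines sit after "RewriteEngine on" and before Drupal's own rewrite rules.

```apache
# Only apply to the dev hostname (hypothetical name).
RewriteCond %{HTTP_HOST} ^dev\.example\.com$ [NC]
# Let one trusted address through (placeholder IP); everyone else is refused.
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.10$
# Return 403 Forbidden without rewriting the URL.
RewriteRule ^ - [F]
```

To allow more addresses, add one negated RewriteCond %{REMOTE_ADDR} line per address; the conditions are ANDed, so a visitor is blocked only if none of them match.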
The mod_rewrite example that
The mod_rewrite example that you linked is better. Less need to add extra lines for each vhost.
The "drupalesque" way would
The "drupalesque" way would be to use this simple module to protect your site from the public and search engines: Secure Site
Just to complete the
Just to complete the picture, I have dev sites with Webenabled.com hosting (no endorsement intended). In case I want to select which people can access any of the sites, the hosting panel has an option, "Web Access Lock", which allows me to authenticate users with a browser-based password.