Best way to hide /node pages

Anonymous's picture

I'd like to "hide" my Drupal 6.X site /node/... pages from the search engines, because I am using custom paths. There seem to be several ways of doing this:

(1) robots.txt: adding "Disallow: /node$"
(2) global redirect to 301 the pages
(3) removing node pages from menu system programmatically or making them admin access only?
(4) htaccess?

I'm not sure what the pros and cons are here. My preference is using robots.txt, since I could then still maintain admin access to node/ pages by changing the access callback in the menu system

Any help much appreciated!
Doug

Comments

With /node$ in your

yaph's picture

With /node$ in your robots.txt you block only the exact match, not /node/1 etc. I recommend you use the Global Redirect module (http://drupal.org/project/globalredirect), which does a great job when using custom paths.

--
My Drupal Articles

Yes

matthewv789's picture

Global Redirect with Path (and usually Pathauto) really is the best way - it just takes care of it automatically.

Taxonomy term pages (/taxonomy/term/1) can be redirected manually with Path Redirect, but it won't let you do that for nodes (plus it would be a pain to add every time).

If performance is such an issue that it's worth taking your valuable time to go through the manual process required, you could also add the same redirects to your .htaccess, httpd.conf, or IIS configuration, so that it redirects before even hitting Drupal and the associated overhead. One strategy would be to only do this on the most popular pages on the site, to get a reduction in load without having to manually add redirects for every infrequently-visited node (since Global Redirect will take care of that for you).
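
As a rough sketch of the .htaccess variant, assuming Apache with mod_rewrite, a site installed at the web root, and a hypothetical node/2333 that is aliased to services/web-site-design (the mapping is only an illustration):

```apache
# Place this above Drupal's own rewrite rules so it runs before Drupal is bootstrapped.
RewriteEngine on
# Hypothetical mapping: permanently redirect the system path to its alias.
RewriteRule ^node/2333$ /services/web-site-design [R=301,L]
```

Each popular node needs its own rule, which is why this only makes sense for a handful of pages.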

globalredirect processing "cost"

dougstum's picture

isn't there more overhead with global redirect? additional queries per page?

Every additional module

yaph's picture

Every additional module requires more processing. Global redirect is a very small but very beneficial module. You can look at the code here:
http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/globalredir...

Global redirect calls the drupal_get_path_alias() function which in turn calls drupal_lookup_path(). drupal_lookup_path() queries the path alias from the database. There is additional processing cost, but it pays off.
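
The idea can be sketched in plain PHP. This is not Global Redirect's actual code: the function name and the $aliases array (standing in for the path alias lookup in the database) are made up for illustration.

```php
<?php
/**
 * Decide whether a request should be redirected to its alias.
 *
 * $aliases is a hypothetical stand-in for the alias lookup
 * (system path => alias). Returns the alias, or NULL if no redirect is needed.
 */
function example_canonical_target($requested_path, array $aliases) {
  $requested_path = trim($requested_path, '/');
  if (isset($aliases[$requested_path]) && $aliases[$requested_path] !== $requested_path) {
    return $aliases[$requested_path];
  }
  return NULL;
}

// Hypothetical alias, borrowed from the example URL elsewhere in this thread.
$aliases = array('node/2333' => 'services/web-site-design');

$target = example_canonical_target('/node/2333', $aliases);
if ($target !== NULL) {
  // A real module would issue a 301 redirect here instead of printing.
  echo "301 -> /" . $target . "\n";
}
```

A request that already uses the alias gets NULL back, so the only extra cost on those pages is the lookup itself.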

If you only disallow paths in robots.txt, they will still be accessible to visitors. Changing robots.txt has another drawback: since the file ships with Drupal core, any changes will be overwritten when you update your Drupal installation to a new version.

--
My Drupal Articles

one more question

dougstum's picture

this is really helpful, thank you.

one more question--what about making all node/ pages accessible only to admin using the hook_menu system? can search robots "see" content that is restricted to authenticated users?

Thanks!
Doug

I use Global Redirect. It

Z2222's picture

I use Global Redirect. It can handle a lot of traffic, even on shared hosting as long as caching is on. Does Global Redirect get called for every page, or just for /node pages? If you don't have links to /node pages, there won't be a redirect.

Path Auto

davidwebguy's picture

Maybe I misunderstand, but you should have an alias set up for all your URLs, so instead of commonplaces.com/node/2333, or whatever, it would be:

http://www.commonplaces.com/services/web-site-design

Following the proper hierarchy.

On huge sites you can use the Pathauto module for this, so it will automatically build the alias from the title etc. http://drupal.org/project/pathauto

IMHO this is the best way to do it. Search-engine-friendly URLs are one of the best things you can do for on-site SEO these days.

True, but it is equally

Mike_Waters's picture

True, but it is equally important to remove the unaliased system URL (node/N) from visibility so that your content is not indexed twice.

Doesn't it get redirected

davidwebguy's picture

Doesn't it get redirected automatically?

I thought it did but maybe I'm wrong.

redirect

Z2222's picture

If you're using Global Redirect, it's automatic.

--
My Drupal Tutorials

Path redirect

kristen pol's picture

With pathauto, you also need to install the path redirect module: http://drupal.org/project/path_redirect and then configure pathauto to use the "create new alias, redirect from old alias" setting. I wrote up something about this before... check item (g) at: http://www.kristen.org/content/drupal-pathauto-url-aliases-settings

Good luck,
Kristen
http://kristen.org

I agree. Global Redirect is a

excellira's picture

I agree. Global Redirect is a staple.

Also, if going the robots.txt route to disallow a directory you would add:

Disallow: /node/
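
For comparison, a minimal robots.txt sketch showing both patterns discussed in this thread. It assumes a crawler that supports the $ end-of-URL anchor, which the major search engines do but not every crawler does:

```text
User-agent: *
# Matches only the exact path /node (the $ anchors the end of the URL).
Disallow: /node$
# Matches /node/1, /node/2333, and everything else under /node/.
Disallow: /node/
```

Either way, this only asks crawlers not to fetch those paths; the pages still load for anyone who requests them.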

I use hook_menu_alter. This

abendy's picture

I use hook_menu_alter. This returns a real 404--

/**
 * Implementation of hook_menu_alter(): drop the default /node page so it returns a 404.
 */
function disablenode_menu_alter(&$items) {
  unset($items['node']);
}

Alternatively, this will return a 403, but I prefer the first method--

/**
 * Implementation of hook_menu_alter(): deny access to the default /node page (403).
 */
function disablenode_menu_alter(&$items) {
  $items['node']['access callback'] = FALSE;
}

Redirect, not 404

matthewv789's picture

We don't really want a 404 (not found).

Normally, those would be valid URLs for Drupal and there may be reasons to try to visit them. We want to have those URLs work, and show the correct content, but only respond using the "canonical" path, not the /node/# path. So we don't want to show a 404 for /node/#, but rather a 301 (permanent) redirect to the preferred path. That's what Global Redirect does, and it also doesn't require any PHP coding.

(Adding a "canonical" meta tag to your theme might help, too.)
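
The canonical hint is normally written as a link element in the page head. A minimal sketch of building one in plain PHP, where the helper name and the example URL are assumptions for illustration rather than part of any Drupal API:

```php
<?php
/**
 * Build a rel="canonical" link tag for a page (illustrative helper).
 */
function example_canonical_link($base_url, $alias) {
  $href = rtrim($base_url, '/') . '/' . ltrim($alias, '/');
  return '<link rel="canonical" href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '" />';
}

// Hypothetical values: the same tag would be printed on both node/2333 and its alias.
echo example_canonical_link('http://www.commonplaces.com', 'services/web-site-design'), "\n";
```

The tag is only a hint to search engines, so it complements the 301 redirect rather than replacing it.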

Yeah, sorry I misread your

abendy's picture

Yeah, sorry I misread your question. What my post above achieves is hiding the default node page-- domain.com/node

how to solve

santad's picture

Hi friend, I need a little help with an event calendar. I have built an event-based site, and the requirement is that when a user clicks an event on the calendar, it should open the file directly. It does not need a pop-up window or a node page, because only a Word file is uploaded, with no description or anything else. Is this possible in Drupal or not? Please tell me. This is my email ID: daleiganesh1992@gmail.com

Hide all node pages

plan9's picture

I use Global Redirect and robots.txt to deal with general node/ pages. But for the default node page I use the Rules module to redirect to my preferred home page. I'm already using the Rules module for generating page-specific user messages, so it's not really any more overhead.

see15_aug's picture

Hi,
I'm using mCustomScrollbar in my custom theme, and it is working well. But when I'm editing a node and try to expand all the collapsible sections, the page stops expanding at the "Revision Information" link and the rest of the content is not displayed, which means the scrollbar is not scrolling the page. If I collapse all the collapsible links, it works fine, even when the page content is very long. The problem occurs only when I open the collapsible links.
Please help.

Thanks

How to hide the First URL

ss54's picture

I am not quite sure whether my issue belongs here. I am using the following two PHP header statements to switch to another website:

<?php
// Send a permanent redirect to the .biz site, then stop the script.
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.baghdadbusinesscenter.biz/");
exit;
?>

The issue is that the script above is reached through http://baghdadbusinesscenter.org and redirects to the .biz website, whose URL is then displayed in the address bar. Is there a way to hide it, or to have http://baghdadbusinesscenter.org displayed instead? Thanks for any advice.

I'm confused

excellira's picture

You could change the site to .org and set up a domain forward or redirect at the registrar from .biz to .org (confirm that it is a 301, though; some registrars use a 302, which is destructive). There is some PageRank leakage with a 301 redirect, but it is minimal, and if this improves your brand it may be worthwhile. One thing to avoid is domain masking, which will cause indexation issues.

Let's explain further

ss54's picture

I made an index.html file which contains the two header statements:

<?php
// Send a permanent redirect to the .biz site, then stop the script.
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.baghdadbusinesscenter.biz/");
exit;
?>

When I call this index file from the .org host, naturally I get the .biz site from another host. What I want is to hide the .biz address so that people believe the content is still coming from the .org site. Is that possible, and if so, how would it be done technically? Thanks

You could just set the DNS to

excellira's picture

You could just set the DNS to point to the same location, but essentially you'd have two domains with the same content. This could result in search engines filtering the duplicate content and in poor indexation, since you could end up with some pages indexed on one domain and others on the other. It creates a mess. You could counteract that with a cross-domain rel canonical.

I do not have access

ss54's picture

I do not have access to the domain administration for the .org; otherwise I could have done it. I only have FTP access to the .org hosting area. Thanks

I DONT WANT TO Pop-up windows

santad's picture

Hi J. Cohen, I would like some help with an event calendar. My problem is this: I made an event calendar, and when I view my events on the calendar a pop-up window opens, but I don't want pop-up windows.

A few years late

kruser's picture

Hi,
Here is a blog post I wrote about hiding /node with links to some modules. Better late than never :)
https://www.drupalaid.com/blog/3-things-you-should-hide

Search Engine Optimization (SEO)
