Posted by Anonymous on October 1, 2008 at 3:35pm
I'd like to "hide" my Drupal 6.X site /node/... pages from the search engines, because I am using custom paths. There seem to be several ways of doing this:
(1) robots.txt: adding "Disallow: /node$"
(2) global redirect to 301 the pages
(3) removing node pages from menu system programmatically or making them admin access only?
(4) htaccess?
I'm not sure what the pros and cons are here. My preference is using robots.txt, since I could then still maintain admin access to node/ pages by changing the access callback in the menu system.
Any help much appreciated!
Doug
Comments
With /node$ in your
With /node$ in your robots.txt you block the exact match only, but not /node/1 etc. I recommend you use the Global Redirect module (http://drupal.org/project/globalredirect), which does a great job when using custom paths.
--
My Drupal Articles
Yes
Global Redirect with Path (and usually Pathauto) really is the best way - it just takes care of it automatically.
Taxonomy term pages (/taxonomy/term/1) can be redirected manually with Path Redirect, but it won't let you do that for nodes (plus it would be a pain to add every time).
If performance is such an issue that it's worth taking your valuable time to go through the manual process required, you could also add the same redirects to your .htaccess, httpd.conf, or IIS configuration, so that it redirects before even hitting Drupal and the associated overhead. One strategy would be to only do this on the most popular pages on the site, to get a reduction in load without having to manually add redirects for every infrequently-visited node (since Global Redirect will take care of that for you).
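As a rough sketch of the .htaccess variant, assuming Apache with mod_rewrite, Drupal installed at the web root, and a hypothetical node 123 whose alias is about-us (both are placeholders, not taken from a real site), a rule like this placed above Drupal's own rewrite rules would issue the 301 before PHP is ever started:

```apache
<IfModule mod_rewrite.c>
  RewriteEngine on
  # Hypothetical example: node/123 has the alias "about-us".
  # Match the exact path only, so node/123/edit still reaches Drupal.
  RewriteRule ^node/123$ /about-us [R=301,L]
</IfModule>
```

Each redirect like this has to be added, and kept in sync with the alias, by hand, which is why it only seems worthwhile for a handful of high-traffic pages.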
globalredirect processing "cost"
Isn't there more overhead with Global Redirect? Additional queries per page?
Every additional module
Every additional module requires more processing. Global redirect is a very small but very beneficial module. You can look at the code here:
http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/globalredir...
Global redirect calls the drupal_get_path_alias() function which in turn calls drupal_lookup_path(). drupal_lookup_path() queries the path alias from the database. There is additional processing cost, but it pays off.
If you only disallow paths in robots.txt, they will still be accessible. Changing robots.txt has another drawback: since the file is included in Drupal core, any changes will be overwritten when you update your Drupal installation to a new version.
--
My Drupal Articles
one more question
This is really helpful, thank you.
One more question: what about making all node/ pages accessible only to admins using the hook_menu system? Can search robots "see" content that is restricted to authenticated users?
Thanks!
Doug
I use Global Redirect. It
I use Global Redirect. It can handle a lot of traffic, even on shared hosting as long as caching is on. Does Global Redirect get called for every page, or just for /node pages? If you don't have links to /node pages, there won't be a redirect.
Path Auto
Maybe I misunderstand, but you should have an alias set up for all your URLs, so instead of commonplaces.com/node/2333, or whatever, it would be:
http://www.commonplaces.com/services/web-site-design
Following the proper hierarchy.
On huge sites you can use the Pathauto module for this, so the alias is built automatically from the title etc.: http://drupal.org/project/pathauto
IMHO this is the best way to do it. Search-engine-friendly URLs are one of the best things you can do for on-site SEO these days.
True, but it is equally
True, but it is equally important to remove the original, un-aliased URL (node/N) from visibility so that your content is not indexed twice.
Doesn't it get redirected
Doesn't it get redirected automatically?
I thought it did but maybe I'm wrong.
redirect
If you're using Global Redirect, it's automatic.
--
My Drupal Tutorials
Path redirect
With Pathauto, you also need to install the Path Redirect module: http://drupal.org/project/path_redirect and then configure Pathauto to use the "create new alias, redirect from old alias" setting. I wrote up something about this before... check item (g) at: http://www.kristen.org/content/drupal-pathauto-url-aliases-settings
Good luck,
Kristen
http://kristen.org
Contact: https://www.hook42.com/contact
Drupal 7 Multilingual Sites: http://www.kristen.org/book
I agree. Global Redirect is a
I agree. Global Redirect is a staple.
Also, if going the robots.txt route to disallow a directory you would add:
Disallow: /node/
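To make the difference from the /node$ rule in the original question concrete, here is a minimal robots.txt sketch. It assumes a crawler that understands the "$" end-of-URL anchor (Googlebot does; some other crawlers may not), and a real file would keep Drupal's other default rules as well:

```
User-agent: *
# Prefix match: covers /node/1, /node/1/edit, and so on, but not /node itself.
Disallow: /node/
# Exact match: covers only /node, the default front-page listing.
Disallow: /node$
```

Note that this only asks crawlers not to fetch those paths; the pages themselves stay reachable.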
I use hook_menu_alter. This
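I use hook_menu_alter. This returns a real 404:
function disablenode_menu_alter(&$items) {
  // Remove the default /node listing page from the router so it returns a 404.
  unset($items['node']);
}
Alternatively, this will return a 403, but I prefer the first method:
function disablenode_menu_alter(&$items) {
  // Deny all access to the default /node listing page so it returns a 403.
  $items['node']['access callback'] = FALSE;
}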
Redirect, not 404
We don't really want a 404 (not found).
Normally, those would be valid URLs for Drupal and there may be reasons to try to visit them. We want to have those URLs work, and show the correct content, but only respond using the "canonical" path, not the /node/# path. So we don't want to show a 404 for /node/#, but rather a 301 (permanent) redirect to the preferred path. That's what Global Redirect does, and it also doesn't require any PHP coding.
(Adding a rel="canonical" link tag to your theme might help, too.)
Yeah, sorry I misread your
Yeah, sorry, I misread your question. What my post above achieves is hiding the default node page: domain.com/node
how to solve
Hi friend, I need a little help with an event calendar. I have built an event-based site, and the requirement is that when a user clicks an event on the calendar, it should open the file directly. It does not need a pop-up window or a node page, because only a Word file is uploaded, with no description or anything else. Is this possible in Drupal or not? Please tell me. This is my email ID: daleiganesh1992@gmail.com
Hide all node pages
I use Global Redirect and robots.txt to deal with general node/ pages. But for the default node page I use the Rules module to redirect to my preferred home page. I'm already using the Rules module for generating page-specific user messages, so it's not really any more overhead.
Vertical Scroll bar not working with collapsible links
Hi,
I'm using mCustomScrollbar in my custom theme, and it works well. But when I edit a node and try to expand all the collapsible sections, expansion stops at the "Revision Information" link and the rest of the content is not displayed; that is, the scrollbar does not work to move the page upward. If I collapse all the collapsible links, it works fine, even when the page content is very long. The problem occurs only when I open the collapsible links.
Please help.
Thanks
How to hide the First URL
I am not quite sure whether my issue is related here. I am using the following two PHP header statements to switch to another website:
<?php
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.baghdadbusinesscenter.biz/");
?>
The issue is this: the above script is called via http://baghdadbusinesscenter.org, and it redirects to the .biz website above, whose URL is then displayed in the address bar. Is there a way to hide it, or to have http://baghdadbusinesscenter.org displayed instead? Thanks for any advice.
I'm confused
You could change the site to .org and set up a domain forward or redirect at the registrar from .biz to .org (confirm that it is a 301, though; some registrars use a 302, which is destructive). There is some PageRank leakage with a 301 redirect, but it is minimal, and if this improves your brand, it may be worthwhile. One thing to avoid is domain masking, which will cause indexation issues.
Let's explain further
I made an index.html file which contains the two header statements:
<?php
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.baghdadbusinesscenter.biz/");
?>
When I call this index file from the .org host, naturally I get the .biz site from another host. What I want here is to hide the .biz so that people believe it is still coming from the .org site. Is that possible, and if so, how, technically? Thanks
You could just set the DNS to
You could just set the DNS to point to the same location but essentially you'd have 2 domains with the same content. This could result in search engine duplicate content filtration and poor search engine indexation since you could end up with some pages indexed on one domain and others on the other. It creates a mess. You could counteract that with cross-domain rel canonical.
I do not have access
I do not have access to the domain administration for the .org; otherwise, I could have done it. I only have access to the FTP for the .org hosting area. Thanks
I don't want pop-up windows
Hi J. Cohen, I want some help with an event calendar. Here is my problem: I made an event calendar, and when I go to see my events on it, a pop-up window comes up, but I don't want pop-up windows.
A few years late
Hi,
Here is a blog post I wrote about hiding /node with links to some modules. Better late than never :)
https://www.drupalaid.com/blog/3-things-you-should-hide