Follow, NoIndex for Specific Pages

AndieCZ's picture

Hi,

I am developing a new site and was curious whether there is a way to automatically mark pages as index/noindex in the page markup.

Currently, Google crawls all the pages on my site. The taxonomy for the navigation is finished, but there is no content under it yet. I'm worried this hurts my future ranking, as Google Panda/Penguin may treat many of these pages as "pages with no content".

Is there a way to set taxonomy pages as "NoIndex, Follow"?

Regards,

Lukas

Comments

I think you need to edit your robots.txt file

danomanion's picture

Hello,

You should be able to mark those directories as Disallow in the robots.txt file in your root directory.
This tells web crawlers to skip the listed directories.

So if I understand what you're asking correctly, something like this in your robots.txt file should help.

Disallow: /taxonomy/
Disallow: /taxonomy
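
For anyone who wants to check which paths those two rules would cover before deploying them, here is a minimal sketch using Python's standard urllib.robotparser. The "User-agent: *" line, the sample paths, and the helper name is_crawl_allowed are assumptions added for illustration (a Disallow rule has to sit under a User-agent group); they are not from the post above.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: the two Disallow rules from the post, placed under a
# catch-all User-agent group (the group line is an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /taxonomy/
Disallow: /taxonomy
"""


def is_crawl_allowed(path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a generic crawler may fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", path)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    for path in ("/taxonomy/term/1", "/taxonomy", "/node/1"):
        print(path, is_crawl_allowed(path))
```

Keep in mind that these rules only stop compliant crawlers from fetching the matched paths; they do not add a noindex instruction to the pages themselves.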

This link explains this in more detail:
http://palma-seo.com/content/robotstxt-nofollow-noindex-search-engine-be...

Here is a generator (Google Webmaster Tools has a better one, but you need to log in):
http://tools.seobook.com/robots-txt/generator/

On second read, maybe you're just asking about removing links

danomanion's picture

On second read, maybe you're just asking about removing links.

If you're just trying to remove the links from the menu, you might try using a theme override to pull the href off the links.

Or, maybe even better, just hide the menu from anonymous visitors until the links actually go somewhere (probably better UX anyway). Go to the "your domain.com/admin/structure/block" page, click configure for the menu, and use the roles tab to show it only to logged-in users.

Hope this helps

RobotsTxt module

Kristen Pol's picture

Note that you can use the RobotsTxt module to update the robots.txt file through the web interface if you prefer.

noindex using meta tags

DocMartin's picture

I've just tried this; see if it's any help, especially with the Google Panda update (yes, a belated response, I know!).

Using Nodewords (meta tags):
create custom pages, using a wildcard.

First, in the Nodewords settings, make sure that output of robots info is allowed (if there is any for a page).

Then create a custom page such as:
free-tags/*

For that custom page you can then choose options such as noindex.
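
To get a feel for which paths a pattern like free-tags/* would catch, here is a small sketch. It assumes the wildcard behaves like a shell-style "*" that matches any characters; how Nodewords actually matches paths is not shown in this thread, so treat this as an illustration only. The helper name matches_custom_page and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern taken from the post; the matching semantics are an assumption.
CUSTOM_PAGE_PATTERN = "free-tags/*"


def matches_custom_page(path: str, pattern: str = CUSTOM_PAGE_PATTERN) -> bool:
    """Return True if `path` matches the wildcard pattern (shell-style '*')."""
    return fnmatchcase(path, pattern)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the pattern.
    for path in ("free-tags/drupal", "free-tags/seo/tips", "free-tags", "node/1"):
        print(path, matches_custom_page(path))
```

Under this assumption the pattern covers everything below free-tags/ but not the bare free-tags path itself, so that page might need its own entry.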

I only tried this last night, so it's too early to see whether it has any value.

I'd thought that Disallow in robots.txt tells crawlers to keep out of directories, so it would stop a search engine from moving through a taxonomy page to the actual content pages you want indexed. It seems to me this might hinder SEO, as it could reduce the chances of crawlers arriving at important pages (and finding the pages that link to them).

@danomanion danger!

tomcatuk's picture

Your suggestion is dangerous. There's a huge difference between disallowing directories via robots.txt and adding a robots meta tag with "noindex, follow" to the page.
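
To make the distinction concrete: a robots.txt Disallow asks the crawler not to fetch the page at all, whereas the robots meta tag lives inside the page, so the crawler has to fetch the page to see it and can still follow the links it finds there. Below is a minimal sketch, using only Python's standard html.parser, of reading that tag out of a page. The sample markup, the class name RobotsMetaParser, and the helper robots_directives are hypothetical and only for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Split "noindex, follow" into individual lowercase directives.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical taxonomy page carrying the tag discussed in this thread.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noindex, follow" />
</head><body><a href="/node/1">Article</a></body></html>"""

if __name__ == "__main__":
    print(robots_directives(SAMPLE_PAGE))
```

If the same path were also disallowed in robots.txt, a compliant crawler would never fetch the page and so would never see the tag.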