Drupal and ? variable

Artist_B's picture

I noticed that some URL variables in Drupal could cause bots to treat pages as duplicate content. For example: domain.ext/?randomtext.

I checked Google today and found domain.ext?text indexed and flagged as duplicate content.

I added this to robots.txt:

Disallow: /?text

Do you think that's enough to get it removed from Google's results?

Thanks.

Comments

.

Z2222's picture

.

Who can tell me why first

Artist_B's picture

Who can tell me why the first page of the pagination links to domain.ext/node and not to domain.ext? Won't this be treated as a duplicate too?

.

Z2222's picture

.

Duplicate

Artist_B's picture

This wouldn't work for me. I am using Clean URLs without URL aliases, so that code would block all the content pages.

Is there any way to change just that in the pagination?

pagination

Z2222's picture

I'm not sure how to change that in the pagination, but you could alternatively block those URLs like this:

Disallow: /node$
Disallow: /node?page=

Google and Yahoo both support the end of string character ($). You can verify that it is working with the Google robots.txt validation tool in the Google Webmaster Tools.
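
To see what these two rules would and would not match, here is a minimal sketch of Google-style pattern matching, where * matches any run of characters and a trailing $ anchors the end of the URL. The function names are made up for illustration, and the sketch ignores Allow rules and rule precedence, so treat it as a rough check rather than a replacement for the validation tool. (Python's standard urllib.robotparser does not handle these wildcards, which is why the matching is written by hand.)

```python
import re
from typing import Iterable


def robots_pattern_to_regex(pattern: str):
    """Translate a Google-style robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the end of
    the URL. Everything else is literal, matched from the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_patterns: Iterable[str]) -> bool:
    """Return True if any Disallow pattern matches the path (plus query)."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)


rules = ["/node$", "/node?page="]
for path in ["/", "/node", "/node/1", "/node?page=2"]:
    print(path, "blocked" if is_blocked(path, rules) else "allowed")
```

Under these assumptions, /node and /node?page=2 come out blocked, while / and /node/1 stay allowed, because the $ stops the first rule from matching anything longer than /node.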

Block

Artist_B's picture

Won't that stop /node/1, for example, from being indexed?
And won't blocking the pagination stop the full site from being indexed? Would I need a sitemap to get all the nodes with content indexed?

.

Z2222's picture

.

Hey there, I have several

patchak's picture

Hey there, I have several views on my site that have pagers. They have been indexed by Google, and it's causing a lot of problems on my site.

I want to know if it's possible to block all the pager 'pages' at the same time with one instruction like:

Disallow: ?page=

or whether I need to add an instruction for every view that has a pager on it, like:

Disallow: /home?page=
Disallow: /recent?page=
Disallow: /popular?page=

Also, would these instructions block all the content under 'page='?

Isn't there a danger of blocking the original first page? I don't want to block '/home', '/recent' or '/popular'.

Thanks for any advice!!

P.S. Another question: is it possible to add a custom description to views pages?

Patchak

blocking pagers

Z2222's picture

What problems are your pagers causing? If you block all pagers on a site, search engines won't be able to find your old content.

You can block all your pagers from Google with:

Disallow: /*page=

but I highly recommend not doing that because it's bad for SEO unless you have another way for search engines to find your old content.
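
As a rough illustration of why, here is a small sketch of a crawler that obeys that rule on a made-up site. The link graph is entirely hypothetical: older nodes are linked only from pager pages, and there are no taxonomy pages or sitemap offering another route. The check "page=" in path stands in for the /*page= pattern.

```python
from collections import deque

# Hypothetical site: each URL maps to the URLs it links to.
SITE = {
    "/": ["/node/9", "/node/8", "/node?page=1"],
    "/node?page=1": ["/node/7", "/node/6", "/node?page=2"],
    "/node?page=2": ["/node/5", "/node/4"],
}


def blocked_by_page_rule(path):
    # "Disallow: /*page=" matches any path that contains "page=".
    return "page=" in path


def crawl(start, links, is_blocked):
    """Breadth-first crawl that never fetches a blocked URL."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target in seen or is_blocked(target):
                continue
            seen.add(target)
            queue.append(target)
    return seen


everything = crawl("/", SITE, lambda path: False)
with_rule = crawl("/", SITE, blocked_by_page_rule)

# Content nodes that can only be discovered through pager pages.
unreachable_nodes = sorted(
    url for url in everything - with_rule if url.startswith("/node/")
)
print(unreachable_nodes)  # ['/node/4', '/node/5', '/node/6', '/node/7']
```

On this toy site only the two nodes linked from the front page stay discoverable. Any other path to the older nodes (taxonomy listings, a sitemap, an "all content" list) would change the result.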

Hey there J,Cohen, thanks

patchak's picture

Hey there J,Cohen, thanks for the answer. The pagers are causing problems cause I have hundreds of pages on Google with the exact same title and description. I understand that removing pagers is not good for search engines, unless you can find other ways to show your old content.

The only way I see is to create a list of all content in your views... for example a tabbed view called 'all content' where you would show all your content titles in a simple list.

This would be fine for maybe 100 node titles. Beyond that it would be useless for users, though it would still allow the SEs to crawl your site...

I wonder if there are other alternatives to this... maybe some dynamic pagers, combined with a 'quick list' like the one I just mentioned...

There is also the page title module, but the integration for views and the pages is not there yet afaik.

Patchak

Drupal pagers getting indexed

Z2222's picture

The pagers are causing problems cause I have hundreds of pages on Google with the exact same title and description.

That is not an ideal situation, but it's not critical. I think that it's worse to block your pagers.

If all of your posts are tagged with taxonomy terms then you could block your front page pager (I think the code is: Disallow: /node?page=), and that would reduce some of them.

RE: meta description
I think it's better to have no meta description than to have many pages with the same meta description.

So you think I should at

patchak's picture

So you think I should at least leave the taxonomy pagers? It's not a bad idea. I'm thinking about using panels and adding a tab like 'last 250 nodes' on each page and just removing the pagers on the site...

.

Z2222's picture

.

similar question

mlncn's picture

I want to de-emphasize listing pages. I'm not sure I don't want them indexed, and I certainly want them spidered so everything else will be indexed, but I'm tired of our own site coming up in search results when it isn't relevant, simply because so many terms appear together on a big listing page (such as a taxonomy term page).

benjamin, Agaric Design Collective

benjamin, agaric

indexing

Z2222's picture

You want your site coming up less often in search results?

You can de-emphasize the relative importance of certain pages on a site with XML sitemaps. The <priority> element tells search engines the relative importance of a page compared to other pages -- though search engines just take it as a suggestion.
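
For illustration only, here is a minimal sketch that builds such a sitemap by hand, giving a node page a higher priority than a taxonomy listing. The URLs and priority values are made up, and this is not how any Drupal sitemap module works. It only shows the shape of the file defined by the sitemaps.org protocol, where priority ranges from 0.0 to 1.0 and defaults to 0.5.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (absolute_url, priority) pairs.

    priority is a float between 0.0 and 1.0, relative to the site's other pages.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical entries: a content node ranked above a listing page.
sitemap_xml = build_sitemap([
    ("http://domain.ext/node/1", 0.8),
    ("http://domain.ext/taxonomy/term/3", 0.2),
])
print(sitemap_xml)
```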

You could also put a robots "noindex" meta tag on the teaser views, but I'm not sure if that would have other detrimental effects. (They would at least be spidered but not indexed.)
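
A rough sketch of that idea: pick the robots meta tag by path, so listing pages get "noindex, follow" (links are still followed, the page itself is not indexed) and everything else stays indexable. The path lists and function names here are assumptions made up for the example. In a real Drupal site the tag would have to be emitted by the theme or a module.

```python
# Hypothetical listing pages; adjust to whatever the site treats as teaser lists.
LISTING_PATHS = ("/node",)
LISTING_PREFIXES = ("/taxonomy/term/",)


def is_listing(path):
    """Return True for paths assumed to be teaser/listing pages."""
    return path in LISTING_PATHS or path.startswith(LISTING_PREFIXES)


def robots_meta(path):
    """Return the robots meta tag to place in the page's <head>."""
    content = "noindex, follow" if is_listing(path) else "index, follow"
    return '<meta name="robots" content="%s">' % content


print(robots_meta("/taxonomy/term/3"))  # <meta name="robots" content="noindex, follow">
print(robots_meta("/node/1"))           # <meta name="robots" content="index, follow">
```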

If you have optimized title elements on your node pages then they will generally get a lot more search engine traffic than lists of nodes.

Can priority apply to listing pages?

mlncn's picture

This discussion is turning out to be somewhat related: http://groups.drupal.org/node/8571

benjamin, Agaric Design Collective

benjamin, agaric

priority

Z2222's picture

I'm not sure how the XML Sitemaps module works in Drupal. I stopped making sitemaps a while ago. The last time I checked, the module had some serious bugs.

global redirect + clean pagination to the rescue?

greggles's picture

I think there are two issues, right?

One is the situation where the pagination "first page" links go to "/node" while the real first page should just be "/". Global redirect solves that problem, IMO.

Next is the duplicate content/confusion for search engines caused by having the same page title on each page in a pager. I think that http://drupal.org/project/cleanpager will fix this problem. Clean Pagination does seem to have one problem: "search-engine-friendly pagination hyperlinks is an experimental feature", which is against the Google webmaster guidelines. I guess that's why it's an optional feature ;)

--
Open Prediction Markets | Drupal Dashboard

Hey there, I'm not sure

patchak's picture

Hey there, I'm not sure this would help, as the problem is not so much the URL but the page title, no? Google crawls all the pages of my pager and they all have the same title. I think that's more the problem, or maybe there is something I did not understand? Will that module create a unique title for all pages of the pager?

Thanks for pointing it out, tho.

Patchak

Search Engine Optimization (SEO)
