Posted by FlemmingLeer on September 22, 2009 at 10:46am
I just discovered an unfortunate behavior in Drupal 5.x (Drupal 5.20) that creates duplicate content in Google. Take this URL:
http://www.example.com/?q=Drupal
where Drupal is a URL alias.
http://www.example.com/Drupal
and
http://www.example.com/?q=Drupal
are of course the same page, but Google catches both and indexes them.
Adding the line
Disallow: /?q=
to robots.txt will block these duplicate URLs.
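A rough way to sanity-check that rule before deploying it is to mimic the simple prefix matching that robots.txt rules apply to the path plus query string. This is only a sketch: the helper name `is_disallowed` is made up for illustration, and real crawlers may apply extra rules (wildcards, Allow lines) that are not modeled here.

```python
from urllib.parse import urlsplit


def is_disallowed(url, disallow_prefixes):
    """Return True if the URL's path plus query starts with any Disallow prefix.

    Hypothetical helper: plain prefix matching only, no wildcards or Allow lines.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # An empty Disallow value blocks nothing, so skip empty prefixes.
    return any(target.startswith(prefix) for prefix in disallow_prefixes if prefix)


rules = ["/?q="]
print(is_disallowed("http://www.example.com/?q=Drupal", rules))  # the q-parameter URL
print(is_disallowed("http://www.example.com/Drupal", rules))     # the clean URL
```

Note that under plain prefix matching this rule would not cover a URL such as /index.php?q=Drupal, so that form may need its own Disallow line if it shows up in the index.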
Comments
That's normal
The q parameter is the normal URL. The other is an alias created by turning on clean URLs. Unless you link to the q version, Google won't know about it. You can also fix this using the Global Redirect module.
Michelle
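To illustrate the idea of redirecting the q version to the clean version, here is a minimal sketch of how the redirect target could be computed. It is not the Global Redirect module's actual code: the function name `clean_url_target` is hypothetical, and it assumes the q value already holds the alias (it does no alias lookup).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def clean_url_target(url):
    """Return the clean URL to redirect (301) to, or None if no redirect applies.

    Hypothetical helper: assumes the q value is already the alias path.
    """
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    q_values = [value for key, value in params if key == "q"]
    # Only rewrite requests to the front controller that carry a q parameter.
    if parts.path not in ("", "/", "/index.php") or not q_values:
        return None
    # Keep any other query parameters (for example, pager arguments).
    rest = [(key, value) for key, value in params if key != "q"]
    path = "/" + q_values[0].lstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(rest), ""))


print(clean_url_target("http://www.example.com/?q=Drupal"))
print(clean_url_target("http://www.example.com/Drupal"))
```

With such a redirect in place, both forms resolve to a single URL, so only the clean one should end up indexed.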
I have used clean URLs since 2005
Hi Michelle,
On that particular site I have been using clean URLs since its launch in 2005, and yet Google still picked up these q parameter URLs, both for the most recent node URLs and for taxonomy clean URLs created after 2005.
I use the Global Redirect module as well.
So I don't know where Google got the q parameter URLs from. :/
Even a turtle reaches its goal...
Also the same behavior in Drupal 6.x
The same behavior also occurs in Drupal 6.20.
Even a turtle reaches its goal...
globalredirect
Surely you've seen Global Redirect, which fixes this problem.
knaddison blog | Morris Animal Foundation
No, it did not
Hi Greggles,
No, it did not.
I have been using Global Redirect for a long time now and am currently on Global Redirect 5.x-1.5 with the Non-clean to Clean option turned on.
I just enabled Remove Trailing Zero Argument in Global Redirect, and now the ?q argument URL redirects correctly to the URL alias.
I will report it as a bug.
Even a turtle reaches its goal...
You are linking to these
You are linking to these URLs, otherwise Google wouldn't pick them up. I'd use Xenu's Link Sleuth to find where these links come from and then change them. Using Global Redirect is good too, but avoiding the wrong links and the 301s altogether is better.
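If a link checker is not at hand, the same hunt can be roughed out with a short script that scans a page's HTML for links carrying a q parameter. This is only a sketch using Python's standard library; the names `QLinkFinder` and `find_q_links` are made up for illustration, and fetching the pages to scan is left out.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit


class QLinkFinder(HTMLParser):
    """Collect href values of <a> tags whose query string has a q parameter."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and "q" in parse_qs(urlsplit(href).query):
            self.found.append(href)


def find_q_links(html):
    """Return all link targets in the given HTML that use ?q=..."""
    finder = QLinkFinder()
    finder.feed(html)
    return finder.found


sample = '<a href="/?q=Drupal">old</a> <a href="/Drupal">clean</a>'
print(find_q_links(sample))
```

Running it over the rendered pages of the site (theme templates, blocks, hand-written links in node bodies) should show which ones still emit the q form.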