Paging query optimization in Drupal - from "?page=1" to "/page/1" ?

fiLi's picture

Hi,

I have a quick SEO question I couldn't find any information about.

The paging in Drupal of, say, the front page results in URLs like "node?page=1", even with Clean URLs enabled. With Global Redirect this is even shorter, with something like "/?page=1".

I have a feeling this could be optimized to "/page/1", but I wasn't able to find any reference to this subject anywhere (though I might be missing something).

I'd be happy to hear from you on this issue, and if it is indeed SEO relevant - how to implement it.

Fili

Comments

I don't see a SEO

sun's picture

I don't see a SEO optimization in converting ?page=1 to page/1. Do you want your site to be found for the term "page" ?

Daniel F. Kudwien
unleashed mind

Daniel F. Kudwien
netzstrategen

SE don't like query urls

fiLi's picture

From what I know - it is always best to avoid query urls.

-

<a href="http://www.filination.com/tech/" title="fiLi's tech - CMS, SEO">fiLi's tech</a>

True but not as true as it used to be

chadj@drupal.org's picture

Search engines used to avoid pages with attached query strings because they represented dynamic pages. That was a long time ago. Today most content is fed dynamically and search engines are just fine with it. Nevertheless, there are still a few good reasons for avoiding query URLs:

1) Query URLs (even "clean" URLs) are bad for humans. URLs should be convenient for humans, not computers. Dynamic sites can look up content using a useful title URL just as easily as an arcane query string. Although both these URLs are equally spiderable, which gives more information about the destination content?

http://free-backup.info/data-recovery-software.htm
http://groups.drupal.org/node/2348

2) URLs should be fixed, permanent, unchanging. Query strings can work fine for search engines nowadays -- but not if they change. Session ids and variables should be maintained with a cookie-based session, not by mangling URLs with query variables. Otherwise search engines get easily confused and resort to using duplicate content algorithms (and even "title" tag comparisons) to eliminate tens of thousands of "ghost" copies of the same page. A lot of innocent content pages are accidentally eliminated in this messy process.

3) URLs without keywords miss a grand opportunity to get lots of free keyword text links. This is probably the real reason that keyword-rich URLs place well in Google. Since so many content systems turn URLs into links using the URL as the link text, a URL with keywords becomes a free keyword-bearing text link. In case you're new to SEO, keyword text links remain the undisputed king of all SEO methods. That's not going to change soon. This is what Daniel was referring to when he said "Do you want your site to be found for the term 'page'?"

ChadJ


Keyword Marketing Ladders
Keyword List Builder

Not really

TrinitySEM's picture

Some smaller SEs still have issues with query strings in URLs but the big SEs don't seem to. Google seems to prefer two or fewer query parameters per URL. If you fall within that you should be fine. I have actually seen some gnarly, nasty URLs rank well. But, given the choice, it is always best to deliver static URLs which include keywords and are easier for humans to use.

With the exception of quality links and content, SEO is a battle of inches. Having good URLs won't necessarily put you to the top of the SERPs. But, that combined with a number of other elements that you took the same care with can help a lot.

Not a very clean url

chadj@drupal.org's picture

Hi Fili,

With clean URLs enabled, you should get the /page/1 style URL by default. See Clean URLs.

Also, enable the Path module so you can rename your URLs to be keyword-friendly -- which is good for both human and search visitors like so:

keyword-marketing-ladder.html

Finally, use PathAuto module to make this behaviour automatic when creating nodes.

You seem to be new to Drupal (welcome aboard!) so you might enjoy this Drupal SEO Checklist which Michael Curry posted a while back:

http://groups.drupal.org/node/2348

ChadJ


SEO Checklist
Free Website Monitor

Yes but...

TrinitySEM's picture

Enabling clean URLs and Path will permit you to generate beautiful URLs in any manner you wish. For example:
Entering in the path field of a content page: keyword1/keyword2/page-name.html
Will yield a URL that looks like:
http://www.yoursite.com/keyword1/keyword2/page-name.html

It doesn't get any better than that.

The problem is that I can't seem to get the menu items to do the same thing. Unless you enter the word "node" in the path, the page results in a 404 error. If you were creating a parent or child menu item you would have to create a path like:
node/keyword1
resulting in a URL like:
http://www.yoursite.com/node/keyword1

It is odd that you would be able to create perfect URLs for content but not for menu items. I have seen some sites on drupalsites that have achieved good directory URLs like the content example above. I just don't know how to do it.

Views paging URLs

fiLi's picture

:)

Thanks for your kind beginner explanation and the warm welcome, but I might have been a bit unclear. Language barriers, I'm afraid.

After enabling everything, including Clean URLs, Path, Pathauto and a world of other modules, the paging URLs in Drupal are still query based. Create a view that needs paging, go to page 2, and the URL for that view will be "view?page=1" instead of the much cleaner "view/page/1" (which is easily customizable in other CMSs, such as WordPress).

The query based URL for views paging is not SEO friendly.
I might still be missing something, and I would be happy if there was something easy I overlooked.

Thanks again,
Fili

-
<a href="http://www.filination.com/tech/" title="fiLi's tech - CMS, SEO">fiLi's tech</a>


mod_rewrite

TrinitySEM's picture

Are you using Apache on Linux? If so, is mod_rewrite active and functioning? I suspect that clean URLs won't work without it.

page query strings

Z2222's picture

Even if you have "Clean URLs" enabled on Drupal it won't rewrite the paginated URLs. You have to do it manually with .htaccess.

Clean URLs are better for search engines as well as users. Google has essentially said this in a roundabout way (Matt Cutts). Yahoo and MSN have explicitly said it.

You have to do it manually

ursus@drupal.ru-gdo's picture

"You have to do it manually with .htaccess."
Hm... Can you tell me exactly what I have to add to my .htaccess file? :)

Ok, let me share my view on

cflorin's picture

Ok, let me share my view on this point: although it would be nice to have clean URLs for the pagers, they are not that important. These pages serve only one purpose - to get the content indexed. Like someone said earlier bots don't mess up anymore when it comes to indexing URLs with variables (it's not like they're session ids) so it isn't a problem.

Also, about the fact that query URLs aren't friendly to humans: that is true, but www.example.com/about-drupal says something to the user while www.example.com/page-1 doesn't help anybody understand what that page is about (plus, the content is always changing...).

In conclusion: if there was an easy option to turn on clean URLs for the pagers I would do it, but I won't bother if it takes more than 5 minutes.

Clean URLs values

firept's picture

Hi,

I do not agree with cflorin that it's worth only 5 minutes. On the Internet, it's very important to use clean URLs and to choose them very carefully. Surely it's worth more than 5 minutes.

Pedro
http://www.adclick.pt

This is an old post, but I

chrisshattuck's picture

Updated 3/15: Previous version will add SEO-pagination to everything, resulting in a wrong path in everything but views.

This is an old post, but I keep coming back to it looking for a solution, so I thought I should post something that works:

Step 1: .htaccess

In your .htaccess file, right after the line "RewriteEngine on", add the following lines:

  #Customization for pagination
  RewriteRule (.*)-paged/page-([0-9]*).html $1-paged.html?page=$2

Step 2: template.php

In your template.php file, insert the following:

/**
 * Theme override for pagination links (originally from pager.inc)
 *
 * The purpose of this is to provide SEO-friendly links for views pagination. You
 * set a page version for the view, and add a .html extension to the URL. This will then
 * create pagination in the following way:
 *
 * For test-view.html, pages will be formatted in the following way test-view/page-2.html.
 */
function phptemplate_pager_link($text, $page_new, $element, $parameters = array(), $attributes = array()) {
 
  // Start edit - only use special pagination on pages that end with
  // '-paged.html'
  if (!strpos($_GET['q'],'-paged.html')) {
    $page = isset($_GET['page']) ? $_GET['page'] : '';
    if ($new_page = implode(',', pager_load_array($page_new[$element], $element, explode(',', $page)))) {
      $parameters['page'] = $new_page;
    }
  } else {
    $new_page = implode(',', pager_load_array($page_new[$element], $element, explode(',', $page)));
  // End edit
  }

  $query = array();
  if (count($parameters)) {
    $query[] = drupal_query_string_encode($parameters, array());
  }
  $querystring = pager_get_querystring();
  if ($querystring != '') {
    $query[] = $querystring;
  }

  // Set each pager link title
  if (!isset($attributes['title'])) {
    static $titles = NULL;
    if (!isset($titles)) {
      $titles = array(
        t('« first') => t('Go to first page'),
        t('‹ previous') => t('Go to previous page'),
        t('next ›') => t('Go to next page'),
        t('last »') => t('Go to last page'),
      );
    }
    if (isset($titles[$text])) {
      $attributes['title'] = $titles[$text];
    }
    else if (is_numeric($text)) {
      $attributes['title'] = t('Go to page @number', array('@number' => $text));
    }
  }

  $q = $_GET['q'];
 
  // Start edit
  if (strpos($_GET['q'],'-paged.html')) {
    $q = preg_replace('/(.*?).html(.*)/','$1',$q);
    $q = preg_replace('/(.*)\/page-(.*)/','$1',$q);
    if ($new_page) {
      $q = $q . '/page-' . $new_page . '.html';
    } else {
      $q .= '.html';
    }
  } 
  // End edit

  return l($text, $q, $attributes, count($query) ? implode('&', $query) : NULL);
}

This will enable SE-friendly pagination on any page ending in '-paged.html' (the .html extension is arguably good for search engine indexing), but you should be able to adjust it to use no extensions, if you would prefer.

This hasn't been tested much, particularly not in a situation where there are multiple pagers on a page, so please consider this code a starting point rather than a finished product.
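
The path handling in the second "Start edit" block can at least be exercised on its own, outside Drupal. What follows is only a sketch: paged_link_path() is a made-up helper name, it takes the path and the new page value as plain arguments instead of reading $_GET['q'], and it escapes the dot in ".html", which the snippet above does not.

```php
<?php
// Hypothetical standalone helper mirroring the second "Start edit" block
// of phptemplate_pager_link() above. It is not part of Drupal.
function paged_link_path($q, $new_page) {
  // Drop the ".html" extension and anything after it.
  $q = preg_replace('/(.*?)\.html(.*)/', '$1', $q);
  // Drop an existing "/page-N" suffix, if there is one.
  $q = preg_replace('/(.*)\/page-(.*)/', '$1', $q);
  // Page '0' (the first page) is falsy in PHP, so it gets the plain path.
  if ($new_page) {
    return $q . '/page-' . $new_page . '.html';
  }
  return $q . '.html';
}

echo paged_link_path('test-view-paged.html', '2'), "\n";        // test-view-paged/page-2.html
echo paged_link_path('test-view-paged/page-2.html', '0'), "\n"; // test-view-paged.html
```

Because '0' is falsy, a link back to the first page comes out as the bare '-paged.html' path rather than 'page-0.html'.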

Enjoy!
Chris

Chris Shattuck
Learn Drupal with over 1700 Drupal video tutorials

I keep on coming back here as well...

mercmobily's picture

Hi,

I keep on coming back here myself.
Thank you so much for your hint. Even though I work with Drupal a lot (I am the developer of Drigg), I still struggle understanding how the pager actually works "internally".

However, I am even surprised that your solution works (but it does!), since you seem to call:

$new_page = implode(',', pager_load_array($page_new[$element], $element, explode(',', $page)));

Even though $page is not set at all!

I much prefer a solution that works on anything -- any page with a pager should work.
With a bit of "blind hacking", I ended up writing something like this.

In httpd.conf (or .htaccess):

    # CRUCIAL to fix the pager!!!
    RewriteRule (.)/page-(.)$ $1&page=$2

And then the themer function:

/**
 * Theme override for pagination links (originally from pager.inc)
 */
function phptemplate_pager_link($text, $page_new, $element, $parameters = array(), $attributes = array()) {
/*
$page = isset($_GET['page']) ? $_GET['page'] : '';
  if ($new_page = implode(',', pager_load_array($page_new[$element], $element, explode(',', $page)))) {
    $parameters['page'] = $new_page;
  } */

  // ADDED BY MERC
  $new_page = implode(',', pager_load_array($page_new[$element], $element, explode(',', $page)));

  $query = array();
  if (count($parameters)) {
    $query[] = drupal_query_string_encode($parameters, array());
  }
  $querystring = pager_get_querystring();
  if ($querystring != '') {
    $query[] = $querystring;
  }

  // Set each pager link title
  if (!isset($attributes['title'])) {
    static $titles = NULL;
    if (!isset($titles)) {
      $titles = array(
        t('« first') => t('Go to first page'),
        t('‹ previous') => t('Go to previous page'),
        t('next ›') => t('Go to next page'),
        t('last »') => t('Go to last page'),
      );
    }
    if (isset($titles[$text])) {
      $attributes['title'] = $titles[$text];
    }
    else if (is_numeric($text)) {
      $attributes['title'] = t('Go to page @number', array('@number' => $text));
    }
  }

  // ADDED BY MERC
  $q = $_GET['q'];
  $q = preg_replace('/(.*)\/page-(.*)/','$1',$q);
  if ($new_page) {
    $q = $q . '/page-' . $new_page ;
  }

  return l($text, $q, $attributes, count($query) ? implode('&', $query) : NULL);
}

This actually seems to work. I actually rolled it out for a second on Free Software Magazine, and it seemed to do everything absolutely fine.
And yes, I am MYSELF passing that $page without ever initialising it!!!

So... questions:

1) Does this actually work? Can you people test it?

2) I am dividing up articles with the "pager" module. Is there any way for the link to come out as page-1 rather than page-0,1 which looks a little ugly?

3) The next step is to submit EACH page to sitemap... that will be fun.

Once I actually understand what is going on (my IQ is a little limited), I will publish a short tutorial in Free Software Magazine (this is buried a little deep I think!)

Help :-D

Bye,

Merc.

Almost works

eikes's picture

only the rewrite rule should read as follows:

RewriteRule (.*)/page-(.)$ $1&page=$2

you forgot the little star... Now it works fine (Tested in D6 should work in D5 too, because the themeable function didn't change AFAIK)

Actually You forgot 2 stars...

drupalarchitect's picture

Actually you forgot 2 stars, this won't work for double digit or greater pages.

RewriteRule (.*)/page-(.*)$ $1&page=$2
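
To see why the stars matter, the three patterns from this thread can be compared outside Apache. This is a minimal sketch, assuming PHP's preg_match() treats these simple patterns the same way mod_rewrite does; rewrite_pager_path() is a made-up helper, not a Drupal or Apache function.

```php
<?php
// Hypothetical helper: mimics what "RewriteRule PATTERN $1&page=$2" does
// to a request path. Returns the rewritten path, or NULL when the pattern
// does not match (Apache would then leave the path untouched).
function rewrite_pager_path($pattern, $path) {
  if (preg_match('#' . $pattern . '#', $path, $m)) {
    // The substitution replaces the whole path, not just the matched part.
    return $m[1] . '&page=' . $m[2];
  }
  return NULL;
}

// No stars: only the last character before "/page-" is captured.
var_dump(rewrite_pager_path('(.)/page-(.)$', 'blog/page-2'));    // string "g&page=2"
// One star: works for single-digit pages only.
var_dump(rewrite_pager_path('(.*)/page-(.)$', 'blog/page-2'));   // string "blog&page=2"
var_dump(rewrite_pager_path('(.*)/page-(.)$', 'blog/page-12'));  // NULL
// Two stars: works for any page number.
var_dump(rewrite_pager_path('(.*)/page-(.*)$', 'blog/page-12')); // string "blog&page=12"
```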

- Answers to Drupal Questions at DrupalArchitect.com

Some Small modifications for those using PathAuto

drupalarchitect's picture

I noticed that this code will deliver paging passing the node/<nid>/page-<page number> of the current page. If you are using URL Aliases with Drupal 6, it would be nicer to have the (clean url) path for your pages. This becomes even more evident if you are using pager_query() function, or going to taxonomy term landing pages that you have aliased. Just some minor tweaks at the bottom of the function will give you this....

function phptemplate_pager_link($text, $page_new, $element, $parameters = array(), $attributes = array()) {
/*
$page = isset($_GET['page']) ? $_GET['page'] : '';
  if ($new_page = implode(',', pager_load_array($page_new[$element], $element, explode(',', $page)))) {
    $parameters['page'] = $new_page;
  }
*/

  // ADDED BY MERC
  $new_page = implode(',', pager_load_array($page_new[$element], $element, explode(',', $page)));

  $query = array();
  if (count($parameters)) {
    $query[] = drupal_query_string_encode($parameters, array());
  }
  $querystring = pager_get_querystring();
  if ($querystring != '') {
    $query[] = $querystring;
  }

  // Set each pager link title
  if (!isset($attributes['title'])) {
    static $titles = NULL;
    if (!isset($titles)) {
      $titles = array(
        t('« first') => t('Go to first page'),
        t('‹ previous') => t('Go to previous page'),
        t('next ›') => t('Go to next page'),
        t('last »') => t('Go to last page'),
      );
    }
    if (isset($titles[$text])) {
      $attributes['title'] = $titles[$text];
    }
    else if (is_numeric($text)) {
      $attributes['title'] = t('Go to page @number', array('@number' => $text));
    }
  }

  // ADDED BY MERC
  $q = $_GET['q'];
  $q = preg_replace('/(.*)\/page-(.*)/','$1',$q);

/* this is where you pull the alias path instead */

  $q=drupal_get_path_alias($q);

/* this should be a word relevant to your site content
    it will replace the word "node" in the pager urls */

  $homepage_replace_node = "blog";
  if($q=="node")
    $q=$homepage_replace_node;

  if ($new_page) {
    $q = $q . '/page-' . $new_page ;
  }
  if($q==$homepage_replace_node)
   return "<a href=\"/\">".$text."</a>";
  return l($text, $q, $attributes, count($query) ? implode('&', $query) : NULL);
}

And then you need to add these 2 lines to your .htaccess file (the second line is already mentioned in previous posts, the first line is new). Make sure to replace the word "blog" in the first line with the word you chose to use in the code snippet above.

# make sure to replace "blog" with the value you are using for $homepage_replace_node
  RewriteRule blog/page-(.*)$ node&page=$1
  RewriteRule (.*)/page-(.*)$ $1&page=$2

not working with panel

j2r's picture

I implemented this solution and it is working fine.
But when I try to load a view in a panel and pass the argument from the panel, it does not work :(

The URL is like "www.example.com/panelurl/argument1/page-1".
Here the URL for the panel is panelurl
and the argument I need in the panel is argument1,

but the view is getting both as arguments.

Please give me a solution.

Is this D5?

malc0mn's picture

Because it seems to me that the l() function used here is the D5 syntax, D6 would be:

<?php
 
return l($text, $q, array('attributes' => $attributes, 'query' => count($query) ? implode('&', $query) : NULL));
?>

In fact,,

mercmobily's picture

Hi,

In fact, I think this should eventually be a standard feature of Drupal's pager: the ability to show itself in the query string, or at the end of the URL (that is, Drupal should be able to "get it" without the ModRewrite hack...)

Merc,

I'd also like to see clean

chrisshattuck's picture

I'd also like to see clean pagination become standard, but after posting about it in IRC, it sounds like people have good reasons not to use it, though I'm not clear yet on what those reasons are.

I've put together a module that will allow you to use clean pagination without hacking .htaccess (as I did in my example above). I've only tested it on a couple of sites, but it's a much simpler solution that is more flexible. There's also an option to make the links a bit more search engine friendly:

http://drupal.org/project/cleanpager

Chris

Chris Shattuck
Learn Drupal with over 1700 Drupal video tutorials

Thanks

manish@hrn.in's picture

Thanks working fine for me.

I am trying the above solution but it is not working

moneesh.koundal's picture

The site where I am using this feature has a trailing / after the URL. Is that the reason it is not working there? All the pagination URLs are changed fine, but when I click on them I get "page not found". For example, pagination works fine for the URL download/?page=1, but when I change .htaccess and the theme function for pagination, it becomes download/page-1 and shows "page not found". Kindly help with this.

Few things more

moneesh.koundal's picture

I am using the pathauto module for URL aliasing and also the global redirect module. Are these modules conflicting with the clean pagination URLs? When I disable these modules and use the source URL node/nid, the pagination links start working, but they still do not show the correct page.

moneesh.koundal's picture

i need clean pagination on the node page

reminder

j2r's picture

hi

This issue was discussed in 2009, but even after reading all the comments I did not find any generic solution....

so can anyone help me out ....
Thanks in advance... :)

How generic?

Improvement

j2r's picture

The code given by drupalarchitect works fine, but it disables Views AJAX.
To keep the AJAX pager working with this code, you need to add a condition to the phptemplate_pager_link code.

I am using AJAX on the front page only, so my condition is:

if (drupal_is_front_page()) {
  return theme_pager_link($text, $page_new, $element, $parameters, $attributes);
}
else {
  // drupalarchitect's code goes here
}

In the same way, you can restrict either the AJAX pager or the clean pager code to specific URLs.

Another issue with pager view
