Comments invited - new module to help with path-management and avoid broken links: http://drupal.org/project/pathfinder
Code will be uploaded very shortly.
Why not just use Pathauto plus redirect etc? - In a nutshell, simplicity: http://drupal.org/node/1783906
The module is available and working; only the configuration UI still needs to be completed. If you want to test it and are comfortable putting settings into settings.php, send me a message.
Initial version is now at http://drupal.org/project/pathfinder
This looks like a brilliant solution for non-ASCII urls, like when you use pathauto with Arabic node titles. If I can keep the short version of the url in the browser's address bar, and use the long url-encoded version for SEO, this module will be as important as views to me.
That's an application that hadn't occurred to me.
To do what you're suggesting (keep the short version in the browser's address bar), redirection to the "long" url would have to happen only for search engine spiders. That's feasible, but not something I've catered for.
To use a live example: http://wordsinside.com/rz - you will get redirected to: http://wordsinside.com/en/rz/lp/english-spanish-quality-professional-con... (if your browser is set for Spanish as its preferred language, you'll get redirected to the Spanish version). The redirection could be switched off for ordinary visitors so that the browser stays at http://wordsinside.com/rz - that would need some rules in the redirector and a list of client IPs or hostnames that should be redirected.
I'd be prepared to add this feature; it's an interesting idea and pretty trivial to code. Do you have a list of spiders, though? Feel free to PM me if you prefer.
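A rough sketch of the kind of rule the redirector would need, purely as illustration. It matches on the User-Agent header rather than on client IPs or hostnames, and the marker list and function names are hypothetical, not part of PathFinder:

```python
# Hypothetical, incomplete marker list; a real site would need a maintained one.
SPIDER_MARKERS = ("googlebot", "bingbot", "slurp")


def is_spider(user_agent: str, markers=SPIDER_MARKERS) -> bool:
    """Return True if the User-Agent string contains a known spider marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in markers)


def should_redirect_to_long_url(user_agent: str) -> bool:
    """Redirect only spiders to the long url; browsers stay on the short one."""
    return is_spider(user_agent)
```

User-Agent strings are easy to fake, so an IP or hostname check as mentioned above would be stricter; this only shows where the decision would sit.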
This module is proving useful to me in practice. Setting up a new site, we are using Pathauto to generate friendly urls based on node titles, content-type names, taxonomy terms, etc. As the site evolves, we've already changed things several times and edited quite a few node titles, taxonomy terms and other elements. The original Pathauto aliases would no longer work, but because we're using PathFinder we don't have to worry about that. The alternative (Pathauto + Redirect) would, by now, have automatically added a lot of aliases/redirects. That would work too, but I much prefer PathFinder as a solution because it seems much cleaner.
New demo page: http://wordsinside.com/tx
http://wordsinside.com/tx/anything-you-want-here will also work.
in your example, I am redirected to
If the node title is in Arabic, you can get an ugly and frightening url. Try creating a node with the following title using Pathauto (copy and paste):
حماة الديار عليكم سلام أبت أن تذل النفوس الكرام
Some browsers will automatically urldecode() the url, but if you paste or share the link you will see the encoded version.
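To show how much the url grows, here is a small sketch that percent-encodes the title above. The `alias_from_title` helper is hypothetical, and the space-to-hyphen step is only an assumption about what a Pathauto-style alias looks like:

```python
from urllib.parse import quote, unquote


def alias_from_title(title: str) -> str:
    """Build a percent-encoded alias (UTF-8) from a node title.

    Assumption: spaces become hyphens, roughly like a Pathauto-style alias.
    """
    return quote(title.strip().replace(" ", "-"))


title = "حماة الديار عليكم سلام أبت أن تذل النفوس الكرام"
encoded = alias_from_title(title)

print(encoded)                    # the form you get when pasting or sharing
print(len(title), len(encoded))   # the encoded alias is several times longer
print(unquote(encoded))           # what a browser may display after decoding
```

Each Arabic letter takes two bytes in UTF-8, and each byte becomes a three-character `%XX` escape, which is why the shared link looks so long.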
Here is what I need:
Number 2 can be done by using a special theme for spiders based on the user agent (themekey, context, ...). I think this is outside the scope of this node.
Number 5 is a direct result of number 4.
My problem is numbers 1, 3 and 4. How can we do that?
Here is the url of the google search for the Arabic phrase above
1. xmlsitemap uses urls like: /tx/some node title in some language
That would work out-of-the-box.
2. spider visits the page and keeps using the long url for it.
Should also work without changes.
3. user (human) clicks on that url and is automatically redirected to /tx
Here's the change - that's more or less the reverse of what PathFinder Redirect currently does. So it's feasible, but it would need a different version of the redirect module.
4. nodewords will declare /tx as the canonical uri or permalink
I'm not sure what would happen with the current PathFinder and nodewords - could you test it?
5. when shared on facebook/twitter or others, /tx will be the url detected.
So you mean that if a user posts/tweets the long url, it would be converted to the short /tx version? That would only work for fb's preview of the page:
Facebook shows the final target (if redirected) in its preview of the referenced page, but I don't think it changes the url actually posted as shown in the body of the post. That's under fb's control of course.
Twitter doesn't change what's posted even if the url is redirected (or 404 etc.) Test: http://t.co/A4XAUSE9 redirects to http://wordsinside.com/tx (done by twitter) but http://wordsinside.com/tx redirects to http://wordsinside.com/en/tx/drupal-site-uses-path-finder-help-path-alia... (done by PathFinder Redirect). So, no solution - Twitter is, understandably, using the url given, not the final target.
In summary, some of what you're asking is not feasible, as Facebook and Twitter won't support it - they will respect whatever url the user posted.
If the user pasted the long url, it is his problem. I wouldn't worry about that. What I need is to provide him with the short url in the address bar, and to make the short url canonical.
However, if the short url is shown in the address bar and used in internal links inside the site, where would the user get the long url from?
Oops, I forgot about this thread :)
Just posted beta2 with a bug-fix, now redirects via HTTP 301 instead of 302.
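For anyone unsure about the difference: 301 tells clients and search engines that the move is permanent, while 302 marks it as temporary. A minimal sketch of that choice (the `redirect_response` helper is hypothetical and is not PathFinder's actual code):

```python
from http import HTTPStatus


def redirect_response(target: str, permanent: bool = True):
    """Return a (status code, headers) pair describing a redirect to target."""
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return int(status), {"Location": target}
```

With `permanent=True` the status is 301, which is what beta2 now sends; `permanent=False` gives the old 302 behaviour.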