Listing all URLs for a Drupal site - for 301 redirect from old site

Hello,

I am using Xenu's Link Sleuth (PageRank 7) to generate a list of URLs for a Drupal site (and also to find all the broken links). I am migrating a site to Drupal and need to map the old URLs to the new URLs with 301 redirects so that the SEO ranking value carries over to the new URLs. Xenu's Link Sleuth has always been helpful, but I was really surprised to see hundreds upon hundreds of URLs that are dynamically generated by Drupal.

Is there any way to get a shorter list of URLs for 301 redirect purposes? I know I could use the URL aliases to list all the aliased URLs, but that would not catch the non-aliased paths.

All the best,
Guy Saban

Sitemaps

J. Cohen - Fri, 2009-06-05 00:34

I use this tool:
http://www.auditmypc.com/free-sitemap-generator.asp

It obeys robots.txt if you want it to, and exports to a spreadsheet. You can then select the column of URLs, paste it into a text editor like Vim or Notepad++, and run macros on it to format the URLs how you want them.
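If you would rather not record editor macros, the same kind of formatting can be roughed out on the command line. This is only a sketch: the file names urls_raw.txt and old_paths.txt and the example.com URLs are made up for illustration, and it assumes all you want is to strip the scheme and host so that only the paths are left for the redirect map.

```shell
# Hypothetical sample standing in for the URL column copied from the spreadsheet.
cat > urls_raw.txt <<'EOF'
http://www.example.com/about-us.html
http://www.example.com/products/widgets.html
http://www.example.com/about-us.html
EOF

# Strip the scheme and host so only the path remains, then sort and de-duplicate.
sed -E 's#^https?://[^/]+##' urls_raw.txt | sort -u > old_paths.txt

cat old_paths.txt
```

Using # as the sed delimiter avoids having to escape every slash in the URL.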

EDIT: If you want to get rid of certain types of URLs, run the text file of URLs through grep -v, which prints only the lines that do not match the pattern. See these tutorials for some more info on grep:
http://tips.webdesign10.com/a-grep-tutorial
http://tips.webdesign10.com/extracting-search-engine-hits-from-log-files

Example - to get rid of all URLs with ampersands in them you should be able to do something like this on the list of URLs:
grep -v '&' url_list.txt >new_url_list.txt
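The same idea stretches to several patterns at once. The sketch below is untested against a real crawl: the sample URLs are invented, and the patterns (any query string, plus /user/login) are only guesses at what the dynamically generated Drupal URLs might look like, so adjust them to whatever actually shows up in your list.

```shell
# Hypothetical sample list; in practice this is the file exported from the crawler.
cat > url_list.txt <<'EOF'
http://example.com/about
http://example.com/node/12?page=1
http://example.com/blog?sort=asc&order=title
http://example.com/user/login?destination=node/12
http://example.com/contact
http://example.com/about
EOF

# Drop every URL that carries a query string or points at the login page,
# then sort the remainder and remove duplicates.
grep -v -E '\?|/user/login' url_list.txt | sort -u > new_url_list.txt

cat new_url_list.txt
```

With -E the pattern is an extended regular expression, so the alternatives are separated by | and the literal question mark has to be escaped.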

--
My Drupal Tutorials


Thanks for the info

guysaban - Mon, 2009-06-08 12:14

Thanks for the info. Much appreciated.