Never List URLs With Session IDs In A Google Sitemap

Over the New Year’s break, Google’s JohnMu warned a webmaster in a Google Webmaster Help thread never to list URLs with session IDs in a Sitemap XML file.

John said:

If you are not submitting clean URLs in your Sitemap file, you’d be better off not using a Sitemap file. With session-IDs in there, it’ll cause more problems (with us crawling and indexing those URLs) than if you just let us crawl your website normally (especially if you really have a clean URL structure). So my advice would be to either delete the Sitemap file, or make sure that the submitted URLs are really exactly the same, clean ones that we find while crawling.

To most of us, this is obvious. But sometimes the obvious needs to be said.

Sending Google duplicate URLs for the same landing page is asking for trouble. Why hand Google duplicate content on a silver platter? That is what you are doing when you list these URLs in a Sitemap file. If you have duplicate content on your site and you don’t block it from Google, Google has a better shot at figuring it out by crawling your site than by seeing it in a Sitemap file.

No duplicate URLs or session-ID-based URLs should be listed in a Sitemap file.
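If your Sitemap is generated by a script, one way to follow this advice is to strip session IDs and drop the resulting duplicates before the file is written. The Python below is only a rough sketch of that idea, not anything Google provides: the parameter names in `SESSION_PARAMS`, the function names `clean_url` and `build_sitemap`, and the example.com URLs are all assumptions for illustration, so adjust them to whatever your platform actually appends.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names (hypothetical list; check your own platform).
SESSION_PARAMS = {"phpsessid", "sid", "sessionid", "jsessionid"}

# Matches a Java-style ";jsessionid=..." path parameter.
PATH_SESSION_RE = re.compile(r";jsessionid=[^/?#;]*", re.IGNORECASE)


def clean_url(url):
    """Return the URL without session IDs (and without any #fragment)."""
    parts = urlsplit(url)
    path = PATH_SESSION_RE.sub("", parts.path)
    # Keep every query parameter except the session-related ones.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query), ""))


def build_sitemap(urls):
    """Build Sitemap XML from cleaned URLs, listing each URL only once."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    seen = set()
    for url in urls:
        cleaned = clean_url(url)
        if cleaned in seen:
            continue  # same page reached via a different session ID
        seen.add(cleaned)
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = cleaned
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        "http://www.example.com/widgets?PHPSESSID=abc123",
        "http://www.example.com/widgets?PHPSESSID=def456",
    ]))
```

In this sketch, the two sample URLs differ only by session ID, so they collapse into a single `<loc>` entry for `http://www.example.com/widgets`. As John notes, the cleaned URLs should also match exactly what Google finds when crawling the site normally.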

Forum discussion at Google Webmaster Help.