Robots exclusion question.

They have: 16 posts

Joined: Nov 2006

Hi,

I have many hundreds of pages created by a database in the following format:

domain.com/availability.asp?id=26&theyear=2007&themonth=9
domain.com/availability.asp?id=26&theyear=2007&themonth=10
domain.com/availability.asp?id=26&theyear=2007&themonth=11

and so on.

I don't want these indexed as they are duplicates of each other.

What is the best way?

Should I add the robots noindex/nofollow meta tag to each page? Or, if I exclude /availability.asp in the robots.txt, will that also disallow all of the URLs with the ?id= part too?

They have: 25 posts

Joined: Mar 2007

Hi Hampstead,

You can actually single out certain parameters for the Googlebot. Adding these lines to your robots.txt file tells the Googlebot not to crawl any URLs containing the "theyear" or "themonth" parameters:

User-agent: Googlebot
Disallow: /*theyear=
Disallow: /*themonth=

If you were to exclude /availability.asp in your robots.txt file, it would also exclude the URLs with the "id" parameter, because a Disallow rule matches every URL that begins with the given path, query string included.
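If you want to sanity-check rules like these before deploying them, here is a rough Python sketch of the matching as I understand it from Google's documentation: a rule is a prefix match against the path plus query string, `*` stands for any run of characters, and a trailing `$` pins the match to the end of the URL. The helper name `rule_matches` is made up for this illustration, and this is not the crawler's actual code, so treat it as an approximation only.

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Return True if a Disallow pattern matches the path-plus-query string.

    Hypothetical helper: approximates Googlebot-style matching, where '*'
    matches any sequence of characters and a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "any characters".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so an unanchored rule is a prefix match.
    return re.match(regex, url_path) is not None


if __name__ == "__main__":
    url = "/availability.asp?id=26&theyear=2007&themonth=9"
    for rule in ("/*theyear=", "/*themonth=", "/availability.asp"):
        print(rule, "blocks" if rule_matches(rule, url) else "allows", url)
```

All three rules should report the sample URL as blocked, while a rule for /availability.asp would leave an unrelated page such as /contact.asp alone.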

More info on the Googlebot wildcard here: Google Robots.txt Wildcard

They have: 16 posts

Joined: Nov 2006

Thanks for this. I do not want any URLs with the id parameter indexed anyway so perhaps excluding /availability.asp would be the way forward?

They have: 25 posts

Joined: Mar 2007

You're welcome.

And yes, if you don't want the "id" pages indexed either, blocking the whole page is the way to go.
