robots file

greg's picture

He has: 1,581 posts

Joined: Nov 2005

If I ban all search engines with a robots.txt, obviously the rogue ones will ignore it. I presume the good ones (Google, Yahoo etc.) will still honour the robots.txt they found and ignore anything they find elsewhere?

Such as a link on a third-party website, or another search engine masquerading as a content website.

So the ones that obey robots.txt will never index anything from my site, regardless of what they find and where they find it?
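For reference, here is a rough sketch of the blanket ban being asked about, checked with Python's standard `urllib.robotparser`. The domain `example.com` and the agent name `SomeOtherBot` are placeholders, not anything from this thread, and the check only shows what a well-behaved crawler would conclude from the file; it says nothing about bots that ignore it.

```python
from urllib.robotparser import RobotFileParser

# A blanket ban: every user-agent is disallowed from every path.
BAN_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "SomeOtherBot" is a made-up agent name used only for illustration.
    for agent in ("Googlebot", "SomeOtherBot"):
        allowed = is_allowed(BAN_ALL, agent, "https://example.com/any/page.html")
        print(agent, "allowed" if allowed else "blocked")
```

Note that this only covers fetching pages. It does not tell you whether a search engine might still list a bare URL it discovered through a link elsewhere.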

They have: 25 posts

Joined: Sep 2009

Thanks for the information, keep it up with robots.txt.

Michael James Swan's picture

He has: 400 posts

Joined: May 2008

I remember a case a little while ago in which someone did this, but their pages still ended up in Google's index.

This happened because someone else had linked to a page that robots.txt did not allow to be indexed. So the page was not crawled and indexed, but merely noticed and listed by the search engine.

sequencehosting's picture

They have: 24 posts

Joined: Feb 2010

ms2134 wrote:
I remember a case a little while ago in which someone did this, but their pages still ended up in Google's index.

This happened because someone else had linked to a page that robots.txt did not allow to be indexed. So the page was not crawled and indexed, but merely noticed and listed by the search engine.

This is true and still happens today. I believe the best way to remove a site is to use Webmaster Tools.

They have: 5 posts

Joined: Jan 2010

I have had a problem where the robots still came to my site. But then again, it never really hurt me at all.

They have: 12 posts

Joined: Feb 2010

Usually, if your site content is crawled by scrapers and reposted, it will show up on Google, including any links pointing to your site. But Google will usually obey robots.txt and not show your website directly in results.

Robots which do not obey the protocol can still crawl your website. Content scrapers can still work.

They have: 36 posts

Joined: Apr 2010

Will this robots.txt file work with all search engines?

They have: 22 posts

Joined: Oct 2010

A robots.txt file will help you hide content that you don't want to show to search engines. The file works with every search engine that obeys it.

They have: 20 posts

Joined: Nov 2010

You can ban all search engines using robots.txt. But some search engines may not check your file and will index your pages anyway.

They have: 5 posts

Joined: Feb 2010

The robots.txt file is a set of instructions for visiting robots (spiders) that index the content of your web site pages. For those spiders that obey the file, it provides a map of what they can and cannot index. The file must reside in the root directory of your web site.
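To illustrate the root-directory point, a small Python sketch of how a crawler might work out where to look for the file, given any page URL. The function name `robots_url` and the example URL are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Build the robots.txt location for the site that hosts `page_url`.

    Keeps the scheme and host, and replaces the path, query and fragment
    with /robots.txt at the root of the site.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("https://example.com/blog/2010/post.html?page=2"))
```

A copy of the file sitting in a subdirectory such as /blog/robots.txt would not be found by this lookup, which is why the file belongs in the root.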

They have: 20 posts

Joined: Nov 2010

Maybe you can try banning by user-agent?
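A minimal sketch of what a per-user-agent ban could look like, again checked with Python's `urllib.robotparser`. `BadBot` is a hypothetical agent name and `example.com` a placeholder. As others have said above, this only affects robots that choose to obey the file.

```python
from urllib.robotparser import RobotFileParser

# Block one named robot completely; an empty Disallow lets everyone else in.
BAN_ONE_AGENT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""


def agent_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler named `user_agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/index.html"
    for agent in ("BadBot", "Googlebot"):
        print(agent, agent_may_fetch(BAN_ONE_AGENT, agent, url))
```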

They have: 111 posts

Joined: Aug 2010

The purpose of the robots.txt file is to keep search engines away from some of your website's pages. It is typically used when some pages are still under construction and you don't want search engines to index them; you can list them there.

The robots.txt file should be placed in the site's main directory, the same place as the index.html page.
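As a rough example of blocking only the unfinished part of a site, assuming the pages live under a hypothetical /under-construction/ directory on a placeholder domain:

```python
from urllib.robotparser import RobotFileParser

# Only the (hypothetical) under-construction directory is off limits.
PARTIAL_BAN = """\
User-agent: *
Disallow: /under-construction/
"""


def path_is_open(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if a compliant crawler may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/index.html",
        "https://example.com/under-construction/new-page.html",
    ):
        print(url, path_is_open(PARTIAL_BAN, url))
```

Everything outside the listed directory stays crawlable, so the finished pages are unaffected.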