How to stop Google from crawling nonexistent pages

While developing my website, I made a typo in one place. All my pages should be at dir1/dir2/page.htm/par1-par2, but the typo produced links to dir1/dir2/page/par1-par2 (note: no .htm).

The typo was in production for only one day, but Google keeps crawling those links. How can I stop Google from doing this?

By the way, this is not a single page but hundreds or thousands of pages.

Try using robots.txt to deny crawler access to these pages (URLs):

http://www.robotstxt.org/robotstxt.html

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449

Test your robots.txt here: http://www.frobee.com/robots-txt-check/

Patterns must begin with / because robots.txt patterns always match absolute URLs.
* matches zero or more of any character.
$ at the end of a pattern matches the end of the URL; elsewhere $ matches itself.
* at the end of a pattern is redundant, because robots.txt patterns always match any URL which begins with the pattern.
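
To make the four rules above concrete, here is a minimal Python sketch that translates a robots.txt pattern into a regular expression and tests URL paths against it. The function names are hypothetical, and this is only an illustration of the rules as stated, not Google's actual matcher.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt pattern into a compiled regex (illustrative only)."""
    if not pattern.startswith("/"):
        raise ValueError("robots.txt patterns must begin with '/'")
    # A trailing $ anchors the pattern to the end of the URL path.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape everything literally (including any inner $), then let each *
    # stand for "zero or more of any character".
    regex = ".*".join(re.escape(chunk) for chunk in body.split("*"))
    if anchored:
        regex += r"\Z"
    return re.compile(regex)


def robots_pattern_matches(pattern, path):
    """Return True if the path begins with something the pattern matches."""
    # re.match anchors only at the start, which gives prefix matching.
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    # Assumed rule for the question: a prefix covering the mistyped URLs.
    print(robots_pattern_matches("/dir1/dir2/page/", "/dir1/dir2/page/par1-par2"))
    print(robots_pattern_matches("/dir1/dir2/page/", "/dir1/dir2/page.htm/par1-par2"))
```

Under these rules a prefix such as /dir1/dir2/page/ would cover every mistyped URL without touching the real /dir1/dir2/page.htm/ pages, because the real paths do not begin with that prefix.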

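If the URL still resolves (perhaps because you use mod_rewrite) and renders a custom not-found page without sending an HTTP 410 Gone header, e.g. header("HTTP/1.0 410 Gone");, then Google will not know the page has been removed and will keep indexing it.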

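You need to either send the correct header, delete the page, or stop rendering your own 404 page so that the request falls through to the server's 404. Google will then remove the page from its index, although the removal will not happen overnight.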
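
As an illustration of sending the correct status, here is a minimal Python WSGI sketch that answers 410 Gone for the mistyped paths. The path prefix is taken from the question; the names TYPO_PATH, status_for and app are hypothetical, and a real site would apply the same idea in its own framework or server configuration.

```python
import re

# Assumption: every mistyped URL starts with this prefix (no ".htm").
TYPO_PATH = re.compile(r"^/dir1/dir2/page/")


def status_for(path):
    """Pick the HTTP status line for a request path."""
    if TYPO_PATH.match(path):
        return "410 Gone"  # tells crawlers the URL was removed on purpose
    return "200 OK"        # placeholder for the site's normal handling


def app(environ, start_response):
    """Tiny WSGI application that reports the chosen status as plain text."""
    status = status_for(environ.get("PATH_INFO", "/"))
    body = status.encode("ascii")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, the app could be served with the standard library's wsgiref.simple_server.make_server and the mistyped URLs requested with any HTTP client to confirm the status code.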

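You can also add the URL to your robots.txt file, as shown below, though this does not guarantee that the page will be removed from the index either. You can contact Google, as others have suggested, but there is no guarantee of a reply or of removal.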

User-agent: *
Disallow: /dir1/dir2/page/par1-par2
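
A rule like this can also be checked locally before deploying it. The following is a small sketch using Python's standard urllib.robotparser; the helper name is_blocked and the host example.com are assumptions made for illustration, and the standard-library parser only does simple prefix matching, so it may not reproduce Google's wildcard handling.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /dir1/dir2/page/par1-par2
"""


def is_blocked(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text disallows the URL for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host, not the asker's site.
    print(is_blocked(ROBOTS_TXT, "http://example.com/dir1/dir2/page/par1-par2"))
    print(is_blocked(ROBOTS_TXT, "http://example.com/dir1/dir2/page.htm/par1-par2"))
```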

Good luck.

Google has a form you can use to request that pages be removed from its index.

See this link for more information:

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=164734