<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Managing Robot&#8217;s Access To Your Website</title>
	<atom:link href="http://janeandrobot.com/library/managing-robots-access-to-your-website/feed" rel="self" type="application/rss+xml" />
	<link>http://janeandrobot.com/library/managing-robots-access-to-your-website</link>
	<description>Search friendly design patterns for web development</description>
	<lastBuildDate>Fri, 19 Feb 2010 08:49:52 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Daniel Noll</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-425</link>
		<dc:creator>Daniel Noll</dc:creator>
		<pubDate>Wed, 02 Dec 2009 03:39:40 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-425</guid>
		<description>@Nathan:  Thanks a million.  I came to the same solution a few minutes after I posted my comment, but I just wasn&#039;t confident about it.  Only after your reply did I implement it.</description>
		<content:encoded><![CDATA[<p>@Nathan:  Thanks a million.  I came to the same solution a few minutes after I posted my comment, but I just wasn&#8217;t confident about it.  Only after your reply did I implement it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nathan Buggia</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-424</link>
		<dc:creator>Nathan Buggia</dc:creator>
		<pubDate>Tue, 01 Dec 2009 06:34:36 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-424</guid>
		<description>@Daniel Noll, hey daniel, you&#039;re right, you would need to incorporate both the * and the $ like this:

Disallow: /*/aaa/bbb/$</description>
		<content:encoded><![CDATA[<p>@Daniel Noll, hey daniel, you&#8217;re right, you would need to incorporate both the * and the $ like this:</p>
<p>Disallow: /*/aaa/bbb/$</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Noll</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-422</link>
		<dc:creator>Daniel Noll</dc:creator>
		<pubDate>Sun, 29 Nov 2009 01:22:05 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-422</guid>
		<description>Terrifically thorough. But I have one question that I hope you might be able to clarify.

Say for example I wish to disallow any URL in my site that ends with:   &quot;/aaa/bbb/&quot;
Would this entry be sufficient:
Disallow: /aaa/bbb/

Or must I somehow use * and $ in the entry?</description>
		<content:encoded><![CDATA[<p>Terrifically thorough. But I have one question that I hope you might be able to clarify.</p>
<p>Say for example I wish to disallow any URL in my site that ends with:   &#8220;/aaa/bbb/&#8221;<br />
Would this entry be sufficient:<br />
Disallow: /aaa/bbb/</p>
<p>Or must I somehow use * and $ in the entry?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Godwin</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-421</link>
		<dc:creator>Godwin</dc:creator>
		<pubDate>Tue, 17 Nov 2009 14:47:21 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-421</guid>
		<description>This is a very thorough report. I have bookmarked it and will refer to it. Good luck with your endeavors! :)</description>
		<content:encoded><![CDATA[<p>This is a very thorough report. I have bookmarked it and will refer to it. Good luck with your endeavors! :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andy Beard</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-407</link>
		<dc:creator>Andy Beard</dc:creator>
		<pubDate>Sun, 01 Nov 2009 11:52:12 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-407</guid>
		<description>&lt;blockquote&gt;GoogleBot follows the most specific directive, ignoring all others. In the robots.txt file, if you specify a section for all user-agents (user-agent: *) and also declare a section for Googlebot (user-agent: Googlebot), Google will disregard all sections in the robots.txt file except the Googlebot section. This could potentially leave you exposing much more content to Google than you might have thought.&lt;/blockquote&gt;

I have been seeing some evidence that Google can take the wildcard user-agent into consideration even when there is a specific declaration present.
Is this something new/official? It would make sense from a human logic perspective, but is a change to published material.</description>
		<content:encoded><![CDATA[<blockquote><p>GoogleBot follows the most specific directive, ignoring all others. In the robots.txt file, if you specify a section for all user-agents (user-agent: *) and also declare a section for Googlebot (user-agent: Googlebot), Google will disregard all sections in the robots.txt file except the Googlebot section. This could potentially leave you exposing much more content to Google than you might have thought.</p></blockquote>
<p>I have been seeing some evidence that Google can take the wildcard user-agent into consideration even when there is a specific declaration present.<br />
Is this something new/official? It would make sense from a human logic perspective, but is a change to published material.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gary Trotter</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-403</link>
		<dc:creator>Gary Trotter</dc:creator>
		<pubDate>Thu, 22 Oct 2009 10:47:44 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-403</guid>
		<description>Hi

I have all sorts of problems with Google accessing URLs that return 404.  Unfortunately the URLs that Google is accessing were erroneously generated by my website and I do not know exactly what they are.  As a result of this I am looking to set up my robots.txt file to disallow access to everything and then allow access to named URLs.  However I cannot figure out the syntax to allow access to my website&#039;s homepage (www.wedding-favours-online.co.uk) having previously used the disallow command to disallow all access.  Does this make any sense?

Thanks

gary</description>
		<content:encoded><![CDATA[<p>Hi</p>
<p>I have all sorts of problems with Google accessing URLs that return 404.  Unfortunately the URLs that Google is accessing were erroneously generated by my website and I do not know exactly what they are.  As a result of this I am looking to set up my robots.txt file to disallow access to everything and then allow access to named URLs.  However I cannot figure out the syntax to allow access to my website&#8217;s homepage (www.wedding-favours-online.co.uk) having previously used the disallow command to disallow all access.  Does this make any sense?</p>
<p>Thanks</p>
<p>gary</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nathan</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-398</link>
		<dc:creator>nathan</dc:creator>
		<pubDate>Sun, 04 Oct 2009 16:14:23 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-398</guid>
		<description>@Andy H - yes, you absolutely can. The search engine would download the page, process the links and then discard the page so that it would not show up in search results. 

However, this may not be the best approach for handling site search results, because there are still an infinite (or at least very large) number of pages that could be created through your site search. A better approach would be to make sure that your sitemap is added to your robots.txt file (using the Sitemap: auto-discovery directive). You should also ensure that your sitemap is high quality, e.g. all URLs should be in their canonical form, and it should not list links to pages that return 404 or 500 server errors.</description>
		<content:encoded><![CDATA[<p>@Andy H &#8211; yes, you absolutely can. The search engine would download the page, process the links and then discard the page so that it would not show up in search results. </p>
<p>However, this may not be the best approach for handling site search results, because there are still an infinite (or at least very large) number of pages that could be created through your site search. A better approach would be to make sure that your sitemap is added to your robots.txt file (using the Sitemap: auto-discovery directive). You should also ensure that your sitemap is high quality, e.g. all URLs should be in their canonical form, and it should not list links to pages that return 404 or 500 server errors.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andy H</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-397</link>
		<dc:creator>Andy H</dc:creator>
		<pubDate>Wed, 30 Sep 2009 08:15:23 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-397</guid>
		<description>Hmmm, not sure if this comment will get found, but I have a question:

Could you put a META Robots Tag on the results pages that said &quot;NOINDEX, FOLLOW&quot;?

Say you have a large site with a bunch of &quot;Search Result&quot; style pages leading to quality, unique asset pages.  According to Webmaster guidelines, you wouldn&#039;t want the results pages indexed, but you would want the asset pages indexed. Would this tag work?</description>
		<content:encoded><![CDATA[<p>Hmmm, not sure if this comment will get found, but I have a question:</p>
<p>Could you put a META Robots Tag on the results pages that said &#8220;NOINDEX, FOLLOW&#8221;?</p>
<p>Say you have a large site with a bunch of &#8220;Search Result&#8221; style pages leading to quality, unique asset pages.  According to Webmaster guidelines, you wouldn&#8217;t want the results pages indexed, but you would want the asset pages indexed. Would this tag work?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: The Ultimate SEO Audit</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-385</link>
		<dc:creator>The Ultimate SEO Audit</dc:creator>
		<pubDate>Mon, 24 Aug 2009 19:38:19 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-385</guid>
		<description>[...] 1. Make sure Meta robots.txt isn&#8217;t blocking the whole site. JaneAndRobot offers a great guide for managing Robot&#8217;s access to your website. [...]</description>
		<content:encoded><![CDATA[<p>[...] 1. Make sure Meta robots.txt isn&#8217;t blocking the whole site. JaneAndRobot offers a great guide for managing Robot&#8217;s access to your website. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adrian Hemsley</title>
		<link>http://janeandrobot.com/library/managing-robots-access-to-your-website/comment-page-2#comment-352</link>
		<dc:creator>Adrian Hemsley</dc:creator>
		<pubDate>Thu, 25 Jun 2009 10:27:32 +0000</pubDate>
		<guid isPermaLink="false">/post/Managing-Robots-Access-To-Your-Website.aspx#comment-352</guid>
		<description>That&#039;s about as thorough an explanation as I&#039;ve seen! :-)</description>
		<content:encoded><![CDATA[<p>That&#8217;s about as thorough an explanation as I&#8217;ve seen! :-)</p>
]]></content:encoded>
	</item>
</channel>
</rss>

