<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>&#60;title&#62; &#187; spider ips</title>
	<atom:link href="http://www.adammoro.com/blog/tag/spider-ips/feed" rel="self" type="application/rss+xml" />
	<link>http://www.adammoro.com/blog</link>
	<description>Internet Marketing, Web Development and Programming Stuff</description>
	<lastBuildDate>Fri, 19 Nov 2010 19:58:33 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Another Reason Web Analytics Software Should Incorporate An Industrial-strength Bot List</title>
		<link>http://www.adammoro.com/blog/web-analytics-bot-list.html</link>
		<comments>http://www.adammoro.com/blog/web-analytics-bot-list.html#comments</comments>
		<pubDate>Sun, 14 Feb 2010 15:15:52 +0000</pubDate>
		<dc:creator>Adam Moro</dc:creator>
				<category><![CDATA[Web Analytics]]></category>
		<category><![CDATA[googlebot]]></category>
		<category><![CDATA[spider ips]]></category>

		<guid isPermaLink="false">http://www.adammoro.com/blog/?p=40</guid>
		<description><![CDATA[I wrote a real simple script to mess with some of my friends which basically makes it seem like Googlebot visited their site, well, as many times as I wanted them to think. Don't worry, I'll provide the script in a minute. But it reminded me of a tweet from @fantomaster in which he shared ...]]></description>
			<content:encoded><![CDATA[<p>I wrote a real simple script to mess with some of my friends which basically makes it seem like <a title="Google's Spider also known as Googlebot" href="http://www.google.com/bot.html">Googlebot</a> visited their site, well, as many times as I wanted them to think. Don't worry, I'll provide the script in a minute. But it reminded me of a tweet from <a title="fantomaster's Twitter page" href="http://www.twitter.com/fantomaster">@fantomaster</a> in which he shared how important it is to incorporate a bot list (i.e. a <a title="Spider IP Address Database" href="http://searchbotbase.com/">spider IP database</a>) into a web analytics program. I can't seem to locate the tweet now (if you can, please drop it in a comment). So here's another example that further makes the case for how important it actually is...</p>
<p><a href="http://www.adammoro.com/blog/wp-content/uploads/2010/02/awstats-spider-robots-report.png"><img class="alignnone size-full wp-image-44" title="Awstats Spider/Robot Report" src="http://www.adammoro.com/blog/wp-content/uploads/2010/02/awstats-spider-robots-report.png" alt="Awstats Spider/Robot Report" width="600" /></a></p>
<p>This is a screenshot of the "Robots/Spiders visitors" report from Awstats (the default web analytics program included with cPanel). As you can see, Awstats reports that Googlebot visited this particular domain 1,126 times since February 14th 2010, which is today. Those hits came from my script, not from Google. If Awstats cross-checked the IP address from which my script ran against a bot list like the one provided at <a title="Bot list" href="http://searchbotbase.com/">searchbotBase</a>, it would not have logged the hits as Googlebot visits. As an SEO, you can of course see the problem in relying on such statistics. For example, this could make it real difficult for an SEO who cannot install <a title="Google Webmaster Tools" href="http://www.google.com/webmasters/tools/">Google Webmaster Tools</a> and needs to diagnose crawl rates. A victim of a more calculated attack by this script might think everything is hunky-dory when, in reality, their site could be experiencing real problems that get overlooked.</p>
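<p>To make the cross-check idea concrete, here's a minimal sketch of how a hit could be classified before it gets counted. This is not how Awstats works internally. The function names (<code>ipInCidr</code>, <code>classifyHit</code>) are made up for illustration, and the ranges in <code>$botRanges</code> are placeholder documentation addresses, not real Googlebot IPs; a real list would come from a maintained spider IP database. It also assumes IPv4 and a 64-bit PHP build.</p>
<pre>&lt;?php
// Hypothetical bot list: placeholder CIDR ranges (documentation addresses),
// NOT real Googlebot ranges.
$botRanges = ['192.0.2.0/24', '198.51.100.0/25'];

// Return true if an IPv4 address falls inside a CIDR range.
function ipInCidr(string $ip, string $cidr): bool
{
	[$subnet, $bits] = explode('/', $cidr);
	$ipLong = ip2long($ip);
	$subnetLong = ip2long($subnet);
	if ($ipLong === false || $subnetLong === false) {
		return false;
	}
	$bits = (int) $bits;
	// Build a 32-bit netmask with the top $bits bits set.
	$mask = $bits === 0 ? 0 : (-1 &lt;&lt; (32 - $bits)) &amp; 0xFFFFFFFF;
	return ($ipLong &amp; $mask) === ($subnetLong &amp; $mask);
}

// Trust a "Googlebot" user agent only if the client IP is on the bot list;
// otherwise flag the hit as spoofed instead of counting it as a crawl.
function classifyHit(string $ip, string $userAgent, array $botRanges): string
{
	if (stripos($userAgent, 'Googlebot') === false) {
		return 'visitor';
	}
	foreach ($botRanges as $cidr) {
		if (ipInCidr($ip, $cidr)) {
			return 'bot';
		}
	}
	return 'spoofed';
}

$ua = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
echo classifyHit('192.0.2.10', $ua, $botRanges), "\n";  // bot
echo classifyHit('203.0.113.7', $ua, $botRanges), "\n"; // spoofed
</pre>
<p>The second call is the interesting one: the user agent claims to be Googlebot, but the IP isn't on the list, so the hit would be flagged rather than added to the crawl count.</p>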
<h2>The Googlebot Traffic Generation Script</h2>
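<pre>#!/usr/bin/php
&lt;?php
// Arguments: the referer to send, then the URL to request.

$referer = $argv[1];
$url = $argv[2];

$useragent = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";

// Pick the number of hits once; calling rand() in the loop condition
// would re-roll the limit on every iteration.
$hits = rand(20, 80);

for ($i = 0; $i &lt; $hits; $i++) {
	// Quote each argument so the spaces and parentheses in the user agent
	// reach curl intact instead of being parsed by the shell.
	system('curl -v -A '.escapeshellarg($useragent).' -e '.escapeshellarg($referer).' -L '.escapeshellarg($url));
}
</pre>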
]]></content:encoded>
			<wfw:commentRss>http://www.adammoro.com/blog/web-analytics-bot-list.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

