<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Chunk Data for Easier Scraping</title>
	<atom:link href="http://www.adammoro.com/blog/chunk-data-for-easier-scraping.html/feed" rel="self" type="application/rss+xml" />
	<link>http://www.adammoro.com/blog/chunk-data-for-easier-scraping.html</link>
	<description>Internet Marketing, Web Development and Programming Stuff</description>
	<lastBuildDate>Wed, 15 Sep 2010 02:35:24 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: arraial dajuda</title>
		<link>http://www.adammoro.com/blog/chunk-data-for-easier-scraping.html/comment-page-1#comment-876</link>
		<dc:creator>arraial dajuda</dc:creator>
		<pubDate>Wed, 15 Sep 2010 02:35:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.adammoro.com/blog/?p=56#comment-876</guid>
		<description>Some interesting code here! I have to learn Python as fast as possible!</description>
		<content:encoded><![CDATA[<p>Some interesting code here! I have to learn Python as fast as possible!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adam Moro</title>
		<link>http://www.adammoro.com/blog/chunk-data-for-easier-scraping.html/comment-page-1#comment-296</link>
		<dc:creator>Adam Moro</dc:creator>
		<pubDate>Fri, 14 May 2010 22:58:06 +0000</pubDate>
		<guid isPermaLink="false">http://www.adammoro.com/blog/?p=56#comment-296</guid>
		<description>I&#039;ll definitely be looking into the libraries you suggested for the next ones. Thanks again for pointing them out. 

Perhaps one day I&#039;ll make a full switch to Python for these types of jobs. Recently, however, my work has been almost entirely marketing-related, so that day will likely come later rather than sooner. But hey, at least that&#039;s a choice I&#039;ve made and not an order fulfilled to satisfy, &quot;the first three.&quot; ;)</description>
		<content:encoded><![CDATA[<p>I&#8217;ll definitely be looking into the libraries you suggested for the next ones. Thanks again for pointing them out. </p>
<p>Perhaps one day I&#8217;ll make a full switch to Python for these types of jobs. Recently, however, my work has been almost entirely marketing-related, so that day will likely come later rather than sooner. But hey, at least that&#8217;s a choice I&#8217;ve made and not an order fulfilled to satisfy, &#8220;the first three.&#8221; <img src='http://www.adammoro.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kip</title>
		<link>http://www.adammoro.com/blog/chunk-data-for-easier-scraping.html/comment-page-1#comment-281</link>
		<dc:creator>kip</dc:creator>
		<pubDate>Thu, 13 May 2010 08:53:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.adammoro.com/blog/?p=56#comment-281</guid>
		<description>Python is my favourite. I switched from PHP to Python as well. It was indeed a little bit awkward to learn at first--syntactical whitespace and that explicit &#039;self&#039; parameter in class methods everywhere--but after 2 or 3 simple, quick scripts, I got the hang of it. I wouldn&#039;t go back to PHP; I think Python is much &quot;cleaner&quot; and more readable, and to be fair, the string processing functions are much more straightforward.

Plus, you can use PyQuery for doing jQuery-style manipulation and extraction of HTML documents.

Another language, which was basically built for this kind of work, is of course Perl. It&#039;s a pretty good language, and has the added bonus of making you feel like an oldskool elite hacker when coding :-P But Perl is an even older language than PHP, and that shows. The field of scripting languages has come a long way since then, and we know better how to make scripting languages come as close as possible to &quot;simply tell the computer what to do&quot;, which is why I prefer Python.</description>
		<content:encoded><![CDATA[<p>Python is my favourite. I switched from PHP to Python as well. It was indeed a little bit awkward to learn at first&#8211;syntactical whitespace and that explicit &#8216;self&#8217; parameter in class methods everywhere&#8211;but after 2 or 3 simple, quick scripts, I got the hang of it. I wouldn&#8217;t go back to PHP; I think Python is much &#8220;cleaner&#8221; and more readable, and to be fair, the string processing functions are much more straightforward.</p>
<p>Plus, you can use PyQuery for doing jQuery-style manipulation and extraction of HTML documents.</p>
<p>Another language, which was basically built for this kind of work, is of course Perl. It&#8217;s a pretty good language, and has the added bonus of making you feel like an oldskool elite hacker when coding <img src='http://www.adammoro.com/blog/wp-includes/images/smilies/icon_razz.gif' alt=':-P' class='wp-smiley' />  But Perl is an even older language than PHP, and that shows. The field of scripting languages has come a long way since then, and we know better how to make scripting languages come as close as possible to &#8220;simply tell the computer what to do&#8221;, which is why I prefer Python.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adam Moro</title>
		<link>http://www.adammoro.com/blog/chunk-data-for-easier-scraping.html/comment-page-1#comment-262</link>
		<dc:creator>Adam Moro</dc:creator>
		<pubDate>Mon, 10 May 2010 16:52:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.adammoro.com/blog/?p=56#comment-262</guid>
		<description>Hi Kip, thanks for the tip. The only other language I&#039;ve used for scraping jobs is Python, which was much faster performance-wise; it just took me a lot longer to write the scripts. Which language(s) would you suggest?</description>
		<content:encoded><![CDATA[<p>Hi Kip, thanks for the tip. The only other language I&#8217;ve used for scraping jobs is Python, which was much faster performance-wise; it just took me a lot longer to write the scripts. Which language(s) would you suggest?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kip</title>
		<link>http://www.adammoro.com/blog/chunk-data-for-easier-scraping.html/comment-page-1#comment-253</link>
		<dc:creator>kip</dc:creator>
		<pubDate>Sun, 09 May 2010 09:46:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.adammoro.com/blog/?p=56#comment-253</guid>
		<description>um yeah, except if you&#039;re going to parse HTML, better use a library like PHPQuery (jQuery for PHP)

or you could replace /\W+/ with a single space instead of the empty string, to leave delimiters intact.

and ... PHP? really?</description>
		<content:encoded><![CDATA[<p>um yeah, except if you&#8217;re going to parse HTML, better use a library like PHPQuery (jQuery for PHP)</p>
<p>or you could replace /\W+/ with a single space instead of the empty string, to leave delimiters intact.</p>
<p>and &#8230; PHP? really?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

