<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.0.2" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Evaluating XPaths with indices in HPricot</title>
	<link>http://www.rubyrailways.com/evaluating-xpaths-with-indicdes-in-hpricot/</link>
	<description>Experiences with Ruby and Rails, Web2.0 and other development technologies</description>
	<pubDate>Thu, 28 Aug 2008 19:24:49 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.2</generator>

	<item>
		<title>by: rhubarb</title>
		<link>http://www.rubyrailways.com/evaluating-xpaths-with-indicdes-in-hpricot/#comment-1052</link>
		<pubDate>Fri, 03 Nov 2006 19:37:45 +0000</pubDate>
		<guid>http://www.rubyrailways.com/evaluating-xpaths-with-indicdes-in-hpricot/#comment-1052</guid>
					<description>&lt;p&gt;Funny, I was looking at Hpricot, and then at alternatives (I want to scrape my online bank statements), when I found your original, nicely detailed page on scraping, written in June. I was researching all of the suggestions you made until I read to the last comment and saw that you'd settled on Hpricot anyway. ;)&lt;/p&gt;

&lt;p&gt;What I'd like to know is: do you still maintain that Mechanize is the best way to do the navigation, with Hpricot maybe handling the scraping once you have the page?&lt;/p&gt;

&lt;p&gt;Or does Hpricot give you some way to navigate?&lt;/p&gt;

&lt;p&gt;If the former, can you show us some code that combines Hpricot and Mechanize?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Funny, I was looking at Hpricot, and then at alternatives (I want to scrape my online bank statements), when I found your original, nicely detailed page on scraping, written in June. I was researching all of the suggestions you made until I read to the last comment and saw that you&#8217;d settled on Hpricot anyway. <img src='http://www.rubyrailways.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>What I&#8217;d like to know is: do you still maintain that Mechanize is the best way to do the navigation, with Hpricot maybe handling the scraping once you have the page?</p>
<p>Or does Hpricot give you some way to navigate?</p>
<p>If the former, can you show us some code that combines Hpricot and Mechanize?</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: juzzin</title>
		<link>http://www.rubyrailways.com/evaluating-xpaths-with-indicdes-in-hpricot/#comment-974</link>
		<pubDate>Sat, 21 Oct 2006 06:03:15 +0000</pubDate>
		<guid>http://www.rubyrailways.com/evaluating-xpaths-with-indicdes-in-hpricot/#comment-974</guid>
					<description>&lt;p&gt;Sounds very similar to Ariel: http://ariel.rubyforge.org/&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Sounds very similar to Ariel: http://ariel.rubyforge.org/</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Peter Szinek</title>
		<link>http://www.rubyrailways.com/evaluating-xpaths-with-indicdes-in-hpricot/#comment-936</link>
		<pubDate>Sun, 08 Oct 2006 07:46:24 +0000</pubDate>
		<guid>http://www.rubyrailways.com/evaluating-xpaths-with-indicdes-in-hpricot/#comment-936</guid>
					<description>&lt;p&gt;Yeah, I know about scrAPI, but I am going to take a different (X)Path in both phases. In the wrapper generation phase, instead of providing concrete XPaths (= CSS selectors in scrAPI), you will be able to define the data you are looking for with examples, so in the ideal case you won't need to use XPath at all. In the evaluation phase, there will be lots of heuristics, and XPath instead of CSS. The primary aim is to create a wrapper generator which needs minimal input and technical knowledge (of course it will be possible to use XPaths or even Ruby if you wish), yet performs robust extractions, can be used to quickly scrape sites like Amazon, eBay etc., and lets you integrate the data into something usable in real life.&lt;/p&gt;

&lt;p&gt;The goal is to create an easy-to-use wrapper generator which &lt;b&gt;works in practice&lt;/b&gt; and makes it possible to create mashups or use the extracted data further in any RoR or Ruby (or any other) app...&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Yeah, I know about scrAPI, but I am going to take a different (X)Path in both phases. In the wrapper generation phase, instead of providing concrete XPaths (= CSS selectors in scrAPI), you will be able to define the data you are looking for with examples, so in the ideal case you won&#8217;t need to use XPath at all. In the evaluation phase, there will be lots of heuristics, and XPath instead of CSS. The primary aim is to create a wrapper generator which needs minimal input and technical knowledge (of course it will be possible to use XPaths or even Ruby if you wish), yet performs robust extractions, can be used to quickly scrape sites like Amazon, eBay etc., and lets you integrate the data into something usable in real life.</p>
<p>The goal is to create an easy-to-use wrapper generator which <b>works in practice</b> and makes it possible to create mashups or use the extracted data further in any RoR or Ruby (or any other) app&#8230;</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: ruby licious</title>
		<link>http://www.rubyrailways.com/evaluating-xpaths-with-indicdes-in-hpricot/#comment-935</link>
		<pubDate>Sat, 07 Oct 2006 18:18:40 +0000</pubDate>
		<guid>http://www.rubyrailways.com/evaluating-xpaths-with-indicdes-in-hpricot/#comment-935</guid>
					<description>&lt;p&gt;Web extraction projects are always interesting. What path are you taking, if I may ask? =)&lt;/p&gt;

&lt;p&gt;I guess you've seen http://blog.labnotes.org/2006/07/11/scraping-with-style-scrapi-toolkit-for-ruby/ ? It seems pretty decent.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Web extraction projects are always interesting. What path are you taking, if I may ask? =)</p>
<p>I guess you&#8217;ve seen http://blog.labnotes.org/2006/07/11/scraping-with-style-scrapi-toolkit-for-ruby/ ? It seems pretty decent.</p>
]]></content:encoded>
				</item>
</channel>
</rss>

