update page with results as they are scraped

Hi,

I'm creating a screen-scraper app that takes a user's search term, scrapes several third-party sites and returns the aggregated results after a few seconds.

Rather than wait for all the sites to be scraped before showing the results, I want to show the results as they are scraped, i.e. when one site's results have been scraped and prepared, they are shown immediately whilst the next set is being done.

Initially I thought I'd just make a scraper function like so:

    def scrape_sites
      sites.each do |a_site|
        a_sites_results = scrape_and_parse_this_site(a_site)
        yield a_sites_results
      end
    end

Then in my controller's JS response I could take the results and render them, but then I found out that you can't make multiple render calls in one go.

How do I go about doing this? Does this behaviour have a name, like "live search"?

I've found that this pattern is called multi-stage download. It's used by kayak.com, which is famous for its user interface. That site makes finding cheap flights actually enjoyable.

Here are some general background links I found.

This discusses the pattern in general:

http://ajaxpatterns.org/Multi-Stage_Download#Code_Example

And this is a review of the user interface for Kayak:

http://konigi.com/podcast/kayak-com

But I still don't know where to start looking for a Rails implementation how-to.

Where should I start looking? (I've googled for ages and can't find anything appropriate.)

Given that this is all client side, what would a Rails implementation be? (Maybe the odd helper function, but the lack of them certainly shouldn't stop you getting started.)

As far as your particular case goes, all you need is something on the page that every so often pings your controller to say 'do you have any more data for me?'. It would be wise to push the actual scraping into a separate process.
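
A minimal sketch of that split, in plain Ruby rather than Rails. Everything here is an assumption for illustration: ResultStore, start_scraping and the stubbed scrape_and_parse_this_site are hypothetical, and a thread stands in for the separate process. In a real multi-process app the store would more likely be a database table than an in-memory object.

```ruby
require "thread"

# Hypothetical store shared by the scraper and the poll handler.
class ResultStore
  def initialize
    @lock = Mutex.new
    @results = []
    @finished = false
  end

  # Called by the scraper each time one site has been scraped and parsed.
  def add(site, items)
    @lock.synchronize { @results << { site: site, items: items } }
  end

  # Called by the scraper once every site has been handled.
  def finish!
    @lock.synchronize { @finished = true }
  end

  # Called on each poll: returns whatever arrived at or after position `from`,
  # the position to ask for next time, and whether scraping has finished.
  def since(from)
    @lock.synchronize do
      { results: @results[from..-1] || [], next: @results.size, done: @finished }
    end
  end
end

# Stand-in for the real scraper from the original question.
def scrape_and_parse_this_site(site)
  ["#{site} result"]
end

# Runs the scraping off the request path (a thread here, as an assumption).
def start_scraping(sites, store)
  Thread.new do
    sites.each { |site| store.add(site, scrape_and_parse_this_site(site)) }
    store.finish!
  end
end
```

The page would then ask "anything since position N?" on each ping and append whatever comes back, stopping once done is true.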

Fred

Thanks Frederick,

Can you (or anybody reading this) provide me with some helper names or subject names that I should be investigating re: pinging the controller?

Is this a common Rails thing?

Thanks so far for your help.

To make things clear, pinging just means (in this context) making a
request every so often. If you're using Prototype, PeriodicalExecuter
is a good way to go (periodically_call_remote is a helper for that).

Fred

I do something similar in one of my apps. The way I handle it is to use periodically_call_remote to hit a controller action every 6 seconds; the action looks at the DB to see if any new records were added for that user since the last time it checked. If there are, I do a render :update do |page| block and add an insert_html with the new record on top.

Naturally you can increase or decrease the time between checks, but I figure 5-10 seconds should be good for anyone.

For my specific app, if I was scraping 5 items specifically, the periodically_call_remote function would turn itself off after I returned 5 records.
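
A rough sketch of the check such an action might perform, written as plain Ruby so it runs on its own. The names ScrapedRecord, new_records_for and poll_response are hypothetical, and using the highest record id as the "last time it checked" marker is an assumption (a timestamp would work too).

```ruby
# Hypothetical record; in a Rails app this would be a model row.
ScrapedRecord = Struct.new(:id, :user_id, :title)

# Records for one user with an id greater than the last one the page has seen.
def new_records_for(records, user_id, last_seen_id)
  records.select { |r| r.user_id == user_id && r.id > last_seen_id }
         .sort_by(&:id)
end

# What one poll could hand back: the fresh records, the new marker,
# and whether the page should keep polling.
def poll_response(records, user_id, last_seen_id, shown_so_far, expected_total)
  fresh = new_records_for(records, user_id, last_seen_id)
  {
    records: fresh,
    last_seen_id: fresh.empty? ? last_seen_id : fresh.last.id,
    # Stop once the expected number of records has been delivered.
    keep_polling: shown_so_far + fresh.size < expected_total
  }
end
```

With an expected total of 5, keep_polling turns false on the poll that delivers the fifth record, which is the point at which the periodic call would be switched off.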

Hi Jack,

How do you turn off a periodically_call_remote? Also, is it possible to call periodically_call_remote on some action?

Thanks, Sudhindra