Accessing dynamic JavaScript with Ruby

Hey all

I'm experimenting with writing a scraper at the moment and have hit a major hump.

Part of the DOM is added by JavaScript after the page has loaded.

This means that when I make a request, the HTML response I receive back doesn't accurately represent the page.

Here's a simplified example:

require 'net/http'

@http_obj = Net::HTTP.new("targetdomain.com")

# request_get returns one response object; the HTML is in its body
response = @http_obj.request_get("/")
page_data = response.body

# page_data doesn't contain all of the HTML that is actually shown

Is there any library or gem that could simulate the browser updating the DOM with the JavaScript, or any other way I could approach this, short of decoding the obfuscated JavaScript file?

Thanks in advance

Gav

Gavin Morrice wrote:

Is there any library or gem that could simulate the browser updating the DOM with the JavaScript, or any other way I could approach this, short of decoding the obfuscated JavaScript file?

Try Selenium or some other remote browser control.
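
For what it's worth, here is a rough sketch of the Selenium route, not a tested recipe for your site. It assumes the selenium-webdriver gem and a local Chrome install. The helper names (build_driver, rendered_html) and the "#content" selector are made up for illustration; substitute whatever element the page's script actually inserts.

```ruby
# Sketch only: needs the selenium-webdriver gem plus a local Chrome install.
# build_driver and rendered_html are hypothetical helper names.

# Start a headless Chrome session. The gem is required lazily, so
# rendered_html can also be handed any object with the same methods.
def build_driver
  require "selenium-webdriver"
  options = Selenium::WebDriver::Chrome::Options.new
  options.add_argument("--headless")
  Selenium::WebDriver.for(:chrome, options: options)
end

# Load the page in a real browser, poll until the script-inserted
# element exists, then return the DOM as the browser currently sees it.
def rendered_html(url, selector, driver: nil, timeout: 10)
  driver ||= build_driver
  driver.get(url)
  deadline = Time.now + timeout
  while driver.find_elements(css: selector).empty?
    raise "timed out waiting for #{selector}" if Time.now > deadline
    sleep 0.2
  end
  driver.page_source
ensure
  # Always close the browser, even if the wait timed out.
  driver&.quit
end
```

You would call it as rendered_html("http://targetdomain.com/", "#content") and hand the returned string to whatever HTML parser you already use. It is much slower than Net::HTTP because a full browser starts for each call, so reuse one driver if you have many pages to fetch.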

Best,