How do I scrape dynamic content from a Struts framework site with Ruby?

How do I scrape dynamic content from a Struts framework site with Ruby?

Same as any web source: send a request, parse the response. Is
there some particular issue you're encountering?

So I just use browser.execute_script and pass in the full https path and query string, just as it appears in the Name column in Chrome -> Dev Tools?

browser.execute_script('https://gpsfront.sitename.com/getI2iRecommendingResults.do?callback=jQuery18307882644047005491_1545806199753&currentItemList=32819755026&categoryId=200001521&shopId=2339135&companyId=238468932&recommendType=&scenario=pcDetailLeftTopSell&limit=6&offset=0&_=1545806304149')

Selenium::WebDriver::Error::UnknownError: unknown error: Runtime.evaluate threw exception: SyntaxError: Unexpected end of input

(Session info: chrome=71.0.3578.80)

(Driver info: chromedriver=2.42.591071 (0b695ff80972cc1a65a5cd643186d2ae582cd4ac),platform=Linux 4.15.0-43-generic x86_64)

Maybe I need to run it without .do extension and also mayb

You are passing a URL instead of a script.
The execute_script function is meant to execute JavaScript code, not to load a page.

You have to visit the page using the browser driver.

If you're using Capybara, you should use the visit function, passing the URL you want to go to.

If not, you have to look up which command your driver provides for visiting URLs.
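To illustrate the difference between navigating and running a script, here is a minimal sketch using the selenium-webdriver gem directly. It assumes that gem plus a local Chrome and chromedriver are installed; `fetch_rendered_html` is a hypothetical helper name, not something from this thread.

```ruby
# Sketch only: assumes the selenium-webdriver gem and a local Chrome/chromedriver.
# fetch_rendered_html is a hypothetical helper name.
def fetch_rendered_html(url)
  require 'selenium-webdriver' # loaded lazily, only when the helper is called

  driver = Selenium::WebDriver.for(:chrome)
  begin
    # Navigation is how a URL gets loaded.
    driver.navigate.to(url)
    # execute_script expects JavaScript source, not a URL.
    title = driver.execute_script('return document.title')
    { title: title, html: driver.page_source }
  ensure
    driver.quit # always close the browser, even if something raised
  end
end
```

Calling `fetch_rendered_html('https://example.com')` should return the page title and the HTML after the browser has run the page's JavaScript.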

Must I use the URL?

Yes, if you are using Capybara, you can use `visit 'http://myurl.com/goes-here'`

This is the first time I'm hearing Capybara recommended for web scraping. Is this the preferred method for what I'm trying to do?

Capybara provides a friendly interface to your web drivers; you can integrate it with Selenium, WebKit, Poltergeist, and others.

Try to use it, I think you will like it.

https://github.com/teamcapybara/capybara

I need to work in the Rails console, and when I run visit ... Rails complains that there is no matching route.

You have to include Capybara::DSL:

include Capybara::DSL

require 'capybara/rails'

include Capybara::DSL

No change. I still get the same routing error: ActionController::RoutingError (No route matches [GET] "/getI2iRecommendingResults.do"):
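A likely explanation for the routing error: `capybara/rails` points Capybara at the local Rails app, and the default rack_test driver sends every request to that app no matter which host is in the URL. A minimal sketch of a session aimed at a remote site follows. It assumes the capybara and selenium-webdriver gems plus Chrome, and a Capybara 3.x version where the `:selenium_chrome_headless` driver is registered by default; `remote_page_html` is a hypothetical helper name.

```ruby
# Sketch only: assumes the capybara and selenium-webdriver gems plus Chrome.
# remote_page_html is a hypothetical helper name.
def remote_page_html(url)
  require 'capybara' # plain Capybara, without capybara/rails

  Capybara.run_server = false # do not boot a local Rack/Rails app
  session = Capybara::Session.new(:selenium_chrome_headless)
  begin
    session.visit(url) # a real browser loads the remote URL
    session.html       # the DOM serialized after JavaScript has run
  ensure
    session.driver.quit
  end
end
```

Creating an explicit `Capybara::Session` also avoids mixing the DSL methods into the console's top-level object.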

You'll probably get better answers if you show your work. Try writing a single script that demonstrates what you want to do, and post it as a Gist. Link it here, show what the output looks like, and see where that leads you. Oftentimes, working within the constraints of making the example work in a single script forces you to reconsider the problem, or shows you a simple error you made while configuring something more complex.

Walter

Which params are you using for visit?

visit 'https://gpsfront.sitename.com/getI2iRecommendingResults.do?callback=jQuery18307882644047005491_1545806199753&currentItemList=32819755026&categoryId=200001521&shopId=2339135&companyId=238468932&recommendType=&scenario=pcDetailLeftTopSell&limit=6&offset=0&_=1545806304149'

I'd like to write a script once I get my commands down; for now I'm working in the Rails console. It may not be the intended use of Capybara to visit URLs that are not defined in routes.rb. Should I be using something else to make the request?

> --
> You received this message because you are subscribed to a topic in the Google Groups “Ruby on Rails: Talk” group.

Try using an HTTP client library, like Faraday.

# in your Gemfile
gem 'faraday'

# in your script or console
require 'faraday'
response = Faraday.get('https://entire.url.of/your/api/data.json')

whatever_parsing_tool_you_want.parse(response.body)

Walter
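As a variation on the same idea, the long query string can be built from a hash instead of being pasted in by hand. The sketch below swaps in Ruby's standard-library Net::HTTP for Faraday so that it runs without extra gems. `build_uri` and `fetch_body` are hypothetical helper names, and only a subset of the parameters from the thread is shown; whether the endpoint accepts a reduced set is an assumption that has not been verified.

```ruby
require 'net/http'
require 'uri'

# Build the request URI from a base URL and a hash of query parameters.
def build_uri(base, params)
  uri = URI.parse(base)
  uri.query = URI.encode_www_form(params) # handles escaping and joining with &
  uri
end

# Perform the GET request and return the raw response body.
# Defined for completeness; it is not called in this sketch.
def fetch_body(uri)
  Net::HTTP.get_response(uri).body
end

uri = build_uri(
  'https://gpsfront.sitename.com/getI2iRecommendingResults.do',
  { currentItemList: 32819755026, shopId: 2339135, limit: 6, offset: 0 }
)
puts uri
```

Keeping the parameters in a hash makes it easier to change `limit` or `offset` when paging through results.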

Thanks, that works. The returned data is delimited as "name":value,…

#(Text "/**/jQuery18307882644047005491_1545806199753({"success":true,"code":0,"results":[{"productId":32617749905,"sellerId":228628782,"oriMinPrice":"US $363.00","oriMaxPrice"…

Am I gonna have to regex my way through it?

I used Nokogiri::HTML.parse(response.body). It isn't converting the JavaScript response into something more friendly.

You won't get a friendly, parsed JavaScript response. The JavaScript is not the information itself; it is a set of commands that manipulate the browser's DOM. That's why we use a driver to get this information.

If you make the request and parse the result yourself, you get the raw HTML with the JavaScript code in it. If you use a driver, it gets the response and executes the loaded JavaScript. This is the key.
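For the specific response excerpt quoted earlier in the thread, which looks like JSONP (a JSON payload wrapped in a callback call), a lighter option may be to strip the wrapper and hand the rest to Ruby's standard JSON parser. This is a sketch under that assumption. `parse_jsonp` is a hypothetical helper name, and the sample string is a shortened, made-up stand-in for the real response.

```ruby
require 'json'

# Strip a JSONP wrapper such as `/**/callbackName({...});` and parse the JSON inside.
def parse_jsonp(body)
  # Capture everything between the first "(" and the last ")".
  json = body[/\A[^(]*\((.*)\)\s*;?\s*\z/m, 1]
  raise ArgumentError, 'response does not look like JSONP' unless json

  JSON.parse(json)
end

# Hypothetical sample, shortened from the shape of the response in the thread.
sample = '/**/jQuery123_456({"success":true,"code":0,"results":[{"productId":32617749905}]})'
data = parse_jsonp(sample)
puts data['results'].first['productId']
```

The result is an ordinary Ruby hash, so fields such as `data['results']` can be read directly. Pages that build their content in the DOM with JavaScript would still need a driver, as described above.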