JSON::ParserError in controller

Hi all, I'm trying to build an application that needs to scrape information from a web page. When I run the action, I get an error while trying to convert the HTML data to JSON. Has anyone experienced this before, and if so, can you please tell me how to solve it? The code snippet and error log are below.

Thanks in advance, Anush

require 'net/http'
require 'open-uri'
require 'uri'
require 'json'
require 'pp'

class Merchant < ActiveRecord::Base

  def self.grab_original_content
    ## EXAMPLE USING ZED451.COM
    uri = URI("http://www.zed451.com")
    response = Net::HTTP.get_response(uri)
    @hash = JSON(response.body)
    puts "#{@hash}"
  end

end

I call the above method in my controller and pass @hash to the view. In my browser I see the following error:

JSON::ParserError in Original contentController#index

706: unexpected token at '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

The rest of the page is then printed as HTML, without any further error.

Hi Jordon, thanks for your response. I thought JSON(response.body) performed the HTML -> JSON conversion. I also tried response.body.to_json, which gave me the same error. It would be great if you could explain a bit. Meanwhile, I will also try using Nokogiri.

Thanks Anush

Dheeraj Kumar wrote in post #1091355:

You cannot convert HTML to JSON and vice versa. HTML is a markup language, while JSON is a data interchange format.
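To make that concrete, here is a minimal sketch of what happens when HTML is handed to the JSON parser. The sample strings and the parse_json_or_nil helper are made up for illustration and are not from the original code.

```ruby
require 'json'

# Hypothetical stand-in for response.body from an HTML page.
html = '<!DOCTYPE html><html><body><p>Hello</p></body></html>'

# Illustrative helper: returns the parsed data, or nil when the
# text is not valid JSON.
def parse_json_or_nil(text)
  JSON.parse(text)
rescue JSON::ParserError
  nil
end

parsed_html = parse_json_or_nil(html)                  # HTML is not JSON
parsed_json = parse_json_or_nil('{"name": "zed451"}')  # real JSON parses

# Calling to_json on a String only wraps the whole text in a JSON
# string literal; it does not pull any structure out of the markup.
encoded = html.to_json
```

JSON.parse and the JSON(...) shortcut both expect text that is already JSON, so the leading "<" of the doctype is reported as an unexpected token.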

You need to parse your HTML with Nokogiri or Hpricot, extract whatever data you want from it and put it in a Hash, then call .to_json on it to get the JSON response.

-- Dheeraj Kumar

Hi Dheeraj, Ahh..I see. Got it now. Thanks, helps a lot in understanding.

Thanks Anush