problem scraping using nokogiri - getting wrong characters

Hi all,

I am scraping a table off of another site and inserting it onto my
site. You can see an example on the initial page at:
I'm referring to the green box with the Snowbird weather and snowfall.

This box has been scraped off of the Snowbird site at:

The problem is that on the snowbird site it has degree symbols (°) but
on my page it shows up as: (�)

I think it has something to do with the encoding, but I'm pretty new to
HTML etc. and am not sure what I can do to fix this. I've tried
substituting the characters and some other things but haven't had any
success yet.

Any ideas?




I opened the HTML source of the snowreport.php page and noticed that the strange symbols you mentioned are HTML-encoded characters. The symbol is °

I had a similar problem last Monday, but I couldn't completely solve it.

Try the lib:

or use a regular expression (sub, gsub) to replace the encoded entity with the degree symbol.
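
For reference, the gsub approach suggested here might look roughly like the sketch below. The sample string and the entity spellings (`&deg;`, `&#176;`, `&#xB0;`) are assumptions for illustration; the thread does not show the actual markup of the scraped page.

```ruby
# Hypothetical scraped fragment; the real markup is not shown in the thread.
scraped = "High: 25&deg;F, Low: 12&#176;F"

# Common spellings of the degree entity (named, decimal, hexadecimal).
degree_entities = /&deg;|&#176;|&#xB0;/i

# Replace every match with the literal degree sign (U+00B0).
cleaned = scraped.gsub(degree_entities, "\u00B0")
puts cleaned
```

This only helps if the page really contains entities. If the raw bytes are in a different character encoding, substitution will not match anything, which may be why it did not work in this case.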



I tried that, but it didn't work for me. What did work was explicitly
setting the encoding property in Nokogiri:

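    require 'nokogiri'
    require 'open-uri'

    url = '...'  # URL omitted in the original post
    page = Nokogiri::HTML(URI.open(url))  # plain open(url) on older Rubies
    page.encoding = 'utf-8'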

Worked great after that!
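
A possible explanation for the (�) character, sketched in plain Ruby without Nokogiri: if the source page is served in ISO-8859-1 (an assumption, the thread does not confirm it), the degree sign is the single byte 0xB0, which is not valid on its own in UTF-8. The sample string below is made up for illustration.

```ruby
# The degree sign as a single ISO-8859-1 byte (0xB0), kept as raw bytes.
latin1_bytes = "25\xB0F".b

# Treating those bytes as UTF-8 produces an invalid string, which
# browsers typically display as the replacement character.
mislabeled = latin1_bytes.dup.force_encoding("UTF-8")
puts mislabeled.valid_encoding?

# Labeling the bytes with their assumed real encoding and then
# transcoding yields a proper UTF-8 degree sign.
fixed = latin1_bytes.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts fixed
puts fixed.valid_encoding?
```

Setting the encoding on the Nokogiri document presumably has a similar effect, since the serialized output is then written as UTF-8.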