Parsing html files => putting them in fixtures for testing

I'm using the Hpricot parser to scrape web pages. I saved two of these pages for testing and, for lack of a better way, put the HTML files in the fixtures like this:

dl_found_tickets:
  html: "<%= File.read('test/fixtures/html/search_dl_found_tickets.html').gsub('"', '\"') %>"
  [...]

Even though the crawl class works fine, the test fails, so something must be wrong with the fixture. I wrote a test that compares the HTML string loaded from the fixture with the one read directly from the file, and they're not the same. From what I can tell, the only difference is whitespace, from some end-of-line conversions, I guess.

This test fails:

  def test_html_fixtures
    assert_equal File.read('test/fixtures/html/search_plate_found_ticket.html').slice(0, 250),
                 crawls(:dl_found_tickets).html.slice(0, 250)
  end

  1) Failure:
test_html_fixtures(CrawlTest)
    [test/unit/crawl_test.rb:16:in `test_html_fixtures'
     /usr/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/testing/setup_and_teardown.rb:60:in `__send__'
     /usr/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/testing/setup_and_teardown.rb:60:in `run']:
<"<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0//EN\">\r\n<!--Bean tags and additional tags for use in this page.-->\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n<!--End of bean tags and additional tags for use in this page.-->\r\n<html>\r\n<head>\r\n<link rel=\"stylesheet\" type=\"text/css\" h"> expected but was
<"<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0//EN\"> <!--Bean tags and additional tags for use in this page.-->\n\n\n\n\n\n\n\n<!--End of bean tags and additional tags for use in this page.--> <html> <head> <link rel=\"stylesheet\" type=\"text/css\" href=\"css/style">.

You are right. It looks like the difference is introduced when your YAML file is read: the whitespace and carriage returns evidently get altered along the way. But you could just gsub them.
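For what it's worth, the gsub idea could look something like the sketch below. It is only a minimal illustration: `normalize_ws` is a made-up helper name, and the two sample strings are shortened stand-ins for the file contents and the fixture value, not the real pages. It collapses every run of whitespace to a single space, so it only helps when comparing strings; it does not bring back the original line endings.

```ruby
# Hypothetical helper: collapse every run of whitespace (spaces, \r, \n, tabs)
# into a single space so that line-ending differences no longer matter.
def normalize_ws(html)
  html.gsub(/\s+/, ' ').strip
end

# Shortened stand-ins for the two strings from the failing test.
FROM_FILE    = "<html>\r\n<head>\r\n\r\n</head>"
FROM_FIXTURE = "<html> <head>\n</head>"

puts normalize_ws(FROM_FILE) == normalize_ws(FROM_FIXTURE)
```

If the parser output depends on the exact whitespace, this kind of normalization would hide the difference in the test rather than remove it.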

But other than that, one question: why do you want to save an HTML string in a YAML file (and not just read your HTML file whenever you need it)?
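Reading the file directly could be as small as the helper sketched below. The name `html_fixture` and its directory argument are assumptions for illustration (in the real test the directory would be something like test/fixtures/html); the temporary directory is only there so the sketch runs on its own. The file is opened in binary mode so the \r\n line endings come back untouched.

```ruby
require 'tmpdir'

# Sample page content with CRLF line endings, standing in for a saved page.
SAMPLE_HTML = "<html>\r\n<head>\r\n</head>\r\n</html>\r\n"

# Hypothetical helper: read <dir>/<name>.html byte for byte.
def html_fixture(dir, name)
  File.open(File.join(dir, "#{name}.html"), 'rb') { |f| f.read }
end

# Write the sample to a temporary directory and read it back unchanged.
def demo_round_trip
  ok = false
  Dir.mktmpdir do |dir|
    File.open(File.join(dir, 'sample.html'), 'wb') { |f| f.write(SAMPLE_HTML) }
    ok = html_fixture(dir, 'sample') == SAMPLE_HTML
  end
  ok
end

puts demo_round_trip
```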

> You are right. It looks like the difference is introduced when your YAML file is read: the whitespace and carriage returns evidently get altered along the way. But you could just gsub them.

It may not be just the whitespace, because the parser gives different results on the YAML string and the File.read string. Isn't there a function to quote YAML strings? I searched for one and could not find it.
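A possible explanation, offered as a sketch rather than a confirmed diagnosis: inside a double-quoted YAML scalar, a raw line break is folded into a space and a blank line into a single newline, which matches the pattern in the failure output above. One way to quote the string would be to emit it as a JSON string literal, since that escapes quotes and line breaks and is also a valid double-quoted YAML scalar. The helper names below (`folded_example`, `fixture_line`, `round_trip`) are made up, the sample HTML is a stand-in, and this uses the yaml and json libraries of a current Ruby; it is untested against Ruby 1.8 and Rails 2.2.2.

```ruby
require 'yaml'
require 'json'

# Sample HTML with CRLF line endings and a blank line (stand-in for a saved page).
RAW_HTML = "<link rel=\"stylesheet\">\r\n<html>\r\n\r\n<head>\r\n"

# Raw line breaks inside a double-quoted scalar are folded by the YAML parser:
# a single break becomes a space, and the blank line survives as one "\n".
def folded_example
  YAML.load("html: \"line one\r\n  line two\r\n\r\n  line three\"\n")['html']
end

# Hypothetical helper: write the value as a JSON string literal, so that
# quotes, \r and \n are escaped instead of appearing raw in the YAML.
def fixture_line(html)
  "html: #{html.to_json}\n"
end

# Load the generated line back through the YAML parser.
def round_trip(html)
  YAML.load(fixture_line(html))['html']
end

puts folded_example.inspect
puts round_trip(RAW_HTML) == RAW_HTML
```

In the fixture itself this would presumably turn the value into something like `html: <%= File.read(path).to_json %>` without the surrounding quotes and without the gsub, where `path` stands for the HTML file's path.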

> But other than that, one question: why do you want to save an HTML string in a YAML file (and not just read your HTML file whenever you need it)?

I like it this way because I just load the object from the fixture in my tests.