Catch bad image URLs in user-provided content

Hello all,

We currently run a membership website application where our customers
create membership-based websites secured by a login. We allow our
customers to create pages for which they provide the HTML. Sometimes
they put incorrect URLs in that HTML (mostly images, e.g.
/images/blabla.gif), and those requests end up hitting our server.
Rails has no route for them, so it throws a "No route" error. With
images, these errors only show up in the error logs and are not shown
to the end user.

I'm wondering if there is a good way to either:

1. Catch the bad image URL when a customer submits the HTML for their
page

--OR--

2. Catch the error so it isn't logged.

Also, they sometimes link to JavaScript files, and those URLs can be
incorrect too, so it wouldn't only be images. And is there a way to
tell different kinds of "No route" errors apart?

Thanks.

1. Catch the image url error as a customer provides the html for their
page

Scan the HTML when submitted using Nokogiri and look for bad SRC attributes in the IMG tags...

-philip

Great! So with Nokogiri I can get the src attribute of the image/script
tag, but how do I test that the actual file/link exists?

Should I do something similar to this?

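require 'net/http'
require 'uri'

# mylinks: the img/script nodes found by Nokogiri
mylinks.each do |link|
  u = URI.parse(link['src'])
  # A HEAD request fetches only the headers, which is enough for the status
  status_code = Net::HTTP.start(u.host, u.port, use_ssl: u.scheme == 'https') do |http|
    http.head(u.request_uri).code
  end
  if status_code != '200'
    # add error
  end
end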

Maybe. Could be slow. Also, that would flag valid 301/302 redirects as errors. I've also come across servers that freak when tested via Ruby, but work fine via a real browser...

But yeah, if you have to make sure it's valid that's the way.
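
A rough sketch of a check that addresses two of the caveats above: it uses short timeouts so a slow host cannot stall the save, and it treats 3xx redirects as acceptable. The helper names and the 5-second timeout are assumptions, and it does nothing about servers that reject non-browser clients.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: 2xx and 3xx status codes both count as "exists".
def acceptable_status?(code)
  (200..399).cover?(code.to_i)
end

# Hypothetical helper: true if a HEAD request to an absolute http(s) URL
# returns an acceptable status. Relative URLs and network failures give false.
def reachable?(url, timeout: 5)
  uri = URI.parse(url)
  return false unless uri.is_a?(URI::HTTP) && uri.host

  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https',
                  open_timeout: timeout, read_timeout: timeout) do |http|
    acceptable_status?(http.head(uri.request_uri).code)
  end
rescue URI::InvalidURIError, SocketError, SystemCallError,
       Timeout::Error, OpenSSL::SSL::SSLError
  false
end
```

Calling reachable? on each extracted src still costs one request per URL, so it may be worth running outside the request cycle.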

Would be easy to check to see if any are relative or internally inconsistent and whine about those...

Yeah, I'll probably just check whether they are referencing an
internal/root URL and complain only about those.

Thanks Philip!
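
For reference, the check settled on above (complain only about relative or internal URLs) could look roughly like this. It needs no network access. OWN_HOSTS and the helper name are placeholders; the real list would hold whatever hostnames the application is served from.

```ruby
require 'uri'

# Assumption: the hostnames our own application answers on.
OWN_HOSTS = ['www.example.com'].freeze

# Hypothetical helper: true if src would be requested from our own server,
# i.e. it is a relative/root path or names one of our hosts.
def internal_or_relative?(src, own_hosts = OWN_HOSTS)
  uri = URI.parse(src.strip)
  return true if uri.scheme.nil? && uri.host.nil? # e.g. /images/blabla.gif
  return false if uri.host.nil?                   # e.g. data: URIs

  own_hosts.include?(uri.host.downcase)
rescue URI::InvalidURIError
  true # unparseable, so flag it for the customer to review
end

['/images/blabla.gif', 'http://other.example.org/logo.gif'].each do |src|
  puts "#{src}: #{internal_or_relative?(src) ? 'complain' : 'ok'}"
end
```

Unparseable values are flagged rather than ignored, on the assumption that the customer would want to hear about them too.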