How to search and replace all URLs in an HTML string using Ruby gsub

Hi,

I am trying to search and replace all URLs in an HTML string using gsub.

CODE

html = "<a href='http://site.com.br'><img src='http://site3.com/ image.jpg'></a><a href='http://newx.com.br'><img src='http://localhost/ imagem.jpg'></a>";

pattern = /<a href=[\'"]?([^\'"> ]*)[\'"]?[^>]*>(.*?)<\/a>/mo

replace = "<a href='http://mysite.com/redirect/#{$1}'>#{$2}</a>"

html_output = html.gsub(pattern,replace)

The regex pattern is apparently working, but I'm not getting the values of $1 and $2. When I use \\1 and \\2 it works. The thing is, I need to encode the $1 value, like this:

replace = "<a href='http://mysite.com/redirect/#{Base64.encode64($1)}'>#{$2}</a>"

and I am not able to encode \\1.

Can anyone help me?
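
For context, the difference between the two forms can be reproduced with a short sketch. It assumes a simplified version of the HTML above (plain link text instead of the img tags), and the method name compare_replacements is made up for illustration. A double-quoted string is interpolated once, before gsub is called, so $1 and $2 are still nil at that point, whereas \\1 and \\2 are backreferences that gsub fills in for each match.

```ruby
# Hypothetical helper, for illustration only: returns the result of both
# replacement styles so they can be compared side by side.
def compare_replacements(html)
  pattern = /<a href=['"]?([^'"> ]*)['"]?[^>]*>(.*?)<\/a>/m

  # "#{$1}" is evaluated right here, before any match has happened in this
  # method, so both interpolations produce empty strings.
  interpolated = "<a href='http://mysite.com/redirect/#{$1}'>#{$2}</a>"

  # "\\1" and "\\2" stay in the string as backreferences; gsub substitutes
  # the captured groups for every match.
  backrefs = "<a href='http://mysite.com/redirect/\\1'>\\2</a>"

  [html.gsub(pattern, interpolated), html.gsub(pattern, backrefs)]
end

html = "<a href='http://site.com.br'>one</a><a href='http://newx.com.br'>two</a>"
with_interpolation, with_backrefs = compare_replacements(html)
puts with_interpolation
puts with_backrefs
```

Because a backreference is only expanded inside gsub, there is no point at which Base64.encode64 could be applied to \\1 in a plain replacement string.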

Hi Siddick,

Thanks a lot for your response, but it didn't work. html_output has the same content as the html variable.

Newton

Based on your solution and some searching on the internet, I tried the following approach, and it worked this time.

pattern5 = /<a href=[\'"]?([^\'"> ]*)[\'"]?[^>]*>(.*?)<\/a>/o

require 'base64'

html.gsub!(pattern5) do |n|
  "<a href='#{Base64.encode64($1)}'>#{$2}</a>"
end

puts html

I am now able to use $1 and $2.

Thanks a lot.
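
The block form works because the block runs once per match, after the match has been made, so $1 and $2 are set when the string is built. One thing to watch: Base64.encode64 appends a newline to its output (and inserts one every 60 encoded characters), which ends up inside the href. A minimal sketch of a variant that avoids this, assuming the redirect prefix from my first attempt is still wanted; rewrite_links and LINK_PATTERN are names made up for this example:

```ruby
require 'base64'

LINK_PATTERN = /<a href=['"]?([^'"> ]*)['"]?[^>]*>(.*?)<\/a>/m

# Hypothetical helper: rewrites every link so it points at a redirect URL
# carrying the Base64-encoded original address.
def rewrite_links(html, redirect_base = "http://mysite.com/redirect/")
  html.gsub(LINK_PATTERN) do
    # urlsafe_encode64 emits no newlines and uses '-' and '_' instead of
    # '+' and '/', so the token can sit in a URL path.
    encoded = Base64.urlsafe_encode64($1)
    "<a href='#{redirect_base}#{encoded}'>#{$2}</a>"
  end
end

html = "<a href='http://site.com.br'>one</a><a href='http://newx.com.br'>two</a>"
puts rewrite_links(html)
```

The redirect side would then recover the original URL with Base64.urlsafe_decode64.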

Newton Garcia Newx - Soluções para Internet newx@newx.com.br www.newx.com.br