obfuscated email not really obfuscated. but why not?

sol.manager · March 23, 2011, 4:39am

I have a page with an email address visible (so humans can print the page if necessary). I used the following code to obfuscate the email. When I view the page source in my browser it appears all is well, but I was told today by the SEO person at our web developer that the email address is not obfuscated on this page. He had a printout with the email address clearly visible after some obfuscated text.

So, is the following incorrect in some way I just can't see?

Email: <%= mail_to @post.employer.email, @post.employer.email, :encode => "javascript", :subject => 'request for information: '+ @post.title %><% end %>

Frederick_Cheung · March 23, 2011, 8:37am

What does the output look like if you view the HTML source in your browser?

Fred

sol.manager · March 23, 2011, 2:29pm

For example, a page that displays "Email: joe.public@gmail.com" had the following source code:

<li>Email: <script type="text/javascript">eval(unescape('%64%6f %63%75%6d%65%6e%74%2e%77%72%69%74%65%28%27%3c%61%20%68%72%65%66%3d %22%6d%61%69%6c%74%6f%3a%6a%6f%65%2e%70%75%62%6c%69%63%40%67%6d %61%69%6c%2e%63%6f%6d%3f%73%75%62%6a%65%63%74%3d%6a%6f %62%25%32%30%61%70%70%6c%69%63%61%6e%74%25%32%30%72%65%73%75%6d %65%25%32%30%66%6f%72%25%32%30%70%6f%73%74%25%32%30%6f%6e%25%32%30%6a %6f%62%66%69%6e%64%65%72%75%73%61%2e%63%6f%6d%25%33%41%25%32%30%53%6f %6c%75%74%69%6f%6e%73%25%32%30%41%73%73%69%73%74%61%6e%74%22%3e%6a%6f %65%2e%70%75%62%6c%69%63%40%67%6d%61%69%6c%2e%63%6f%6d%3c%2f%61%3e %27%29%3b'))</script></li>

To me this seems obfuscated, but the SEO person produced a printout that looked similar to the above, except that after the </script> and before the </li> it also showed: href="mailto:joe.public@gmail.com?subject=job %20application">joe.public@gmail.com

I didn't know if this was a difference in web browsers or how he was able to see this, but he did.
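
A minimal Ruby sketch of what the browser does with that script before writing the link into the page. It assumes only two-digit %XX escapes matter here, and js_unescape is a hypothetical helper name, not a Rails or Ruby API. The two fragments are copied from the source quoted above, stray spaces included.

```ruby
# Decode %XX escapes roughly the way JavaScript's unescape() handles
# two-digit hex escapes. Whitespace is stripped first because the
# quoted source was line-wrapped in transit.
def js_unescape(encoded)
  encoded.gsub(/\s+/, "").gsub(/%([0-9a-fA-F]{2})/) { Regexp.last_match(1).hex.chr }
end

# Fragments taken from the page source quoted above.
call_fragment  = "%64%6f %63%75%6d%65%6e%74%2e%77%72%69%74%65"
email_fragment = "%6a%6f%65%2e%70%75%62%6c%69%63%40%67%6d %61%69%6c%2e%63%6f%6d"

puts js_unescape(call_fragment)   # document.write
puts js_unescape(email_fragment)  # joe.public@gmail.com
```

So the raw source only ever contains the escaped text, while anything that shows the page after the script has run will show the decoded link.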

Bryan_Crossland · March 23, 2011, 2:57pm

That’s a good question. What browser and version did he produce that on?

B.

sol.manager · March 23, 2011, 3:33pm

The problem seems to be that he was using the Firebug add-on for Firefox and viewing the page in debug mode, so essentially he was seeing the "front" and the "back" at the same time. Robots don't crawl the front, they crawl the source. So in the end, I believe this was operator error and not incorrect obfuscation of the email.

Philip_Hallstrom · March 23, 2011, 4:09pm

This is also true if you use Safari/Chrome's developer inspector. A pure view source will show you the javascript mess. Inspecting the element will show you the result of the javascript call...

-philip

Michael_Pavling · March 24, 2011, 9:09am

Really, don't even bother. Firstly, you're wrong in your assertion that "Robots don't crawl the front, they crawl the source" - nice simple robots may well only look at the source. But it's well known that the big search engines can determine if sneaky JS or CSS methods have been used to stuff keywords into source, but hide them from view.

Secondly, you have no idea what *nasty* robots are doing - and I assume they're the ones you don't want getting the email addresses from your page (for spamming, etc). There's no reason to assume that robots don't view your whole site exactly as users do, and they may well ignore robots.txt files - in fact, a robots.txt file is the first thing I would look at if I wanted to know where the juicy stuff might be...

Just work under the premise that whatever works for your users will work for robots - if the user can click a mailto link, or read a legible email address, so can a robot, whatever obfuscation you've tried.
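
To make that concrete, here is a rough Ruby sketch of how little a script needs to do to undo this kind of encoding without running any JavaScript at all. Everything in it is an assumption for illustration: harvest_emails is a hypothetical name, the email regex is deliberately simplistic, and the sample page is built locally to mimic the eval(unescape('...')) shape shown earlier in the thread.

```ruby
# Hypothetical harvester: find unescape('...') payloads in raw HTML,
# decode the %XX escapes, and pull out anything that looks like an email.
def harvest_emails(html)
  email_pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/
  decoded = html.scan(/unescape\('([^']*)'\)/).flatten.map do |payload|
    payload.gsub(/\s+/, "").gsub(/%([0-9a-fA-F]{2})/) { Regexp.last_match(1).hex.chr }
  end
  decoded.flat_map { |text| text.scan(email_pattern) }.uniq
end

# Build a sample page by percent-encoding every byte of a document.write call.
link = %q{document.write('<a href="mailto:joe.public@gmail.com">joe.public@gmail.com</a>');}
encoded = link.bytes.map { |b| format("%%%02x", b) }.join
page = "<li>Email: <script type=\"text/javascript\">eval(unescape('#{encoded}'))</script></li>"

p harvest_emails(page)  # ["joe.public@gmail.com"]
```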

In fact, rather than foiling robots, your method discriminates against real users who don't have JS-enabled browsers.

If you *really* want to delay spammers, then render email addresses like "pavling(at)gmail(dot)com" - or some similar method that is deducible by humans, but unfamiliar enough to not be easily parsed by scripts (until loads of people use the method, and it's worth having the script look for matching patterns too...) - of course, users can no longer click-to-send, and I don't think it's worth the hassle.
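
A minimal Ruby sketch of that rendering, plus the one-liner a script would need once the pattern becomes common. Both method names are hypothetical, not Rails helpers.

```ruby
# Hypothetical helper: render an address in a human-readable,
# non-clickable form such as "pavling(at)gmail(dot)com".
def humanize_email(address)
  address.sub("@", "(at)").gsub(".", "(dot)")
end

# What a pattern-aware script could do to reverse it.
def dehumanize_email(text)
  text.gsub("(at)", "@").gsub("(dot)", ".")
end

puts humanize_email("pavling@gmail.com")           # pavling(at)gmail(dot)com
puts dehumanize_email("pavling(at)gmail(dot)com")  # pavling@gmail.com
```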

Life's too short - use a good spam filter, and don't worry about it.
