I have a page with an email address visible (so humans can print the
page if necessary). I used the following code to obfuscate the
email. When I view the page source in my browser it appears all is
well, but I was told today by the SEO person at our web developer that
the email address is not obfuscated on this page. He had a printout
with the email address clearly visible after some obfuscated text.
So, is the following incorrect in some way I just can't see?
Email: <%= mail_to @post.employer.email, @post.employer.email,
  :encode => "javascript",
  :subject => 'request for information: ' + @post.title %><% end %>
What does the output look like if you view the HTML source in your browser?
Fred
For example, where the web page shows "Email: joe.public@gmail.com",
the source code is:
<li>Email: <script type="text/javascript">eval(unescape('%64%6f
%63%75%6d%65%6e%74%2e%77%72%69%74%65%28%27%3c%61%20%68%72%65%66%3d
%22%6d%61%69%6c%74%6f%3a%6a%6f%65%2e%70%75%62%6c%69%63%40%67%6d
%61%69%6c%2e%63%6f%6d%3f%73%75%62%6a%65%63%74%3d%6a%6f
%62%25%32%30%61%70%70%6c%69%63%61%6e%74%25%32%30%72%65%73%75%6d
%65%25%32%30%66%6f%72%25%32%30%70%6f%73%74%25%32%30%6f%6e%25%32%30%6a
%6f%62%66%69%6e%64%65%72%75%73%61%2e%63%6f%6d%25%33%41%25%32%30%53%6f
%6c%75%74%69%6f%6e%73%25%32%30%41%73%73%69%73%74%61%6e%74%22%3e%6a%6f
%65%2e%70%75%62%6c%69%63%40%67%6d%61%69%6c%2e%63%6f%6d%3c%2f%61%3e
%27%29%3b'))</script></li>
To me this seems obfuscated, but the SEO person produced a printout
that showed something similar to the above, followed by the plain
address. After the </script> and before the </li>, his printout had:
href="mailto:joe.public@gmail.com?subject=job
%20application">joe.public@gmail.com
I didn't know if this was a difference in web browsers or how he was
able to see this, but he did.
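For reference, the escaped string is only percent-encoded text, so it can be decoded outside the browser to see what the script writes. A minimal Ruby sketch, using two fragments copied from the source above; the variable names are illustrative, and URI.decode_www_form_component is used here as a stand-in for the browser's unescape():

```ruby
require "uri"

# Two fragments copied from the escaped string in the page source above.
escaped_call    = "%64%6f%63%75%6d%65%6e%74%2e%77%72%69%74%65"
escaped_address = "%6a%6f%65%2e%70%75%62%6c%69%63%40%67%6d%61%69%6c%2e%63%6f%6d"

# Each %XX pair is turned back into the character with that hex code.
# (Unlike JavaScript's unescape, this call also turns "+" into a space,
# which does not matter for these fragments.)
decoded_call    = URI.decode_www_form_component(escaped_call)
decoded_address = URI.decode_www_form_component(escaped_address)

puts decoded_call     # document.write
puts decoded_address  # joe.public@gmail.com
```

So the script calls document.write with an ordinary mailto link, which is consistent with the plain markup that appeared on the printout.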
That’s a good question. What browser and version did he produce that on?
B.
The problem seems to be that he was using the Firebug add-on for Firefox
and was viewing the page in debug mode, so essentially he was seeing
the "front" and the "back" at the same time. Robots don't crawl the
front, they crawl the source. So in the end, I believe this was
operator error and not incorrect obfuscation of an email.
This is also true if you use the Safari/Chrome developer inspector. A plain "view source" will show you the JavaScript mess. Inspecting the element will show you the result of the JavaScript call...
-philip
Really, don't even bother.
Firstly, you're wrong in your assertion that "Robots don't crawl the
front, they crawl the source" - nice simple robots may well only look
at the source. But it's well known that the big search engines can
determine if sneaky JS or CSS methods have been used to stuff keywords
into source, but hide them from view.
Secondly, you have no idea what *nasty* robots are doing - and I
assume they're the ones you don't want getting the email addresses
from your page (for spamming, etc). There's no reason to assume
that robots don't view your whole site exactly as users do, including
ignoring robots.txt files - in fact, a robots.txt file is the first
thing I would look at if I want to know where the juicy stuff might
be...
Just work under the premise that whatever works for your users will
work for robots - if the user can click a mailto link, or read a
legible email address, so can a robot, whatever obfuscation you've
tried.
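To make that concrete, here is a rough sketch of how little work a robot would need to do to get past the percent-encoding. Everything in it is hypothetical: harvest_emails and EMAIL_PATTERN are names made up for illustration, the regex is deliberately loose, and I have no idea what real harvesters actually run.

```ruby
require "uri"

# Deliberately loose pattern for things that look like email addresses.
EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/

# Hypothetical harvester: scans the raw source, then scans it again after
# percent-decoding. Assumes every "%" in the input starts a valid %XX
# escape; URI.decode_www_form_component raises ArgumentError otherwise.
def harvest_emails(page_source)
  decoded = URI.decode_www_form_component(page_source)
  (page_source.scan(EMAIL_PATTERN) + decoded.scan(EMAIL_PATTERN)).uniq
end

# Shortened stand-in for the kind of script tag shown earlier in the thread.
source = "<script>eval(unescape('%6a%6f%65%2e%70%75%62%6c%69%63%40%67%6d%61%69%6c%2e%63%6f%6d'))</script>"

p harvest_emails(source)  # ["joe.public@gmail.com"]
```

The raw scan finds nothing because the "@" is escaped, but one decoding pass is enough to expose the address.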
In fact, rather than foiling robots, your method discriminates against
real users who don't have JS-enabled browsers.
If you *really* want to delay spammers, then render email addresses
like "pavling(at)gmail(dot)com" - or some similar method that is
deducible by humans, but unfamiliar enough to not be easily parsed by
scripts (until loads of people use the method, and it's worth having
the script look for matching patterns too...) - of course, users can
no longer click-to-send, and I don't think it's worth the hassle.
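If you did want to try it, the rendering itself is trivial. A minimal sketch, assuming a helper you would write yourself (humanize_email is a made-up name, not something Rails provides):

```ruby
# Hypothetical helper (not part of Rails): renders an address in the
# "(at)" / "(dot)" style so it stays readable but is not a literal address.
def humanize_email(address)
  address.gsub("@", "(at)").gsub(".", "(dot)")
end

puts humanize_email("joe.public@gmail.com")  # joe(dot)public(at)gmail(dot)com
```

In a view this would replace the mail_to call with plain text, for example <%= humanize_email(@post.employer.email) %>, so there is no link to click.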
Life's too short - use a good spam filter, and don't worry about it.