XHTML vs. HTML4

I was asked today about tag helpers automatically outputting XHTML, and whether there is a setting to make them not do this. The reason? The web developer prefers HTML4 to XHTML. So, is there a Rails setting for this, or would he have to override every method? I believe that the browser will parse it no matter what, but he wants it to be “100% compliant”.

http://agilewebdevelopment.com/plugins/html4ify

Installing this plugin will make all tags default to the non-closed tags HTML4 uses. Although I’d like to know why a web developer would ever prefer HTML4 over XHTML, developers usually prefer a clearly structured and well defined set of rules instead of something where rules are so forgiving one might wonder if it can still be called a set of rules.

Best regards

Peter De Berdt

Some people prefer ugliness to beauty.

Many kudos.

<...>

Although I'd like to know why a web developer would ever prefer HTML4 over XHTML, developers usually prefer a clearly structured and well defined set of rules instead of something where rules are so forgiving one might wonder if it can still be called a set of rules.

Well, that may be because XHTML is HTML4, but just presented in XML; that HTML is supported in IE while XHTML is not, and that XHTML does not give any real advantages unless you are using some XML tools to produce/manipulate your code.

That, and the fact that XHTML sent as text/html works in browsers only because they did not bother to implement SGML properly. Otherwise every <br /> would be rendered as <br>&gt; and so on. Basically anyone using XHTML with text/html MIME type relies on the bug in browsers (and markup will be parsed by the html parser even in browsers supporting XHTML). Sending it with proper MIME brings another set of issues to be aware of, not least being that IE (including IE8) does not support it.

I myself prefer HTML 4.01 Strict - it is as strict as xhtml and does not rely on any bugs. True, HTML has more flexible syntax if you ever need that (e.g. to shave off a couple of bytes), but you are still free to close all your LIs and Ps.

The fact that Rails defaults to XHTML would not bother me much if there was an easy way to configure it to use HTML mode, alas…

Regards, Rimantas
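
To illustrate the MIME type point above: one common workaround is to serve application/xhtml+xml only to clients that explicitly list it in their Accept header, and fall back to text/html for everyone else. The following is only a minimal sketch in Python (chosen for illustration, not something Rails does for you). The function name choose_content_type and the sample Accept headers are made up for this example.

```python
XHTML_MIME = "application/xhtml+xml"
HTML_MIME = "text/html"


def choose_content_type(accept_header: str) -> str:
    """Return the XHTML MIME type only if the client lists it explicitly.

    Hypothetical helper: wildcards such as */* are deliberately ignored,
    so a client that never names application/xhtml+xml gets text/html.
    """
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != XHTML_MIME:
            continue
        # A quality value of 0 means "not acceptable", so honour it.
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return XHTML_MIME
    return HTML_MIME


# Illustrative Accept headers, not captured from real browsers.
lists_xhtml = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
wildcard_only = "image/gif, image/jpeg, */*"

print(choose_content_type(lists_xhtml))
print(choose_content_type(wildcard_only))
```

Even with this kind of negotiation, the page must be well-formed XML for the clients that receive it as application/xhtml+xml, which is exactly the "other set of issues" mentioned above.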

Although I’d like to know why a web developer would ever prefer HTML4 over XHTML, developers usually prefer a clearly structured and well defined set of rules instead of something where rules are so forgiving one might wonder if it can still be called a set of rules.

Well, that may be because XHTML is HTML4, but just presented in XML; that HTML is supported in IE while XHTML is not, and that XHTML does not give any real advantages unless you are using some XML tools to produce/manipulate your code.

Well, some of our applications are scraped by other (desktop) applications. Those applications benefit from the XML notation, since they can just run the webpage through the XML parser and run over the nodes they need. These are older apps that either don’t have a REST-based API in place or where the third party development team had no experience with WebServices (they exist, believe me).

That, and the fact that XHTML sent as text/html works in browsers only because they did not bother to implement SGML properly. Otherwise every <br /> would be rendered as <br>&gt; and so on. Basically anyone using XHTML with text/html MIME type relies on the bug in browsers (and markup will be parsed by the html parser even in browsers supporting XHTML). Sending it with proper MIME brings another set of issues to be aware of, not least being that IE (including IE8) does not support it.

I myself prefer HTML 4.01 Strict - it is as strict as xhtml and does not rely on any bugs. True, HTML has more flexible syntax if you ever need that (e.g. to shave off a couple of bytes), but you are still free to close all your LIs and Ps.

I don’t care how a browser interprets it tbh. I know that not sending it with the correct headers makes browsers interpret it just like HTML. I prefer having the doctype keep me (and our development team) in line. We’re protecting ourselves against… well… ourselves actually. We all know the problems that arise in a team about how to name variables: titleCase or under_scored or … JavaScript code usually names them one way, Ruby code uses another convention, …

Using XHTML forces the team to adhere to the conventions, there is no choice. It keeps the views pretty uniform, no matter who implemented that section.

That said, everyone nowadays can basically use whatever they prefer. A real choice will only have to be made when HTML5 and XHTML2 are finalized and implemented in all browsers, since they are focussing on totally different issues.

The fact that Rails defaults to XHTML would not bother me much if there was an easy way to configure it to use HTML mode, alas…

There is, use the plugin I posted in an earlier message and everything is HTML4 compliant, unless I’m missing something?

Best regards

Peter De Berdt

Well, some of our applications are scraped by other (desktop) applications. Those applications benefit from the XML notation, since they can just run the webpage through the XML parser and run over the nodes they need.

Uhm, they can just run, or they do just run webpages through an XML parser? Or is it a regexp engine? :) The reason I am asking is that too many pages with an XHTML doctype are, in fact, broken, and would end up with the 'yellow screen of death' if they were parsed by the XHTML engine in the browser rather than the forgiving HTML engine. If you are using XML tools to process your XHTML pages, then congrats, you do have high quality here.

<...>

Using XHTML forces the team to adhere to the conventions, there is no choice. It keeps the views pretty uniform, no matter who implemented that section.

Well, it is surely a matter of preference. Keeping tags lowercase and closing paragraphs is not that difficult in HTML either :)

That said, everyone nowadays can basically use whatever they prefer. A real choice will only have to be made when HTML5 and XHTML2 are finalized and implemented in all browsers, since they are focussing on totally different issues.

My bet is on HTML5.

The fact that Rails defaults to XHTML would not bother me much if there was an easy way to configure it to use HTML mode, alas...

There is, use the plugin I posted in an earlier message and everything is HTML4 compliant, unless I'm missing something?

Well, no. I'd just prefer having one line in config vs. a plugin. Anyway, this is not something to lose sleep over.

Regards, Rimantas

XML parsers will go through an HTML4 doc, inevitably get to a tag like <br> and then look for a matching </br> tag. Of course, there isn't going to be one. This is where XHTML comes in with the handy / at the end of tags, such as <br />, so XML parsers won't look for a </br>.
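
A quick way to see this for yourself is to feed both styles to a strict XML parser. The sketch below uses Python's standard xml.etree.ElementTree purely as an illustration (any conforming XML parser should behave the same way). The helper name is_well_formed and the two sample fragments are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # The parser gives up at the first well-formedness error.
        return False
    return True


# HTML4 style: <br> is never closed, so the parser reaches </p>
# while still waiting for </br>.
html4_style = "<p>first line<br>second line</p>"

# XHTML style: the trailing slash makes <br /> a self-closed element.
xhtml_style = "<p>first line<br />second line</p>"

print(is_well_formed(html4_style))
print(is_well_formed(xhtml_style))
```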

I understand that. But I wonder what happens when said XML parser runs into some in-page JavaScript not wrapped in CDATA and sees something like "if (x < y)". Or it comes across a little &copy; or &trade;: what happens then? Pages like this will appear just fine in browsers (if served with text/html) thanks to the forgiving HTML parser engine, but an XML parser should not be that forgiving. That's why I'd go with Hpricot, no matter what the doctype says. Well, with a possible exception when the XHTML is produced with XML tools too (there is an opinion that one should never write XML by hand).

Regards, Rimantas
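
The two cases raised above (a bare "<" inside a script and a named entity such as &copy;) can be checked against a strict parser. This is only a sketch, again using Python's standard xml.etree.ElementTree as a stand-in for whatever XML parser a scraping application might use. The helper name xml_error and the sample fragments are assumptions for illustration. No DTD is loaded here, so named HTML entities are unknown to the parser.

```python
import xml.etree.ElementTree as ET


def xml_error(markup: str):
    """Return the parser's error message, or None if the markup parses."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


# A bare "<" in inline script is read as the start of a tag.
bare_script = "<script>if (x < y) { go(); }</script>"

# Wrapping the script in a CDATA section hides the "<" from the parser.
cdata_script = "<script><![CDATA[ if (x < y) { go(); } ]]></script>"

# Named HTML entities are undefined without a DTD that declares them.
named_entity = "<p>&copy; Example</p>"

# Numeric character references need no DTD.
numeric_entity = "<p>&#169; Example</p>"

for sample in (bare_script, cdata_script, named_entity, numeric_entity):
    print(sample, "->", xml_error(sample) or "parses")
```

A forgiving HTML parser such as Hpricot accepts all four fragments, which is why the doctype alone says little about whether a page will survive an XML toolchain.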