Builder (XML)

Hello!

I've found what appears to be a bug in the builder library, but don't know where to report it. The problem occurs when using indentation (i.e. :indent > 0) and when the most deeply nested command contains a block that returns just text, i.e. if it uses << or contains a Builder::XmlMarkup.new that targets the builder object above it. This causes the indentation to break down, placing a newline after the opening tag, setting the text against the left margin, and then indenting the closing tag.

The source of the problem is the loop between the block part of the method_missing method and the _nested_structures method in xmlbase.rb. It works if the lowest level command contains a simple text argument (e.g. xml.name(author.name)), but not otherwise. I have worked around it by redefining text!(text) as _text(text), thus switching off all escaping of text, but that is not a fix.

It would, however, be useful to have a global option that turns off escaping, so that one could simply use the ordinary text argument. I am trying to produce human-readable documents, hence the indentation, and don't want my accented letters and typographical symbols turned into numeric character references. In addition, some of my input contains presentational markup elements, such as <emphasize>, <superscript> etc., that should not be escaped.

I am not a professional programmer, and not good enough to offer a fix to the problem (although I could probably manage setting up the global option). This does, however, make me realize that thanks to Rails I am actually doing things that I would not even have thought of attempting a couple of years ago, so many thanks to all who work on such an excellent project.

Best wishes for the future.

Is it a case of calling _indent (which generates the spaces at the start of a line), or of creating the new instance of XmlMarkup with the appropriate indent and level? Note that if you turn off escaping you may produce invalid XML.

Fred

Hello Fred,

No, it's a case of the loop only exiting properly if the last command doesn't have a simple argument but calls another block. If there is no indentation, then there is no extra space to put in and all appears to work fine, but with indentation switched on the indentation gets placed between the text and the end tag. There is no problem about the unescaped XML. It is all produced in utf8 and validates against the DocBook 5.0 schema (the documentation of which appears to encourage the direct entering of unescaped text in utf8).

I have examined the code in xmlbase.rb, and the behaviour is clearly what must happen given the way the code is written, i.e.

_indent _start_tag _newline _nested_structures _indent _end_tag

So you get an indented start tag, a newline, no indent for the text (because there is no tag to indent), and an indented end tag. If you include tags within the block, then you simply shift the problem one level up.
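To make that sequence concrete, here is a minimal stand-in written in plain Ruby. It is not Builder's code: TinyMarkup and its tag method are hypothetical names invented for this sketch, and it only imitates the order of steps listed above (indent, start tag, newline, nested block, indent, end tag, newline).

```ruby
# A tiny stand-in for the emit sequence described above.
# NOT Builder's implementation; TinyMarkup and #tag are hypothetical.
class TinyMarkup
  attr_reader :out

  def initialize(indent)
    @indent = indent
    @level = 0
    @out = +""
  end

  # Append raw text with no indentation and no newline.
  def <<(text)
    @out << text
    self
  end

  # Emit a tag around a block in the order:
  # indent, start tag, newline, nested block, indent, end tag, newline.
  def tag(name)
    @out << " " * (@indent * @level) << "<#{name}>\n"
    @level += 1
    yield self
    @level -= 1
    @out << " " * (@indent * @level) << "</#{name}>\n"
    self
  end
end

xml = TinyMarkup.new(2)
xml.tag("author") do
  xml.tag("name") { |n| n << "Charles Dickens" }
end
puts xml.out
```

Because the raw text carries no indent or newline of its own, it lands at the left margin and the closing tag's indent ends up between the text and the end tag.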

Hope this helps. Thanks, demo

Dear Fred,

I realized I should have provided some code to reproduce the bug. This will do it:

xml = Builder::XmlMarkup.new(:indent => 2)
xml.author do
  xml.name do |n|
    n << 'Charles Dickens'
  end
end

Best wishes, demo

This isn't a bug. The very definition of << is that it appends text to the output completely unmolested. Like I said, you can use _indent, _newline etc. if you want to generate that whitespace. I suppose you could override _escape to be a no-op if you had to.

Fred

Dear Fred,

The whole point is that it does produce the whitespace! This is not what I want. I want nicely formatted XML with the tags wrapped around the text, or at least aligned. The code I showed produces:

<author>
  <name>
Charles Dickens  </name>
</author>

As the level of nesting gets deeper, so the amount of whitespace grows. Using the following:

xml = Builder::XmlMarkup.new(:indent => 2)
xml.author do
  xml.name('Alexander Pope')
end

produces

<author>
  <name>Alexander Pope</name>
</author>

which is what I want, except that the output is escaped. I realize that this is a minor issue, but the library is not doing what it is supposed to do.

Best wishes, demo

That whitespace doesn't come from <<; it's coming from the name tag (i.e. it's assuming that after Charles Dickens there would have been a newline). << literally just appends the text you give it to the output. There is no cleverness going on.

If you add in the whitespace then it will format nicely, i.e.

xml = Builder::XmlMarkup.new(:indent => 2)
xml.author do
  xml.name do |n|
    xml.__send__ :_indent
    n << 'Charles Dickens'
    xml.__send__ :_newline
  end
end

produces

<author>
  <name>
    Charles Dickens
  </name>
</author>

You could of course redefine text!:

class Builder::XmlMarkup
  def text!(text)
    _text(text)
  end
end

which allows you to do

xml = Builder::XmlMarkup.new(:indent => 2)
xml.author do
  xml.name('charles & dickens')
end

and get the following invalid markup:

<author>
  <name>charles & dickens</name>
</author>

Fred
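The invalid markup in the last example comes from the bare ampersand. One possible middle ground, sketched here in plain Ruby and not taken from Builder, is to escape only the three characters that are significant in element content and leave everything else, accented letters included, as typed. The helper name escape_markup_only and the constant MARKUP_ESCAPES are hypothetical, and the sample strings are illustrative only.

```ruby
# Hypothetical helper, not part of Builder: escape only the characters that
# are significant in element content, leaving all other characters
# (including non-ASCII letters) untouched.
MARKUP_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

def escape_markup_only(text)
  # gsub with a hash replaces each match with the value stored under it.
  text.gsub(/[&<>]/, MARKUP_ESCAPES)
end

puts escape_markup_only("charles & dickens")  # the ampersand becomes an entity
puts escape_markup_only("Émile Zola")         # the accented letter is unchanged
```

This would still escape inline elements such as <emphasize>, so input that already contains trusted markup would need to bypass it (for example via <<).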