REXML UTF-16 trouble

I've run into a problem parsing XML with REXML, and it looks like it
has to do with UTF-16 encoding and a bug in REXML.

I'm still running OS X 10.4.11 and Ruby 1.8.6 (using Locomotive). I even
tried upgrading to the latest version of REXML (3.1.7.3), with no luck. It
still fails, only now with the following error:

Iconv::InvalidCharacter: ">"

Is this old news to everyone? If so, is there a known solution for
this?

I can't be the only person who needs to parse UTF-16 XML in Rails.

Thanks in advance.

Random thought - the XML file claims to be UTF-16, but is it really?

Good question. How would I know?

If I stash the result straight into a variable and do a "puts" I get
the following…

<?xml version="1.0" encoding="utf-16"?>
<CallStatus><Code>InvalidPassword</Code>
<Success>false</Success>
<Message>Invalid password</Message>
</CallStatus>

Again, REXML chokes on this, but if I change utf-16 to utf-8, no
problem.

I have an interim kludge for now. I'm chopping the leading XML
declaration (what I've been calling the BOM piece) off of the response
and then passing the rest to REXML. It's working like a charm now. It's
not pretty, but it'll do until I figure out if this is part of a bigger
problem.

# Since this part of the string is always going to be the same, slice the
# first 39 characters off of it.
bom_string_to_remove = login_result.slice(0, 39)
login_results = login_result.gsub(bom_string_to_remove, '')

So this…
<?xml version="1.0" encoding="utf-16"?>
<CallStatus><Code>InvalidPassword</Code>
<Success>false</Success>
<Message>Invalid password</Message>
</CallStatus>

Becomes this…
<CallStatus><Code>InvalidPassword</Code>
<Success>false</Success>
<Message>Invalid password</Message>
</CallStatus>

And now I treat it as XML.
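
For anyone copying the workaround: a minimal sketch of a variant that does not depend on the declaration being exactly 39 characters long. The helper names strip_xml_declaration and relabel_as_utf8 are made up for illustration. Both assume the bytes are really ASCII/UTF-8 and that only the declared encoding is wrong.

```ruby
# Hypothetical helper: drop a leading XML declaration, whatever its length.
# Assumption: the payload is really ASCII/UTF-8 despite what it declares.
def strip_xml_declaration(xml)
  xml.sub(/\A\s*<\?xml\b.*?\?>\s*/m, '')
end

# Hypothetical helper: keep the declaration but relabel it as utf-8,
# which is the manual edit described earlier in the thread.
def relabel_as_utf8(xml)
  xml.sub(/\A(\s*<\?xml\b[^>]*?encoding=)(["'])[^"']*\2/) do
    "#{$1}#{$2}utf-8#{$2}"
  end
end

login_result = %(<?xml version="1.0" encoding="utf-16"?>\n) +
               %(<CallStatus><Code>InvalidPassword</Code></CallStatus>)

puts strip_xml_declaration(login_result)
puts relabel_as_utf8(login_result)
```

Either result can then be handed to REXML::Document.new as a plain string. A string with no declaration at all is returned unchanged by both helpers.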

Good question. How would I know?

I'd ogle the bytes via unpack or something like that.
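
A minimal sketch of what that byte check could look like. The helper names guess_utf16 and hex_head are made up for illustration, and the heuristic (a byte order mark, or NUL bytes alternating with ASCII) is only an assumption about what real UTF-16 XML looks like, not something REXML does itself.

```ruby
# Hypothetical helper: guess whether raw bytes are really UTF-16.
# It checks for a byte order mark first, then falls back to testing
# whether every other byte near the start is NUL, which is what ASCII
# text looks like once it has been encoded as UTF-16.
def guess_utf16(raw)
  bytes = raw.unpack('C*') # array of integers 0..255
  return :utf16_be_bom if bytes[0, 2] == [0xFE, 0xFF]
  return :utf16_le_bom if bytes[0, 2] == [0xFF, 0xFE]

  head = bytes[0, 16]
  return :not_utf16 if head.length < 4

  evens = (0...head.length).select { |i| i.even? }.map { |i| head[i] }
  odds  = (0...head.length).select { |i| i.odd? }.map { |i| head[i] }
  return :utf16_be_no_bom if evens.all? { |b| b == 0 }
  return :utf16_le_no_bom if odds.all? { |b| b == 0 }

  :not_utf16
end

# Dump the first few bytes as hex so they can be eyeballed.
def hex_head(raw, count = 8)
  raw.unpack('C*')[0, count].map { |b| format('%02x', b) }.join(' ')
end

puts hex_head('<?xml version="1.0" encoding="utf-16"?>')
puts guess_utf16('<?xml version="1.0" encoding="utf-16"?>')
```

If the dump starts with 3c 3f 78 6d, the data is one byte per character and is not UTF-16, whatever the declaration says. Real UTF-16 would start with ff fe or fe ff, or show 00 in every other position.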

Fred

Fred,

Thanks, I will give that a try -- eventually, I need this to work
without the workaround.

+bm

Was this ever solved? How do I load a UTF-16 XML file into REXML?

Thanks!
Naren
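
Not a confirmed fix for the REXML problem above, but one way to sidestep it when the file really is UTF-16: transcode to UTF-8 first and relabel the declaration so it matches. This is a sketch only. It assumes Ruby 2.0 or later (String#encode and String#b do not exist on the 1.8.6 setup discussed earlier), the helper name utf16_xml_to_utf8 is made up, and data without a byte order mark is assumed to be big-endian.

```ruby
# Hypothetical helper: turn genuinely UTF-16 XML bytes into a UTF-8 string
# whose declaration says UTF-8. Requires Ruby 2.0+.
def utf16_xml_to_utf8(raw)
  bytes = raw.dup.force_encoding(Encoding::BINARY)

  # Pick the byte order from the BOM and strip it; with no BOM this
  # sketch assumes big-endian.
  source =
    if bytes.start_with?("\xFF\xFE".b)
      bytes = bytes.byteslice(2..-1)
      Encoding::UTF_16LE
    elsif bytes.start_with?("\xFE\xFF".b)
      bytes = bytes.byteslice(2..-1)
      Encoding::UTF_16BE
    else
      Encoding::UTF_16BE
    end

  utf8 = bytes.force_encoding(source).encode(Encoding::UTF_8)

  # Relabel the declaration so it agrees with the new encoding.
  utf8.sub(/\A(<\?xml\b[^>]*?encoding=)(["'])[^"']*\2/) do
    "#{$1}#{$2}UTF-8#{$2}"
  end
end

sample = "\xFF\xFE".b +
         %(<?xml version="1.0" encoding="utf-16"?><a>hi</a>).encode('UTF-16LE').b
puts utf16_xml_to_utf8(sample)
```

The returned string can be passed to REXML::Document.new. To read from disk, File.binread gives the raw bytes without any transcoding. On Ruby 1.8 the rough counterpart for the conversion step was Iconv.conv('UTF-8', 'UTF-16', raw).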