Opened 11 years ago

Closed 11 years ago

#1038 closed Bugs (fixed)

"<" and ">" should be escaped in xml_oarchive

Reported by: r.buergel@…
Owned by: Robert Ramey
Milestone: Boost 1.36.0
Component: serialization
Version:
Severity: Cosmetic
Keywords:
Cc:

Description

Using templates in BOOST_CLASS_EXPORT fails for XML archives, because the characters "<" and ">" enclosing the template parameters aren't escaped. Using templates works for other archives, so this should be corrected for XML archives. The xml_iarchive is already correct and understands &gt; and &lt;.
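To make the requested behaviour concrete, here is a minimal sketch of escaping a class name before it is written as an XML attribute value. The function name escape_xml_attribute is hypothetical; this is not the Boost.Serialization code and not the attached patch, only an illustration of the idea under the assumption that the name is written inside a double-quoted attribute.

```cpp
#include <string>

// Hypothetical helper (not part of Boost.Serialization): replaces the
// characters that are unsafe inside a double-quoted XML attribute value.
std::string escape_xml_attribute(const std::string& raw) {
    std::string out;
    out.reserve(raw.size());
    for (char c : raw) {
        switch (c) {
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '&': out += "&amp;";  break;
            case '"': out += "&quot;"; break;
            default:  out += c;        break;
        }
    }
    return out;
}
```

With such a helper, a class name like B<int> would be written as class_name="B&lt;int&gt;", which is the form the xml_iarchive is said to understand.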

I could reproduce the bug with Boost 1.33.1 and 1.34.0. I'll attach a test case in which the bug should be reproducible, and a patch to fix it.

Yours sincerely, René Bürgel

Attachments (2)

Test.cpp (1.2 KB) - added by r.buergel@… 11 years ago.
Testcase
300-writing-lt-gt.patch (369 bytes) - added by r.buergel@… 11 years ago.
patch


Change History (10)

Changed 11 years ago by r.buergel@…

Attachment: Test.cpp added

Testcase

Changed 11 years ago by r.buergel@…

Attachment: 300-writing-lt-gt.patch added

patch

comment:1 Changed 11 years ago by Eric Niebler

Owner: set to Robert Ramey

comment:2 Changed 11 years ago by Robert Ramey

"Using templates for other Archives works, so this should be corrected for XML Archives."

I can't see how this would be true. I'll look into it.

Robert Ramey

comment:3 in reply to:  2 Changed 11 years ago by r.buergel@…

Did you find some time for it?

comment:4 Changed 11 years ago by Robert Ramey

Resolution: fixed
Status: new → closed

I've examined your test case, which boils down to:

typedef B<int> bint;
BOOST_CLASS_EXPORT( B<int> )

With your suggestion, this would generate XML that looks like:

<B&lt;int&gt; ...>...</B&lt;int&gt;>

Is this legal XML?

Assuming it is, is this a good idea?

Unless a better case is made for this, I would be disinclined to alter the library.

Robert Ramey

comment:5 Changed 11 years ago by r.buergel@…

Resolution: fixed
Status: closed → reopened

It looks like we're talking about two different things here.

The suggested patch just changed the generated xml from <b class_id="1" class_name="B<int>" ...> to <b class_id="1" class_name="B&lt;int&gt;" ...>

I'm not sure whether class_name="B<int>" is valid XML, but I think so. At least it is unambiguous, because the angle brackets are enclosed in quotation marks. So, on closer inspection, the XML parser may be the real cause of the problem, and my patch is more of a workaround for it.

The case you are describing above doesn't come from using "BOOST_CLASS_EXPORT ( B<int> )", but from "ar & BOOST_SERIALIZATION_BASE_OBJECT_NVP( B<int> )". I added it to my test case and, at a quick glance, that also seems to work for other archive types like text_archive, but fails for XML archives. In that case, though, it is the xml_oarchive that fails, not the iarchive. So the problem is detected when writing the archive, rather than only when reading it.

I'd like to find a solution here. In the best case, the XML reader is fixed to read class_name="B<int>", IF that is valid XML. But at the very least, the xml_oarchive should not be able to create archives that can't be read by the xml_iarchive.

comment:6 Changed 11 years ago by r.buergel@…

OK, I looked up in the XML spec ( http://www.w3.org/TR/2006/REC-xml-20060816/ ) what is valid XML and what is not.

<b class_id="1" class_name="B<int>" ...> is invalid, because the left angle bracket is forbidden in attribute values. The escaped variant <b class_id="1" class_name="B&lt;int&gt;" ...> is valid.

The case you described above ( <B&lt;int&gt; ...>...</B&lt;int&gt;> ) is clearly invalid. But it didn't work before my patch and it doesn't work after applying it, because the xml_oarchive throws an exception, which states "Unrecognized XML Syntax".

So currently it is possible to write invalid XML with your archive without it being trapped. But reading that invalid XML file causes the application to fail.
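The attribute-value rule cited in this comment can be sketched as a small check. This is a simplified illustration based on the AttValue production of the XML spec, which forbids a raw "<" and requires "&" to start a reference; a raw ">" is permitted there. The function name is hypothetical, and the check only verifies the shape of a reference (for example &lt; or &#60;), not that the referenced entity is actually declared.

```cpp
#include <cctype>
#include <string>

// Hypothetical check for a double-quoted XML attribute value.
// Simplification: any "&...;" made of letters, digits or '#' is accepted
// as a reference without verifying that the entity exists.
bool is_wellformed_attribute_value(const std::string& value) {
    for (std::size_t i = 0; i < value.size(); ++i) {
        const char c = value[i];
        if (c == '<' || c == '"') return false;  // forbidden raw characters
        if (c == '&') {
            const std::size_t end = value.find(';', i + 1);
            if (end == std::string::npos || end == i + 1) return false;
            for (std::size_t j = i + 1; j < end; ++j) {
                const unsigned char r = static_cast<unsigned char>(value[j]);
                if (!std::isalnum(r) && r != '#') return false;
            }
            i = end;  // resume scanning after the ';'
        }
    }
    return true;
}
```

Under this check, B<int> is rejected because of the raw left angle bracket, while B&lt;int&gt; is accepted.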

comment:7 Changed 11 years ago by Robert Ramey

OK

First, thanks for spending time on this. We're now at the point where you've looked at it more deeply than I have. Does your patch:

a) create class_name="B&lt;int&gt;" on output
b) and permit class_name="B&lt;int&gt;" in input?
c) trap usage of < and > in variable names on output?

If not, can it be enhanced to do so? To me, this would be a definitive solution to the problem and would guarantee that any archive that is successfully written with xml_oarchive can be successfully read by xml_iarchive, which is where I want things to be.

Robert Ramey

Would this not resolve the issue?
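The three points a) to c) can be sketched together as follows. All names here are hypothetical and this is not the library's implementation or the attached patch: escape_attr covers a), unescape_attr covers b), and check_tag_name covers c) by throwing on names that cannot be used as an element name. The name check is deliberately simplified to ASCII letters, digits, '_', '-' and '.', which is narrower than what the XML spec allows.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// a) Escape characters that are unsafe in a quoted attribute value.
std::string escape_attr(const std::string& raw) {
    std::string out;
    for (char c : raw) {
        switch (c) {
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '&': out += "&amp;";  break;
            case '"': out += "&quot;"; break;
            default:  out += c;        break;
        }
    }
    return out;
}

// b) Reverse the escaping when the attribute is read back.
std::string unescape_attr(const std::string& text) {
    static const struct { const char* entity; char ch; } table[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'}, {"&quot;", '"'}};
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        bool matched = false;
        for (const auto& e : table) {
            const std::string entity(e.entity);
            if (text.compare(i, entity.size(), entity) == 0) {
                out += e.ch;
                i += entity.size();
                matched = true;
                break;
            }
        }
        if (!matched) out += text[i++];
    }
    return out;
}

// c) Reject names that cannot serve as an XML element name
//    (simplified ASCII-only rule, an assumption of this sketch).
void check_tag_name(const std::string& name) {
    bool ok = !name.empty() &&
              (std::isalpha(static_cast<unsigned char>(name[0])) || name[0] == '_');
    for (std::size_t i = 1; ok && i < name.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        ok = std::isalnum(c) || c == '_' || c == '-' || c == '.';
    }
    if (!ok) throw std::invalid_argument("invalid XML tag name: " + name);
}
```

If the writer applies a) and c) and the reader applies b), then any class name that is written can be read back unchanged, and a name such as B<int> used as a tag is refused at write time instead of producing an unreadable archive.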

comment:8 Changed 11 years ago by Robert Ramey

Milestone: To Be Determined → Boost 1.36.0
Resolution: fixed
Status: reopened → closed

OK, you've convinced me. I'm putting it in and testing it.

Robert Ramey
