XPath 2.0 deep-equal Does Not Match Like Expected – The Problem With Whitespace
I just stumbled across a problem with the deep-equal() function introduced by XPath 2.0.
It cost me at least two hours to find out what was going on.
So I want to share this with you, in case you are wasting time on the same problem and trying to find a solution via Google ;)
If you have never heard of deep-equal() and just wonder how to compare XML nodes the right way, you should probably read this excellent article about equality in XSLT as a starter.
My problem was that I wanted to parse/output a node only if no node on the ancestor axis has an exact duplicate of that node as a direct child.
The Difference Between A Comparison With = And With deep-equal()
If you just use simple equality (with = or eq), the two compared nodes are implicitly converted into strings.
That is no problem if you are comparing attributes or nodes that only contain text.
But in all other cases, you will only compare the text content of the two nodes and their children.
Hence, if they differ only in an attribute, your test will report that they are equal, which might not be what you expect.
For example, an XPath expression that uses = to test whether an ancestor has an equal <ref> as a direct child will match the <ref> with @id='bar', which is nested inside the <child> node in this example XML. That was not what I expected:
<root>
  <parent>
    <ref id="foo"><content>Same Text-Content</content></ref>
    <child>
      <ref id="bar"><content>Same Text-Content</content></ref>
    </child>
  </parent>
</root>
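A minimal sketch of such a test, written as an XSLT 2.0 template rule. The path (ancestor::*/ref except .) is an assumption chosen for illustration: it collects every <ref> that is a direct child of an ancestor and removes the node itself from that set. Because = only compares the atomized (string) values, the <ref> with @id='bar' should be treated as a duplicate of the one with @id='foo' and be dropped from the output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Suppress a <ref> if some other <ref> that is a direct child of one of
       its ancestors has the same string value. The "=" operator atomizes
       both sides, so attributes such as @id play no role in the comparison.
       The path is illustrative, not taken from the original post. -->
  <xsl:template match="ref[(ancestor::*/ref except .) = .]"/>

</xsl:stylesheet>
```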
So what I tried, after I found out about deep-equal(), was an XPath expression that does the same test with deep-equal() instead of =, which solves the problem in the above example.
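Such an expression could look like the following sketch. As before, the path (ancestor::*/ref except .) is only an assumption about how the candidates might be selected. Since deep-equal() takes exactly two operands, a quantified expression (some ... satisfies) is used here to test each candidate in turn; inside it, the dot still refers to the <ref> being matched.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Suppress a <ref> only if some other <ref> that is a direct child of
       one of its ancestors is deep-equal to it, i.e. has the same name,
       the same attributes and the same child nodes in the same order.
       The path is illustrative, not taken from the original post. -->
  <xsl:template match="ref[some $r in (ancestor::*/ref except .)
                           satisfies deep-equal($r, .)]"/>

</xsl:stylesheet>
```

With the first example, the two <ref> nodes differ in @id, so neither of them should be suppressed.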
The Unexpected Behaviour Of deep-equal()
But moving on, I stumbled across cases where I expected a match, but deep-equal() did not match the nodes:
<root>
  <parent>
    <ref id="same">
      <content>Same Text-Content</content>
    </ref>
    <child>
      <ref id="same">
        <content>Same Text-Content</content>
      </ref>
    </child>
  </parent>
</root>
You probably catch the difference at first glance, since I laid out the examples accordingly and gave you a hint in the heading of this post – but it really took me a long time to get that:
It is all about whitespace!
deep-equal() compares all child nodes and only yields a match if the compared nodes have exactly the same child nodes.
But in the second example, the compared <ref> nodes contain whitespace before and after their <content> child node.
And this whitespace is in fact made up of implicit child nodes of type text.
Hence, the two nodes in the second example differ, because the indentation of the second one has two more spaces.
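To make these hidden text nodes visible, a small diagnostic stylesheet can help. The following is only a sketch: it assumes the elements of interest are called <ref>, as in the examples, and that the processor is not configured to strip whitespace on its own. For every <ref> it prints the number of child nodes and the length of each whitespace-only text child.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//ref">
      <!-- Number of child nodes, counting text nodes as well as elements -->
      <xsl:value-of select="concat('ref id=', @id, ': ',
                                   count(node()), ' child node(s), ',
                                   'whitespace-only text lengths: ')"/>
      <!-- Length of every text child that consists of whitespace only -->
      <xsl:value-of select="for $t in text()[not(normalize-space())]
                            return string-length($t)"
                    separator=", "/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For an indented document like the second example, each <ref> should report three child nodes instead of one, and the reported lengths should differ between the two <ref> nodes whenever their indentation differs.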
Unfortunately, I do not really know a good solution. (If you come up with one, feel free to note or link it in the comments!)
The best solution would be an optional additional argument for deep-equal() that tells the function to ignore such whitespace.
In fact, some XSLT processors do provide such an argument.
The only other solution I can think of is to write another XSLT script that removes all the whitespace between tags, to circumvent this at first glance unexpected behaviour of deep-equal().
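A minimal sketch of such a cleanup script, assuming that every whitespace-only text node in the document is insignificant. It combines the standard xsl:strip-space declaration with an identity template, so the output should be the same document without the whitespace between tags.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Remove whitespace-only text nodes from the source tree, for all
       elements. Text nodes that contain other characters are kept, and
       elements in scope of xml:space="preserve" are not stripped. -->
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Since xsl:strip-space acts on the source tree before any template is applied, placing the same declaration directly in the stylesheet that calls deep-equal() should in principle have the same effect without a separate pass, though this is worth checking with the processor at hand. Be careful with mixed content: a whitespace-only text node between two inline elements would be removed as well.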
Funded by the European Union
This article was published in the course of a research project that is funded by the European Union and the federal state of North Rhine-Westphalia.