The XPath number() function should also be capable of converting booleans to
numbers, something it previously was not able to do. In order to do this
reliably we can't rely on the string() function as this would make it impossible
to distinguish between literal string values and booleans. This is due to
true(), which returns a TrueClass, being converted to the string "true". This
string in turn can't be converted to a float.
When an attribute is prefixed with "xml" the default namespace should be used
automatically. This namespace is not registered on element level by default as
this namespace isn't registered manually, instead it's a "magic" namespace. This
also ensures we match the behaviour of libxml more closely, hopefully reducing
confusion.
When calling the string() XPath function floats with zero decimals (10.0, 5.0,
etc) should result in a string without any decimals. Ruby converts 10.0 to
"10.0" whereas XPath expects "10".
The particular case of string(10) having to return "10" instead of "10.0" will
be handled separately. Returning integers breaks behaviour/expectations
elsewhere.
This reverts commit 431a253000.
This allows filtering of nodes by indexes (e.g. using last() or a literal
number) and uses the correct index range (1 to N in XPath). The function
position() is not yet supported as this requires access to the current node,
which isn't passed down the call stack just yet.
Namespaces aren't scoped per document but instead per element, thus this method
doesn't make that much sense. This also fixes the remaining, failing XPath test.
The method XPath::Evaluator#node_matches? now has a special case to handle
"type-test" nodes. This in turn fixes a bunch of failing tests such as those for
the XPath query "parent::node()".
Unlike what I thought before syntax such as "node()" is not a function call.
Instead this is a special node test that tests the *types* of nodes, not their
names.
This separates namespace handling into namespace names and namespace objects.
The namespace objects are retrieved from the element an attribute belongs to.
Once retrieved the namespace is cached, due to the overhead of retrieving
namespaces in large documents.
The old code used for generating Object#inspect values has been ripped out (for
the most part). The result is a non indented but far more compact #inspect
output. The code for this is also easier and doesn't break the signature of
Object#inspect.
Oga won't be handling URIs any time soon. The rationale is that they server zero
purpose when it comes to just parsing XML. Another goal of Oga is to make it
easy to modify and reserialize documents back to XML. If namespaces would also
store the URIs this would make this process more difficult.
This also comes with some small cleanups regarding
XPath::Evaluator#node_matches?. This change removes the need to, every time,
also use can_match_node?() to prevent NoMethodError errors from popping up.
This still uses a stack but at least no longer relies on the call stack. I
decided not to go with the Morris in-order algorithm [1] as it modifies the tree
during a search. This would not work well if a document were to be accessed from
multiple threads at once (which should be possible for read-only operations).
I might change this method to actually perform a search (opposed to just
returning everything). This will require some closer inspection of the
available XPath axes to determine if this is needed.
Tests will also be added once I've taken care of the above.
[1]: http://en.wikipedia.org/wiki/Tree_traversal#Morris_in-order_traversal_using_threading
The previous commit was nonsense as I didn't understand XPath's "following" axis
properly. This commit introduces proper tests and a note for future me so that I
can implement it properly.