Commit Graph

1148 Commits

Author SHA1 Message Date
Yorick Peterse bd48dc15cc Evaluate compiled blocks in an isolated Binding
Re-using the Binding of the XPath::Compiler#compile method would lead to
race conditions, and possibly a memory leak due to the Binding sticking
around for the compiled Proc's lifetime.

By using a dedicated class (and its corresponding Binding) we can work
around this. Access to this class is not synchronized as compiled Procs
don't mutate their enclosing environment.
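
As a rough illustration in plain Ruby (not Oga's implementation), the
sketch below shows which local variables a Proc created through `eval`
can see, and therefore keeps alive. `SharedCompiler` and
`IsolatedContext` are hypothetical names introduced only for this
example; it does not reproduce the race condition itself.

```ruby
# Hypothetical sketch; neither class below is part of Oga.
class SharedCompiler
  def compile(source)
    scratch = 'state local to #compile'
    # The Proc is created inside this method's Binding, so it can see
    # (and keeps alive) every local variable of this call.
    binding.eval(source)
  end
end

class IsolatedContext
  def initialize
    # The Binding of this method has no local variables.
    @context = binding
  end

  def evaluate(source)
    @context.eval(source)
  end
end

source   = 'proc { local_variables }'
shared   = SharedCompiler.new.compile(source)
isolated = IsolatedContext.new.evaluate(source)

p shared.call.include?(:scratch) # the compiler's local is visible
p isolated.call.empty?           # nothing leaks into the isolated Proc
```

The Proc evaluated through the dedicated object only closes over that
object's own, empty Binding, which is the property the commit relies on.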

The race condition can be demonstrated using code such as the
following:

    require 'oga'

    xml = <<-EOF
    <people>
      <person>
        <name>Alice</name>
      </person>

      <person>
        <name>Bob</name>
      </person>

      <person>
        <name>Eve</name>
      </person>
    </people>
    EOF

    4.times.map do
      Thread.new do
        10_000.times do
          document = Oga.parse_xml(xml)

          document.at_xpath('people/person/name').text
        end
      end
    end.each(&:join)

Running this code would result in NoMethodErrors due to "at_xpath"
returning nil as opposed to an Oga::XML::Element.
2015-09-07 14:02:31 +02:00
Yorick Peterse b07c75e964 Moved comparing_gems_bench to xpath/compiler
This is a compiler benchmark, not a parser benchmark.
2015-09-06 22:14:17 +02:00
Yorick Peterse ac5cb3d24f Tweaked thread safety notice in the README
Querying the same document concurrently _could_ lead to problems, so
let's just recommend that users not even try this.
2015-09-06 19:30:40 +02:00
Yorick Peterse 4c79468091 Release 1.3.0 2015-09-06 19:20:45 +02:00
Yorick Peterse 791085302e Prepare changelog for 1.3.0 2015-09-04 16:46:33 +02:00
Yorick Peterse f753f08f18 Revamp CSS parser for better axis support
This makes it possible to parse expressions such as "foo>bar", "> .bar",
"> foo.bar", and similar expressions.
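
To make the whitespace handling concrete, here is a minimal tokenizer
sketch built on Ruby's StringScanner. It is not Oga's CSS lexer:
`tokenize_selector` and the token names are assumptions made for this
example, and only identifiers, classes, and the ">", "+", and "~"
combinators are covered.

```ruby
require 'strscan'

# Hypothetical tokenizer sketch; Oga's real CSS lexer works differently.
def tokenize_selector(input)
  scanner = StringScanner.new(input)
  tokens  = []

  until scanner.eos?
    if scanner.scan(/\s*([>+~])\s*/)
      # A combinator, with or without surrounding whitespace.
      tokens << [:combinator, scanner[1]]
    elsif scanner.scan(/\s+/)
      # Whitespace on its own acts as the descendant combinator.
      tokens << [:combinator, ' ']
    elsif scanner.scan(/\.([\w-]+)/)
      tokens << [:class, scanner[1]]
    elsif scanner.scan(/[\w-]+/)
      tokens << [:ident, scanner.matched]
    else
      raise ArgumentError, "unexpected input at offset #{scanner.pos}"
    end
  end

  tokens
end

p tokenize_selector('foo>bar')
p tokenize_selector('> foo.bar')
```

Trying the explicit combinators before bare whitespace is what lets
"foo>bar" and "foo > bar" produce the same token stream.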

This fixes #126 and fixes #131.
2015-09-04 16:06:20 +02:00
Yorick Peterse c713f6250f Lexer/parser specs for CSS axes without whitespace 2015-09-04 15:13:38 +02:00
Yorick Peterse 08bc23905e Specs for lexing CSS operators with whitespace 2015-09-04 15:08:26 +02:00
Yorick Peterse 5f037c76cc Corrected CSS ends-with example
This was supposed to use the "$=" operator and not the "=" operator.
2015-09-04 14:38:45 +02:00
Yorick Peterse f5425b07e0 Added magic encoding comments for Ruby 1.9 2015-09-03 11:31:02 +02:00
Yorick Peterse 37c5b819fa Unicode support for CSS/XPath
Fixes #140
2015-09-03 11:21:45 +02:00
Yorick Peterse 44630c27ff Support escaping dots in CSS identifiers
Escaping hash characters and whitespace is _not_ supported as neither
are valid element/attribute names (e.g. <foo#bar /> is invalid
XML/HTML).
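
Dots need escaping because they are legal in XML names but also
introduce class selectors in CSS. A minimal sketch of the idea,
assuming an identifier pattern and helper names (`IDENT`,
`leading_ident`) that are invented for this example and not taken from
Oga:

```ruby
# Hypothetical pattern: word characters, dashes, or backslash-escaped dots.
IDENT = /\A(?:[\w-]|\\\.)+/

# Returns the identifier at the start of a selector with its escaped
# dots turned back into plain dots, or nil if there is no identifier.
def leading_ident(selector)
  raw = selector[IDENT]

  raw && raw.gsub('\\.', '.')
end

p leading_ident('foo\.bar.baz') # identifier "foo.bar", followed by class "baz"
```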

Escaping single/double quotes also won't be supported for the time
being. It's quite a pain to get this to work right in not just CSS but
also XPath and XML/HTML, for very little gain. Should there be enough
users with an actual use case (other than "But the spec says ...!") I'll
look into this again.

Fixes #124
2015-09-02 20:18:52 +02:00
Yorick Peterse aef7c510c2 Basic support for the CSS :not pseudo class
This does _not_ support element states such as DISABLED, nor does it
support the special handling of namespaces (e.g. *|*:not(*)). Instead
this selector basically acts as a negation. Some examples:

    :not(foo)  # All but any "foo" nodes
    :not(#foo) # Skips nodes with id="foo"
    :not(.foo) # Skips nodes with a class "foo"
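
The negation semantics can be sketched over a plain list of nodes. The
`Node` struct and both helpers below are hypothetical and only model
element names, ids, and classes; they are not how Oga compiles :not.

```ruby
# Hypothetical node representation used only for this sketch.
Node = Struct.new(:name, :id, :classes)

# Matches "#id", ".class", or a bare element name.
def matches?(node, selector)
  case selector
  when /\A#(.+)\z/  then node.id == Regexp.last_match(1)
  when /\A\.(.+)\z/ then node.classes.include?(Regexp.last_match(1))
  else node.name == selector
  end
end

# :not(selector) keeps every node the inner selector does not match.
def not_selector(nodes, selector)
  nodes.reject { |node| matches?(node, selector) }
end

nodes = [
  Node.new('foo', nil, []),
  Node.new('bar', 'foo', []),
  Node.new('baz', nil, ['foo'])
]

p not_selector(nodes, 'foo').map(&:name)  # drops the <foo> element
p not_selector(nodes, '#foo').map(&:name) # drops the node with id="foo"
p not_selector(nodes, '.foo').map(&:name) # drops the node with class "foo"
```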

Fixes #125
2015-09-01 22:05:46 +02:00
Yorick Peterse b7b38255d3 Fixed YARD formatting 2015-09-01 20:03:56 +02:00
Yorick Peterse 94f8ed5421 Removed start/end comments of YARD blocks 2015-09-01 19:59:52 +02:00
Yorick Peterse 929a521641 Added better docs/examples to XML::Querying 2015-09-01 10:12:17 +02:00
Yorick Peterse 8b2455679f Revamp a few more XPath compiler specs 2015-08-31 09:39:33 +02:00
Yorick Peterse 604d0d9337 Case insensitive matching of nodes
This re-applies the patch added in #134 to the new XPath compiler.

Fixes #135.
2015-08-30 18:30:04 +02:00
Yorick Peterse bb8b328f5e Revamp compiler specs for regular paths 2015-08-30 18:26:52 +02:00
Yorick Peterse 67ada1168e Fix starts-with() for JRuby 1.7
''.start_with?('') returns false on JRuby 1.7. While I'd love to drop
support for shit like this, JRuby 1.7 is still in common use today, so
let's just work around this for now.
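
One plausible shape for such a workaround is to special-case the empty
prefix before delegating to String#start_with?. The helper name
`xpath_starts_with?` is an assumption for this sketch; the actual
change in the compiler may look different.

```ruby
# Hypothetical helper: every string starts with the empty string, so
# that case is answered without calling String#start_with? at all.
def xpath_starts_with?(haystack, needle)
  needle.empty? || haystack.start_with?(needle)
end

p xpath_starts_with?('', '')       # true, regardless of the runtime
p xpath_starts_with?('foobar', 'foo')
p xpath_starts_with?('foobar', 'bar')
```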
2015-08-30 02:10:49 +02:00
Yorick Peterse bf0ca7c907 Alias Ruby::Node#to_ary to #to_a
JRuby 1.7 uses to_ary as opposed to to_a.
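
The sketch below shows the aliasing pattern on a stand-in class
(`SketchNode` is hypothetical, not Oga's Ruby::Node). It relies only on
general Ruby behaviour: multiple assignment performs the implicit
to_ary conversion, while the splat operator uses to_a. Which call sites
in Oga hit the JRuby 1.7 difference is not stated in the commit message.

```ruby
# Hypothetical stand-in for a node that can be converted to an Array.
class SketchNode
  def initialize(type, children)
    @type     = type
    @children = children
  end

  def to_a
    [@type, *@children]
  end

  # Implicit conversions look for to_ary, so expose the same method
  # under that name too.
  alias_method :to_ary, :to_a
end

node = SketchNode.new(:send, [:foo, :bar])

type, first, second = node # multiple assignment calls to_ary
p [type, first, second]
p [*node]                  # splat calls to_a
```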
2015-08-30 02:06:10 +02:00
Yorick Peterse b74f8dc1a3 Removed compiler arity spec
This spec isn't very useful and breaks on 1.9, which apparently
handles arity values differently.
2015-08-30 01:55:31 +02:00
Yorick Peterse 435115c454 Removed various unused variables 2015-08-30 01:46:52 +02:00
Yorick Peterse 1b62dd3256 Revamped compiler type test specs 2015-08-30 01:45:51 +02:00
Yorick Peterse 001c57e0ad Tag XPath::Conversion's API as private 2015-08-30 01:26:40 +02:00
Yorick Peterse 31a574e7f8 Removed the XPath::Evaluator class 2015-08-30 01:26:03 +02:00
Yorick Peterse e4919d7c31 Use XPath::Compiler in XML::Querying 2015-08-30 01:22:33 +02:00
Yorick Peterse c6df73d031 Fix eq spec to not depend on at_xpath 2015-08-30 01:20:08 +02:00
Yorick Peterse 5a736aa25c Removed Compiler#node_literal 2015-08-28 17:00:21 +02:00
Yorick Peterse 4ad4b89860 Revamp compiler specs for "self"
This also includes a fix for node() so that it matches attributes.
2015-08-28 16:57:24 +02:00
Yorick Peterse 3a04b1da06 Root element spec for "preceding-sibling" 2015-08-28 16:50:34 +02:00
Yorick Peterse e8377b360a Revamp compiler "preceding" specs
This also includes some fixes to make this axis behave correctly when
evaluated relative to a document.
2015-08-28 16:49:59 +02:00
Yorick Peterse 6b2874c507 Revamped compiler "preceding-sibling" specs 2015-08-28 16:30:26 +02:00
Yorick Peterse 84a9315b24 Revamped compiler specs for "parent" 2015-08-28 16:22:49 +02:00
Yorick Peterse 07658dadb1 Added Attribute#parent 2015-08-28 16:22:42 +02:00
Yorick Peterse d0177633f8 Revamp namespace compiler specs 2015-08-28 15:58:42 +02:00
Yorick Peterse a1e7d2d07f Revamp compiler "following" specs 2015-08-28 15:53:57 +02:00
Yorick Peterse 824c897467 Revamp compiler specs for following-sibling 2015-08-28 15:48:03 +02:00
Yorick Peterse aa3fbcf522 Revamp descendant compiler specs 2015-08-28 15:29:09 +02:00
Yorick Peterse 70bea2071c Fixed ancestor-or-self relative to attributes
Per libxml behaviour this axis shouldn't match attributes when using
"ancestor-or-self::*".
2015-08-27 10:49:32 +02:00
Yorick Peterse d5aad9c1c9 Revamp descendant-or-self compiler specs 2015-08-27 10:34:25 +02:00
Yorick Peterse 8f341b40d6 Revamp child axis compiler specs 2015-08-27 09:30:53 +02:00
Yorick Peterse ed31b9f1d3 Revamp compiler specs for the attribute axis 2015-08-26 22:51:04 +02:00
Yorick Peterse 5e3b0a4023 Started updating Compiler for the new XPath AST
This also includes fixes for ancestor and ancestor-or-self so that these
axes can be used relative to documents and attributes.
2015-08-26 22:40:00 +02:00
Yorick Peterse 4fb7e2f6ce Revamped ancestor/ancestor-or-self axis specs
This makes it easier to get more natural spec descriptions without
having to write them entirely by hand.
2015-08-26 22:35:13 +02:00
Yorick Peterse 9899a419b7 Added Attribute#each_ancestor 2015-08-26 22:26:46 +02:00
Yorick Peterse 083d048e63 Remove (path) usage from the CSS parser
This updates the CSS parser to make it compatible with the XPath AST
changes introduced in commit 365a9e9fa9.
This also, finally, means I can get rid of some of the hacks that were
used for "+ foo" selectors and building (path) nodes.
2015-08-26 19:15:12 +02:00
Yorick Peterse 365a9e9fa9 Replace (path) nodes with nested nodes
This changes the XPath AST so that every segment in a path (e.g.
foo/bar) is parsed as a child node of the node that precedes it. For
example, take the following expression:

    foo/bar

This used to be parsed into the following AST:

    (path
      (axis "child" (test nil "foo"))
      (axis "child" (test nil "bar")))

This is now parsed into the following AST:

    (axis "child"
      (test nil "foo")
      (axis "child"
        (test nil "bar")))

This new AST is much easier to deal with in the XPath::Compiler class,
especially when trying to ensure that each segment operates on the
correct input.
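
The relationship between the two shapes can be sketched with plain
Ruby arrays standing in for s-expressions. `nest_path` is a
hypothetical helper written for this example; the real parser builds
the nested form directly rather than converting a (path) node.

```ruby
# Hypothetical helper: folds a flat list of path segments into nested
# nodes, so each segment becomes the last child of the one before it.
def nest_path(segments)
  segments.reverse.reduce(nil) do |inner, segment|
    inner ? segment + [inner] : segment
  end
end

flat = [
  :path,
  [:axis, 'child', [:test, nil, 'foo']],
  [:axis, 'child', [:test, nil, 'bar']]
]

nested = nest_path(flat.drop(1))

p nested
# => [:axis, "child", [:test, nil, "foo"], [:axis, "child", [:test, nil, "bar"]]]
```

With the nested form a compiler can process a segment and then recurse
into its trailing child, using the current segment's result as input.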

This commit also fixes parsing of type tests with predicates, such as:

    comment()[10]

This used to throw a parser error.
2015-08-26 10:16:48 +02:00
Yorick Peterse 866044f94f Removed useless block passes in the XPath compiler 2015-08-22 14:28:20 +02:00
Yorick Peterse c5b30d1eae Refactor XPath compiler support for predicates
Handling of predicates is delegated to 3 different methods:

* on_predicate_direct: for predicates such as foo[bar] and foo[x < y].
* on_predicate_temporary: for predicates that use the last() function
  somewhere.
* on_predicate_index: for predicates that only contain a literal index,
  foo[10] being an example.

This enables the compiler to use more optimized code depending on the
type of predicate while still being able to support last() and
position().
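
A minimal sketch of how such a dispatch could pick a handler, using
plain arrays as a stand-in AST. The handler names come from the list
above; the node shapes ([:int, 10], [:call, 'last']) and both helper
methods are assumptions made for this example.

```ruby
# Hypothetical check: does this predicate call last() anywhere?
def calls_last?(node)
  return false unless node.is_a?(Array)
  return true if node[0] == :call && node[1] == 'last'

  node.any? { |child| calls_last?(child) }
end

# Returns the name of the handler that would process the predicate.
def predicate_handler(predicate)
  if predicate[0] == :int
    :on_predicate_index     # foo[10]
  elsif calls_last?(predicate)
    :on_predicate_temporary # foo[position() = last()]
  else
    :on_predicate_direct    # foo[bar], foo[x < y]
  end
end

p predicate_handler([:int, 10])
p predicate_handler([:eq, [:call, 'position'], [:call, 'last']])
p predicate_handler([:axis, 'child', [:test, nil, 'bar']])
```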

The code is currently still quite rough around the edges; this will be
taken care of in following commits.
2015-08-20 01:01:30 +02:00