Yorick Peterse
d951a8cc87
Track XML C lexer state in C only.
...
Instead of storing "act" and "cs" as instance variables, they (along with some
other variables) are now stored in a struct. This struct is attached to a lexer
instance using the (crappy) Data_Get_Struct/Data_Wrap_Struct API.
2014-10-26 11:38:06 +01:00
Yorick Peterse
b304b8b077
Fixed descendant-or-self with a predicate.
...
Processing this axis together with a predicate did not work correctly: the node
would still be matched even if the predicate returned false.
2014-10-23 01:12:10 +02:00
Yorick Peterse
47e4a3aa49
Added benchmark for descendant-or-self
2014-10-23 01:11:19 +02:00
Yorick Peterse
7ee7f25239
Support for parsing CSS axes.
2014-10-23 00:42:45 +02:00
Yorick Peterse
9955f61bcb
Renamed CSS axis tokens.
...
These have been renamed as follows:
T_CHILD => T_GREATER
T_FOLLOWING => T_TILDE
T_FOLLOWING_DIRECT => T_PLUS
2014-10-21 23:25:11 +02:00
Yorick Peterse
823f2f1bad
Clean generated CSS lexer/parser files.
2014-10-21 23:21:23 +02:00
Yorick Peterse
851e7d6d0b
First pass of rewriting the CSS parser.
...
The new parser uses way less confusing rule names, is a bit more strict and in
general much less of a pain to deal with.
2014-10-21 23:19:31 +02:00
Yorick Peterse
e3de65a258
Lex whitespace preceding CSS axes separately.
...
Previously input such as "x > y" would result in the following token sequence:
T_IDENT, T_CHILD, T_IDENT
This commit changes this to the following:
T_IDENT, T_SPACE, T_CHILD, T_IDENT
This allows the parser to use T_SPACE as a terminal token, which in turn
prevents around 16 shift/reduce conflicts from arising.
This does mean that input such as " > y" or " x > y" is now invalid. This,
however, can be solved by simply _not_ adding leading/trailing whitespace to CSS
queries.
2014-10-21 23:18:46 +02:00
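The token sequence described in the commit above can be reproduced with a small
sketch. This is not the project's actual lexer: the method name lex_css and the
StringScanner approach are assumptions made for illustration. Only the token
names come from the commit message.

```ruby
require 'strscan'

# Hypothetical mini-lexer; only the token names are taken from the commit
# message, the rest is an assumption made for illustration.
def lex_css(input)
  scanner = StringScanner.new(input)
  tokens  = []

  until scanner.eos?
    if scanner.scan(/[A-Za-z_][\w\-]*/)
      tokens << [:T_IDENT, scanner.matched]
    elsif scanner.scan(/\s+/)
      # Whitespace preceding an axis (or another identifier) is its own token.
      tokens << [:T_SPACE, nil]
    elsif scanner.scan(/>\s*/)
      # Whitespace following the axis is consumed together with the axis.
      tokens << [:T_CHILD, nil]
    else
      raise ArgumentError, "unexpected input at position #{scanner.pos}"
    end
  end

  tokens
end

p lex_css('x > y').map(&:first)
# prints [:T_IDENT, :T_SPACE, :T_CHILD, :T_IDENT]
```

With this shape, " > y" lexes to T_SPACE, T_CHILD, T_IDENT. It would be up to
the parser to reject a query that starts with T_SPACE.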
Yorick Peterse
e2b4f51e64
Updated part of the CSS axis specs.
2014-10-20 19:07:06 +02:00
Yorick Peterse
21c27bf48e
Surround class values with spaces.
...
When using a CSS class selector the resulting XPath string passed to contains()
should be surrounded by spaces.
2014-10-20 09:29:42 +02:00
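Why the class value needs surrounding spaces can be shown with a short sketch.
The helper names below are hypothetical. Padding @class via concat() is the
common XPath idiom for class matching, assumed here rather than taken from the
commit itself.

```ruby
# Hypothetical helper: builds an XPath expression for a CSS class selector.
# Padding both the attribute and the class name with spaces keeps "foo" from
# matching inside a longer class name such as "foobar".
def class_selector_to_xpath(node_test, class_name)
  %Q{#{node_test}[contains(concat(" ", @class, " "), " #{class_name} ")]}
end

# Plain-Ruby equivalent of the predicate above, for illustration only.
def class_match?(class_attribute, class_name)
  " #{class_attribute} ".include?(" #{class_name} ")
end

puts class_selector_to_xpath('div', 'foo')
# prints div[contains(concat(" ", @class, " "), " foo ")]

p class_match?('foo bar', 'bar') # prints true
p class_match?('foobar', 'foo')  # prints false
```

Note that this sketch only treats single spaces as separators. Class values
separated by tabs or newlines would need extra normalization.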
Yorick Peterse
15ebdb7de4
Fixed parsing of CSS class selectors.
...
When a class selector is used it should be checked as one of the possible
values, not as _the_ only value (unlike ID selectors).
2014-10-20 00:45:41 +02:00
Yorick Peterse
174d33c597
Re-enabled parsing of CSS predicates.
2014-10-20 00:39:12 +02:00
Yorick Peterse
d4150fd0f5
First step at rewriting the CSS parser.
...
The new setup will not involve a separate transformation stage, instead the CSS
parser will directly emit an XPath AST. This reduces the overhead needed for
parsing/evaluating CSS selectors while also simplifying the code. The downside
is that I basically have to re-write 80% of the parser.
2014-10-20 00:30:16 +02:00
Yorick Peterse
ea2baa2020
Swap child node order for CSS pseudo classes.
2014-10-16 23:18:14 +02:00
Yorick Peterse
63d27fa709
Swap child order of CSS class and id nodes.
...
This makes it easier to transform the AST at a later stage.
2014-10-16 23:13:54 +02:00
Yorick Peterse
7ccd685acb
Use a helper method for transforming CSS ASTs.
2014-10-16 23:01:56 +02:00
Yorick Peterse
a85cd7cbd1
Trimmed CSS class transformer specs a bit.
2014-10-16 22:51:55 +02:00
Yorick Peterse
5fde2f9092
Basic tests for the CSS transformer.
2014-10-16 10:25:30 +02:00
Yorick Peterse
073e8fbe5b
Basic boilerplate for converting CSS to XPath.
2014-10-16 00:25:31 +02:00
Yorick Peterse
48eb4f83df
Lexing/parsing of CSS pseudos with ident arguments
...
This allows the lexing/parsing of expressions such as "html:lang(en)".
2014-10-15 09:42:26 +02:00
Yorick Peterse
d9a4221a0a
Remove :axis CSS node types.
...
The various axes are now simply their own node types.
2014-10-12 18:08:35 +02:00
Yorick Peterse
ed0cd7826e
Fixed precedence of ID/class CSS selectors
2014-10-07 23:05:34 +02:00
Yorick Peterse
91f9cc984b
Parsing of pseudo classes without node tests.
2014-10-07 23:01:58 +02:00
Yorick Peterse
a6b0bd96c8
Support for parsing CSS class/ID selectors.
2014-10-07 22:57:23 +02:00
Yorick Peterse
6792127600
Reworked CSS parser rules.
...
This includes better rules for parsing separate path members, pseudo class
arguments and some changes to remove all remaining parsing conflicts.
2014-10-07 22:47:47 +02:00
Yorick Peterse
b40c0243ce
Tighten up lexing of CSS predicates.
...
Operators can now only occur inside predicates and any whitespace in these
predicates is ignored.
2014-10-07 22:17:04 +02:00
Yorick Peterse
625b9eeffd
Lexing of CSS axes with surrounding whitespace.
2014-10-07 22:06:45 +02:00
Yorick Peterse
619c0bbc14
Emit tokens for whitespace in the CSS lexer.
2014-10-07 21:55:41 +02:00
Yorick Peterse
6e18287a1d
Initial specs for parsing CSS IDs/classes.
2014-10-07 19:01:04 +02:00
Yorick Peterse
09315ea478
Test for operators inside CSS predicates.
2014-10-07 09:32:34 +02:00
Yorick Peterse
d960eb7cd5
Removed CSS lexer code that was commented out.
2014-10-07 09:29:11 +02:00
Yorick Peterse
16d66a7eb6
Better parsing for the nth-child pseudo class.
...
This uses stricter (and more correct) rules in both the lexer and the parser.
The resulting AST has also received a small rework to make it more compact and
less confusing.
2014-10-06 23:52:46 +02:00
Yorick Peterse
60da2bdd3a
Use RSpec.shared_example vs just shared_example.
2014-10-05 23:52:12 +02:00
Yorick Peterse
d0a8a3b18c
Basic support for parsing CSS pseudo classes.
...
This currently does not yet allow chained pseudo classes, nor does it allow for
pseudos such as nth-child(2n).
2014-10-05 23:46:41 +02:00
Yorick Peterse
e2b36ad9a4
Merge the CSS "expression" and "path" parser rules
2014-10-05 23:36:15 +02:00
Yorick Peterse
50ee66419e
Rename CSS "node operators" to "axes".
2014-10-05 23:33:46 +02:00
Yorick Peterse
197cb052be
Tighten up CSS predicate member rules.
...
CSS predicates can't contain full blown expressions, only attribute node tests
and operators.
2014-10-05 23:20:10 +02:00
Yorick Peterse
e1832adc97
Specs for parsing CSS node test wildcards.
2014-10-05 10:09:21 +02:00
Yorick Peterse
8fef62fca0
Support for parsing CSS operators.
2014-10-05 10:06:58 +02:00
Yorick Peterse
e03cd42735
Stricter lexing rules for XPath wildcards.
2014-10-05 09:57:25 +02:00
Yorick Peterse
2dd148539d
Parsing of CSS predicates.
...
This adds support for parsing expressions such as "foo[class]".
2014-10-05 09:32:21 +02:00
Yorick Peterse
665d5fe08c
Added basic specs for the CSS parser.
2014-10-05 01:28:31 +02:00
Yorick Peterse
773ff4ce45
Support for parsing multiple CSS node tests.
2014-10-05 01:28:19 +02:00
Yorick Peterse
b9a1f914bd
Basic CSS parser boilerplate.
...
This currently only parses single node tests (e.g. just "foo").
2014-10-02 23:32:07 +02:00
Yorick Peterse
4eea6d8359
Removed useless ivar in the XPath parser.
2014-10-02 22:52:39 +02:00
Yorick Peterse
cc3e752e1f
Removed custom AST::Node class.
...
Since this class did nothing other than extend AST::Node we might as well use
the latter.
2014-10-02 22:49:29 +02:00
Yorick Peterse
331d70e832
Corrected docs of the parse() helper method.
2014-10-02 22:41:22 +02:00
Yorick Peterse
73c5dbe636
Basic setup for lexing CSS pseudo selectors.
...
This includes support for the crazy 2n+1 syntax you can use with selectors such
as :nth-child().
CSS selectors: doing what XPath already does using an even crazier syntax,
because screw you.
2014-09-28 22:38:25 +02:00
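For readers unfamiliar with the 2n+1 syntax mentioned in the commit above: an
expression "an+b" selects every element whose 1-based position equals a*n + b
for some integer n >= 0. A minimal sketch of that rule follows. The helper
nth_match? is hypothetical and unrelated to the lexer code in this commit.

```ruby
# Hypothetical helper, not taken from the lexer itself.
# Returns true when the 1-based position equals a*n + b for some integer n >= 0.
def nth_match?(a, b, position)
  # With a == 0 the expression selects exactly one position.
  return position == b if a.zero?

  diff = position - b

  # diff must be a whole multiple of a, and the multiplier must not be negative.
  (diff % a).zero? && diff / a >= 0
end

p (1..6).select { |pos| nth_match?(2, 1, pos) }
# prints [1, 3, 5]
```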
Yorick Peterse
ea4a429430
Lexing of various CSS operators.
2014-09-28 22:38:25 +02:00
Yorick Peterse
059e797a42
Re-organized some of the CSS lexer tests.
2014-09-28 22:38:25 +02:00