Yorick Peterse
4fb7e2f6ce
Revamped ancestor/ancestor-or-self axis specs
This makes it easier to get more natural spec descriptions without
having to write them entirely by hand.
2015-08-26 22:35:13 +02:00
Yorick Peterse
a7744b7a5c
Use the XPath compiler for XPath/CSS specs
2015-08-19 20:14:20 +02:00
Yorick Peterse
afbb585812
Lexing support for unquoted HTML attribute values
This adds support for HTML such as:
<a href=foo>HTML is a child of Satan itself</a>
Fixes #94
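A minimal sketch of how unquoted attribute values can be matched alongside quoted ones. This is not Oga's actual lexer (which is not shown here); the `ATTRIBUTE` pattern and `lex_attributes` helper are hypothetical names used only for illustration.

```ruby
# Hypothetical pattern: an attribute name followed by a double-quoted,
# single-quoted, or unquoted value. An unquoted value runs until the next
# whitespace character or the closing ">".
ATTRIBUTE = /([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/

# Returns an Array of [name, value] pairs found in an opening tag.
def lex_attributes(tag)
  tag.scan(ATTRIBUTE).map do |name, double, single, bare|
    [name, double || single || bare]
  end
end

lex_attributes('<a href=foo class="bar">')
```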
2015-04-15 01:23:46 +02:00
Yorick Peterse
8acc7fc743
Lex CDATA tags in chunks
Instead of using a single token (T_CDATA) for a CDATA tag the lexer now
uses 3 tokens:
1. T_CDATA_START
2. T_CDATA_BODY
3. T_CDATA_END
The T_CDATA_BODY token can occur multiple times and is turned into a
single value in the XML parser. This is similar to the way strings are
lexed.
By changing the way CDATA tags are lexed Oga can now lex CDATA tags
containing newlines when using an IO as input. For example, this would
previously fail:
Oga.parse_xml(StringIO.new("<![CDATA[\nfoo]]>"))
Because IO input is read line by line, the lexer would receive the input
as follows:
"<![CDATA[\n"
"foo]]>"
Related issues: #93
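The chunked approach described above can be sketched roughly as follows. This is an illustration only, not Oga's lexer: the `ChunkedCdataLexer` class is hypothetical, it reads the IO one line at a time, and it ignores everything outside of CDATA tags.

```ruby
require 'stringio'

# Hypothetical lexer that emits start/body/end tokens for CDATA tags.
class ChunkedCdataLexer
  START_TAG = '<![CDATA['
  END_TAG   = ']]>'

  def initialize(io)
    @io       = io
    @in_cdata = false
  end

  # Returns an Array of [type, value] tokens. A CDATA body that spans
  # several lines produces one T_CDATA_BODY token per line.
  def lex
    tokens = []

    @io.each_line do |line|
      rest = line

      until rest.empty?
        if @in_cdata
          index = rest.index(END_TAG)

          if index
            tokens << [:T_CDATA_BODY, rest[0...index]] if index > 0
            tokens << [:T_CDATA_END, nil]
            rest      = rest[(index + END_TAG.length)..-1]
            @in_cdata = false
          else
            # No end tag on this line: the whole remainder is body text.
            tokens << [:T_CDATA_BODY, rest]
            rest = ''
          end
        else
          index = rest.index(START_TAG)

          if index
            tokens << [:T_CDATA_START, nil]
            rest      = rest[(index + START_TAG.length)..-1]
            @in_cdata = true
          else
            rest = '' # nothing of interest on the rest of this line
          end
        end
      end
    end

    tokens
  end
end

tokens = ChunkedCdataLexer.new(StringIO.new("<![CDATA[\nfoo]]>")).lex

# A parser would join the body chunks back into a single value.
body = tokens.select { |type, _| type == :T_CDATA_BODY }.map(&:last).join
```

Because the lexer keeps its "inside CDATA" state between lines, the newline after the start tag simply becomes its own body chunk instead of causing a failure.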
2015-04-14 22:45:55 +02:00
Yorick Peterse
67d7d9af88
Added thread-safe LRU class
This class will be used for storing parsed XPath/CSS ASTs.
See #71 for more information.
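For illustration, a thread-safe LRU cache can be sketched with a Hash (which preserves insertion order in Ruby) guarded by a Mutex. This is an assumption about the general technique, not the class added in this commit; `SimpleLRU` and its methods are hypothetical names.

```ruby
# Hypothetical thread-safe LRU cache. The least recently used key is always
# the first entry of the Hash, since keys are re-inserted on every access.
class SimpleLRU
  def initialize(maximum = 1024)
    @maximum = maximum
    @cache   = {}
    @mutex   = Mutex.new
  end

  # Returns the value for the key (or nil) and marks it as recently used.
  def [](key)
    @mutex.synchronize do
      if @cache.key?(key)
        value       = @cache.delete(key)
        @cache[key] = value
      end
    end
  end

  # Stores a value, evicting the least recently used entry when full.
  def []=(key, value)
    @mutex.synchronize do
      @cache.delete(key)
      @cache[key] = value
      @cache.shift if @cache.size > @maximum
    end
  end

  def keys
    @mutex.synchronize { @cache.keys }
  end

  def size
    @mutex.synchronize { @cache.size }
  end
end

cache = SimpleLRU.new(2)

cache['//a'] = :ast_a
cache['//b'] = :ast_b
cache['//a']          # touch "//a" so that "//b" becomes the oldest entry
cache['//c'] = :ast_c # evicts "//b"
```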
2015-03-23 00:21:52 +01:00
Yorick Peterse
45d84d31da
Renamed rspec helper files
2015-03-22 22:50:03 +01:00
Yorick Peterse
006ef4d51a
Port over most of the old XML error handling.
Some messages are a bit different due to ruby-ll's error handling; other than
that, it's largely the same as before.
2015-03-21 01:22:59 +01:00
Yorick Peterse
d5002010fe
Removed RSpec shared examples.
2014-11-10 00:06:26 +01:00
Yorick Peterse
3893e56ca8
Rewrote XPath evaluator paths spec.
This is the first of many specs that will be rewritten. Eventually this will
remove the need for the shared examples, as well as a lot of code
duplication and odd context blocks.
2014-11-09 18:47:20 +01:00
Yorick Peterse
1f3b4cb2fb
Added initial CSS evaluation tests.
2014-11-04 23:34:01 +01:00
Yorick Peterse
d4150fd0f5
First step at rewriting the CSS parser.
The new setup will not involve a separate transformation stage; instead, the CSS
parser will directly emit an XPath AST. This reduces the overhead needed for
parsing/evaluating CSS selectors while also simplifying the code. The downside
is that I basically have to rewrite 80% of the parser.
2014-10-20 00:30:16 +02:00
Yorick Peterse
7ccd685acb
Use a helper method for transforming CSS ASTs.
2014-10-16 23:01:56 +02:00
Yorick Peterse
60da2bdd3a
Use RSpec.shared_example vs just shared_example.
2014-10-05 23:52:12 +02:00
Yorick Peterse
665d5fe08c
Added basic specs for the CSS parser.
2014-10-05 01:28:31 +02:00
Yorick Peterse
cc3e752e1f
Removed custom AST::Node class.
Since this class did nothing other than extend AST::Node we might as well use
the latter.
2014-10-02 22:49:29 +02:00
Yorick Peterse
331d70e832
Corrected docs of the parse() helper method.
2014-10-02 22:41:22 +02:00
Yorick Peterse
aa60115c0a
Basic boilerplate for lexing CSS selectors.
2014-09-28 22:38:24 +02:00
Yorick Peterse
a0ecba6321
Support for the XPath child axis.
2014-07-22 21:25:02 +02:00
Yorick Peterse
580856dcf7
Cleaned up XPath specs using a shared example.
2014-07-15 09:34:11 +02:00
Yorick Peterse
21a0f50457
Added groups for code coverage results.
2014-06-23 09:43:41 +02:00
Yorick Peterse
eba2d9954d
Support for parsing basic XPath expressions.
2014-06-12 00:20:46 +02:00
Yorick Peterse
8dd8d7a519
Basic working XPath lexer.
This doesn't lex everything in the XPath specification yet and needs more
tests.
2014-06-01 19:24:35 +02:00
Yorick Peterse
8237d5791d
Stream tokens when lexing.
Instead of returning the tokens as a whole they are now streamed using
XML::Lexer#advance. This method returns the next token upon every call. It uses
a small buffer in case a particular block of text results in multiple tokens.
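The streaming idea can be sketched as below. This is not the actual `XML::Lexer#advance` implementation; the `StreamingLexer` class, its toy `tokenize` method, and the token names are hypothetical and only show how a small buffer lets one block of text yield several tokens across successive calls.

```ruby
# Hypothetical streaming lexer: tokens are handed out one at a time.
class StreamingLexer
  # chunks is an Array of Strings, e.g. the lines of a document.
  def initialize(chunks)
    @chunks = chunks.each # an Enumerator we can pull from lazily
    @buffer = []
  end

  # Returns the next [type, value] token, or nil once the input is exhausted.
  def advance
    while @buffer.empty?
      begin
        chunk = @chunks.next
      rescue StopIteration
        return nil
      end

      # A single chunk may produce multiple tokens; keep them in the buffer.
      @buffer.concat(tokenize(chunk))
    end

    @buffer.shift
  end

  private

  # Toy tokenizer: anything between "<" and ">" is a tag, the rest is text.
  def tokenize(chunk)
    chunk.scan(/<[^>]+>|[^<]+/).map do |text|
      text.start_with?('<') ? [:T_TAG, text] : [:T_TEXT, text]
    end
  end
end

lexer = StreamingLexer.new(['<a>foo</a>', 'bar'])
first = lexer.advance
```

A caller would keep invoking `advance` until it returns nil, so the full token list never has to be held in memory at once.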
2014-04-09 22:08:13 +02:00
Yorick Peterse
cb74c7edf9
Specs for XML parser errors.
2014-04-07 21:31:36 +02:00
Yorick Peterse
79818eb349
Added a convenience class for parsing HTML.
This removes the need for users to set the `:html` option themselves.
2014-03-25 09:40:24 +01:00
Yorick Peterse
eae13d21ed
Namespaced the lexer/parser under Oga::XML.
With the upcoming XPath and CSS selector lexers/parsers it will be confusing to
keep these in the root namespace.
2014-03-25 09:34:38 +01:00
Yorick Peterse
8d3f3f15d7
Renamed parse_html() to parse().
2014-03-16 23:46:20 +01:00
Yorick Peterse
cb75edc30d
Basic support for lexing/parsing HTML5.
This will need a bunch of extra tests before I'll consider closing #7.
2014-03-16 23:42:24 +01:00
Yorick Peterse
8ce76be050
Moved the parser class to Oga::Parser.
Oga will use the same parser for XML and HTML so it doesn't make sense to
separate the two into different namespaces (at least for now).
2014-03-11 22:01:50 +01:00
Yorick Peterse
2c82f88f6c
Basic lexing + parsing of doctypes.
We're doing these the lazy way. I can't be bothered writing patterns/rules for
4 different formats for something such as doctypes.
2014-02-27 01:27:51 +01:00
Yorick Peterse
d32888f803
Basic lexer setup/tests.
Too lazy to do this the right way. ᕕ(ᐛ)ᕗ
2014-02-26 21:36:30 +01:00
Yorick Peterse
702477ca28
Basic project layout.
2014-02-26 19:50:16 +01:00