Compare commits

81 Commits
v2.4 ... master

Author SHA1 Message Date
rulingcom dbcf687fdd test 2024-02-15 14:17:27 +08:00
rulingcom 47238a8c7b Support older ruby 2.1. 2024-02-15 13:07:07 +08:00
Yorick Peterse c4efbcec1b
Release 3.4 2022-08-02 16:18:22 +02:00
lulalala 0a9d6302c3 Take ownership when children nodes are assigned 2022-08-02 21:17:33 +08:00
lulalala 36c11b2712 Update owner when assign children 2022-07-13 21:34:08 +08:00
Yorick Peterse cf27b764e8
Release 3.3 2020-07-27 02:20:04 +02:00
Roy Zwambag 2142243227 Add to_s as an alias to the to_xml method 2020-07-27 00:14:36 +00:00
Roy Zwambag 5bdd0207a0 Small fixes in CONTRIBUTING.md
- Change the links to working links.
- Rename "pull request" to "merge request".
2020-07-25 21:41:28 +02:00
Yorick Peterse b7daee79de
Updated outdated parts of the CONTRIBUTING guide 2020-02-20 17:33:43 +01:00
Yorick Peterse 6736bcaeba
Release 3.2 2020-01-10 16:03:15 +01:00
Yorick Peterse 7f5c0dc8b0
Add CI config for Ruby 2.7 2020-01-10 15:41:37 +01:00
Kitaiti Makoto b9bcd21b2b Distinguish hash argument and keyword argument in arguments of XML::Querying#xpath 2020-01-10 20:18:48 +09:00
Kitaiti Makoto 804f755101 Pass keyword argument to XML::Querying#xpath implicitly 2020-01-10 20:17:16 +09:00
Yorick Peterse ac3fe8f343
Release 3.1 2020-01-08 13:57:23 +01:00
Yorick Peterse f00fa40e3a
Make PUBLIC/SYSTEM matching case-insensitive
Some websites may use "public" or "system" in doctypes, or completely
messed up casing such as PuBlIc (unlikely, but possible). This ensures
we don't care about the exact casing used.

This fixes https://gitlab.com/yorickpeterse/oga/issues/199
2020-01-08 03:23:46 +01:00
Kitaiti Makoto 10e9101c42 Write document about XML::Querying#xpath's namespace argument 2019-12-03 20:13:32 +09:00
Yorick Peterse f4832339b2
Release 3.0 2019-12-03 02:06:12 +01:00
Yorick Peterse bf44e357e4
Drop support for Rubinius
Rubinius hasn't been tested on for years, nor is it really relevant as a
Ruby implementation these days. While Oga probably still works fine on
Rubinius, I don't want to claim we support something when we officially
don't.
2019-12-03 01:34:26 +01:00
Yorick Peterse 82373d164f
Bump MRI requirements to 2.3.0
See https://gitlab.com/yorickpeterse/oga/issues/196 for more
information.
2019-12-03 01:33:52 +01:00
Yorick Peterse e413165afd
Release 2.17 2019-12-02 19:32:43 +01:00
KitaitiMakoto 95da93949b Fix XPath queries using default XML namespace
This fixes https://gitlab.com/yorickpeterse/oga/issues/195
2019-12-02 17:40:11 +00:00
Yorick Peterse d492a775bf
Release 2.16 2019-11-29 15:43:39 +01:00
KitaitiMakoto 977bd594c8 Add support for XPath namespace aliases
This fixes https://gitlab.com/yorickpeterse/oga/issues/176
2019-11-29 14:21:45 +00:00
Kitaiti Makoto da9721cb34 Suppress deprecation warning on RDoc
> NOTE: Gem::Specification#has_rdoc= is deprecated with no replacement. It
> will be removed on or after 2018-12-01.
2019-11-05 16:12:31 +00:00
KitaitiMakoto d9e7346b60 Fix older option for gem command 2019-11-05 12:38:17 +00:00
Yorick Peterse e086515e59
Release 2.15 2018-04-11 21:42:08 +02:00
Yorick Peterse 8ac0055e42
Allow "th" to occur in thead, tbody, and tfoot
Fixes https://gitlab.com/yorickpeterse/oga/issues/190
2018-04-11 21:32:30 +02:00
Yorick Peterse 6c10f41446
Release 2.14 2018-01-30 22:58:43 +01:00
David Cornu bc87711f9c Return an Enumerator from each* methods when no block is given 2018-01-29 13:12:42 -05:00
Yorick Peterse 0b7b54119b
Release 2.13 2018-01-05 10:29:42 +01:00
Yorick Peterse 886a160c6a
Strip leading/trailing whitespace from CSS exprs
When tokenising CSS expressions we now strip leading and trailing
whitespace from the input string. This is performed without any checks
as a check + `String#strip` ended up being slower compared to just
running `String#strip`. On top of that we cache expressions anyway, so
the overhead of `String#strip` is very small.

Fixes https://gitlab.com/yorickpeterse/oga/issues/187
2018-01-04 22:02:50 +01:00
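The reasoning in the commit above (strip unconditionally, rely on the cache to keep the cost small) can be sketched roughly as follows. This is only an illustration: `SelectorCache`, `tokens_for`, and the whitespace-based "tokenising" are hypothetical stand-ins, not Oga's lexer or its cache.

```ruby
# Hypothetical sketch: strip the expression without checking first, and
# memoise the result so the work is paid once per distinct input string.
class SelectorCache
  def initialize
    @cache = {}
  end

  # Returns the (fake) token list for a CSS expression.
  def tokens_for(expression)
    @cache[expression] ||= expression.strip.split(/\s+/)
  end
end

cache = SelectorCache.new
first = cache.tokens_for("  div p \n")
second = cache.tokens_for("  div p \n")
```

A second call with the same string returns the cached array, so `String#strip` is not run again for that input.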
Yorick Peterse d1336e760a
Disable appveyor Email notifications 2018-01-02 13:23:22 +01:00
Yorick Peterse 952779da39
Use msys64 for installing Ragel 2018-01-02 13:14:55 +01:00
Yorick Peterse a5cb9887b0
Try adding vcpkg bin/ to the PATH 2018-01-02 13:06:06 +01:00
Yorick Peterse 5d693134d7
Try using vcpkg for installing Ragel 2018-01-02 12:44:07 +01:00
Vidur Murali 84748a8d85 Fix typo in gemspec homepage URL 2018-01-02 16:20:52 +05:30
Yorick Peterse e40bd38384
Fixed CHANGELOG typo 2017-12-29 20:42:06 +01:00
Yorick Peterse db00fcdd55
Release 2.12 2017-12-29 20:40:24 +01:00
Yorick Peterse f574197ea6
Ignore nested element start tags
This ensures that Oga is able to tokenize input such as the following:

    <script<script>foo</script>

Oga will now treat this as:

    <script>foo</script>

This is based on libxml behaviour, which seems to differ a bit from
Chromium which treats the node as a text node. This however would
require complex look-ahead logic (as far as I can tell) that I really
don't want to implement in Oga.

Fixes #186
2017-12-28 16:12:20 +01:00
Yorick Peterse 1e002de527
Update CI links
[ci skip]
2017-11-02 14:31:43 +01:00
Yorick Peterse e6eaff1a28
Fix YAML for JRuby builds 2017-11-02 14:13:18 +01:00
Yorick Peterse 11e83f911b
Try to install openjdk for JRuby
This is necessary so "javac" is available.
2017-11-02 14:07:51 +01:00
Yorick Peterse dea9bafee1
Use Ragel without a version
Apparently on some platforms (e.g. the JRuby build) 6.9 breaks things.
2017-11-02 13:24:13 +01:00
Yorick Peterse 9cfd628b55
Install build-base on Alpine
This is necessary to get programs such as "make".
2017-11-02 13:13:57 +01:00
Yorick Peterse 54e3115607
Bundle without nproc on CI
nproc is in coreutils which isn't installed by default. Since the
Gemfile is so small there's no real benefit to using -jX anyway.
2017-11-02 12:56:53 +01:00
Yorick Peterse 794291990e
Use ragel 6.9-r0
Maybe this will work, because apparently just using "ragel=6.9" refuses
to install.
2017-11-02 12:41:46 +01:00
Yorick Peterse de2166eb40
Don't use sudo for Alpine 2017-11-02 12:29:47 +01:00
Yorick Peterse b248cc7c0f
Updated Appveyor config 2017-11-02 12:21:02 +01:00
Yorick Peterse e811b511b0
Fixed the GitLab CI YAML 2017-11-02 02:39:40 +01:00
Yorick Peterse 5d1d7fd1d8
Move to GitLab and GitLab CI 2017-11-02 02:13:00 +01:00
Yorick Peterse f85869ecab
Release 2.11 2017-09-07 00:11:24 +02:00
Loic Nageleisen 39bf7ffaeb Silence method redefinition warnings
As the community progressively moves to a useful practice of enabling
ruby warnings on tests, knowingly redefining a method produces a
distracting warning that has to be special-cased when running automated
tests. We thus skip dynamic definitions of methods we know will be
redefined right after.
2017-09-07 00:03:36 +02:00
Loic Nageleisen 151788abad Silence uninitialized variable warnings
As the community progressively moves to a useful practice of enabling
ruby warnings on tests, assigning an instance variable before use becomes
a necessary practice. Here we set some variables at initialization that
were previously lazily or conditionally set:

- `decoded` is assigned false which seems to make more semantic sense
  than using nil
- `namespace` is assigned nil, its value being lazily computed later
- `available_namespaces` is assigned nil so as to respect the cache
  invalidation mechanism
2017-09-07 00:03:36 +02:00
hinamiyagk ef1b8d2a28 Fix typo in README 2017-07-11 13:03:52 +02:00
Yorick Peterse d1f46e289c
Reworked the contributing guide a bit
A lot of the stuff in it was rather blunt.
2017-07-07 00:03:48 +02:00
Yorick Peterse 2710976e48
Added note about wanting more patches 2017-07-03 17:58:37 +02:00
Yorick Peterse b5848b07a9
Rename COC.md to make GitHub happy
GitHub only detects CODE_OF_CONDUCT.md.
2017-06-20 17:01:18 +02:00
Yorick Peterse 6f747656b6
Use RSpec 3 expect syntax for tests
This should make it a little bit easier for others to contribute.
2017-06-17 13:52:43 +02:00
Yorick Peterse 8282325569
Don't warn for implicit fallthroughs
This is the result of Ragel output which we can't control.
2017-06-17 13:46:59 +02:00
Yorick Peterse 8dc2318020
Release 2.10 2017-04-18 12:56:24 +02:00
Yorick Peterse 84c4db3e9f
Clean up changes from PR #174 2017-04-18 12:55:11 +02:00
PikachuEXE 4250033ed5 Update CHANGELOG
[ci skip]
2017-04-18 12:51:34 +02:00
PikachuEXE 21b5eeec4b Fix using symbol on Element#attribute always getting nil 2017-04-18 12:51:34 +02:00
Yorick Peterse e9953d4212
Release 2.9 2017-02-10 15:59:15 +01:00
Yorick Peterse 673f4a29db
Use HTML5 style closing tags for void elements
This ensures that element tags such as <img> tags don't use a closing />
when documents are parsed as HTML documents.

Fixes #170
2017-02-10 15:24:41 +01:00
Yorick Peterse 131fba7aed
Doctype inherits from Node
This makes it possible to parse documents where a doctype resides in a
node, instead of being located at the root.

Fixes #169
2017-02-10 15:10:30 +01:00
Yorick Peterse b13cfdfea5
Release 2.8 2017-01-04 12:28:37 +01:00
Yorick Peterse 5b4d295912
Nuke old Rubies from CI configuration 2017-01-04 00:12:10 +01:00
Po Shan Cheah c75ca96d22 Ruby 2.4 Fixnum deprecation
In Ruby 2.4, Fixnum is deprecated and the following produces a warning:

$ irb
irb(main):001:0> puts RUBY_VERSION
2.4.0
=> nil
irb(main):002:0> require 'oga'
=> true
irb(main):003:0> Oga.parse_xml('<people><person>Alice</person></people>').css(':nth-child(1)')
/Users/pcheah/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/oga-2.7/lib/oga/xpath/conversion.rb:79: warning: constant ::Fixnum is deprecated
=> NodeSet(Element(name: "people" children: NodeSet(Element(name: "person" children: NodeSet(Text("Alice"))))), Element(name: "person" children: NodeSet(Text("Alice"))))
irb(main):004:0>
$

So oga/xpath/conversion.rb needs to test for Integer instead of Fixnum.

Also, it appears that json 1.8.3 no longer builds under Ruby 2.4, so the gemfile has to be upgraded to json 2.0.
2017-01-03 21:38:04 +01:00
Yorick Peterse f5370c35d2
Release 2.7 2016-09-27 11:54:27 +02:00
Yorick Peterse e0e0687dc2
Generating closing element & Doctype XML
This commit fixes two problems:

1. Doctypes introducing too many newlines
2. Elements with siblings and a common parent not being closed properly

== Doctypes

When generating the XML for a doctype the XML::Generator class would
append a trailing newline. This however meant that if the next text node
was also a newline you'd now have two newlines. In previous versions of
Oga this worked because the old XML generation code would call
String#strip on the XML to add after the doctype.

To support this in the new version we perform a lookahead in
XML::Generator#on_doctype to remove any trailing newlines added by this
method in case the first child node is a newline text node.

== Closing Elements

When an element has a sibling following it _and_ does not have any child
nodes it would not be closed properly when generating XML. This is due
to the "until next_node = ..." expression evaluating to true, thus never
executing its body.

There's probably some way to work around this by using the "loop"
method, but considering it's 02:09 I think the current approach is good
enough. Future me will probably hate me for it.
2016-09-27 02:10:16 +02:00
Yorick Peterse 01fa1513f4
Lexing of processing instructions with namespaces
This adds lexing/parsing support for processing instructions that
contain namespace prefixes such as `<?foo:bar ?>`.
2016-09-17 14:51:48 +02:00
Yorick Peterse 116b9b0ceb
Make XmlDeclaration a ProcessingInstruction
This allows Oga to parse documents that contain an XML declaration at a
place other than at the document root. Oga still only assigns the XML
declaration to the document whenever it is at the top-level. This
matches libxml/XML specification behaviour as far as I can tell.
2016-09-17 14:39:07 +02:00
Scott Wheeler d40baf0c72 Add aliases for accessing attributes via [] and []=
This also fixes accessing attributes via symbol name and tests to ensure
that such does not break in the future.
2016-09-14 15:21:46 +03:00
Yorick Peterse b8fd8670df
Release 2.6 2016-09-10 02:50:02 +02:00
Yorick Peterse 38284278d5
Don't process siblings when reaching a root node
When generating XML we should not process the siblings of a root node.
Doing so results in invalid XML being returned (due to siblings not
being children of the root node).

Not processing the siblings in this case also prevents the siblings loop
from getting stuck. To explain what's happening, let's assume we're
using the following document tree:

    Document
      |_ Text
      |_ Element

Now let's say we take the Text node and call "to_xml" on it. When we
start the loop we'll run into the following code:

    if child_node = children && current.children[0]
      current = child_node
    else

Here the if statement will evaluate to false because a Text node doesn't
have any child nodes, as such we enter the else branch. We now reach the
following code:

    until next_node = current.is_a?(Node) && current.next

A Text node is a descendant of Node and it happens to have another node
(the Element node) as the next sibling. As a result we enter the `until`
loop's body. We now run into this code:

    if current.is_a?(Node) && current != @start
      current = current.parent
    end

Here `current` is still our Text node and it is the @start node. As a
result the `current` re-assignment won't be evaluated.

Next we run into the following:

    after_element(current, output) if current.is_a?(Element)

    break if current == @start

The first line will not evaluate because `current` is still the `Text`
node.  The `break` *will* evaluate because `current` is the same as
@start.

This will then lead to the following code being executed:

    current = next_node

Here `next_node` is the next sibling of the Text node, which in the
above example is the Element node.

Because all of the above runs in a `while` loop we'll at some point end
up again at the start of the `until` loop. At this point the `current`
variable contains an `Element`. Because this node does *not* have a node
following it we'll once again enter the `until` loop's body.

This loop will now get stuck because `current` is a Node, it's not the
same as @start, thus `current` is set to its parent (the Document),
which also isn't the same as @start.

On the next iteration this loop will break because `current` is no
longer a node. However, because a Document _does_ have child nodes the
whole process of traversing children/siblings will keep repeating itself
forever.

To work around this we now use the following statement:

    if child_node = children && current.children[0]
      ...
    elsif current == @start
      after_element(current, output) if current.is_a?(Element)

      break
    else
      until next_node = current.is_a?(Node) && current.next
      ...
    end

This prevents processing of any siblings once we have reached the root
node, in turn preventing the loop getting stuck forever.

I'm willing to bet there are probably a few more edge cases, but I can't
think of any others at the moment.

Fixes #161
2016-09-10 02:49:05 +02:00
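The idea behind this fix, stopping as soon as the start node has been closed instead of moving on to its siblings, can be illustrated with a small stack-free traversal. This is a simplified sketch under stated assumptions: `SketchNode`, `build`, and `serialize` are hypothetical names, and the loop structure differs from the `while`/`until` code quoted in the commit message.

```ruby
# Hypothetical node type for illustration; Oga's real node classes differ.
SketchNode = Struct.new(:name, :children, :parent, :next_sibling)

# Builds a node and wires up parent and next-sibling links for its children.
def build(name, *children)
  node = SketchNode.new(name, children, nil, nil)
  children.each_with_index do |child, index|
    child.parent = node
    child.next_sibling = children[index + 1]
  end
  node
end

# Serialises the subtree rooted at `start` without recursion.
def serialize(start)
  output = String.new
  current = start

  loop do
    output << "<#{current.name}>"

    # Descend into the first child when there is one.
    if (child = current.children.first)
      current = child
      next
    end

    # Otherwise close nodes while climbing, until a sibling is found.
    loop do
      output << "</#{current.name}>"

      # Never look at the siblings or parent of the start node.
      return output if current.equal?(start)

      if current.next_sibling
        current = current.next_sibling
        break
      end

      current = current.parent
    end
  end
end

b = build('b')
a = build('a', b)
c = build('c')
root = build('root', a, c)

serialize(root) # => "<root><a><b></b></a><c></c></root>"
serialize(a)    # => "<a><b></b></a>"
```

In the second call `a` has a following sibling (`c`), but the identity check against `start` returns before that sibling is visited.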
Yorick Peterse 7aa34fd192
Removed max width from the YARD output
YARD apparently switched layout to something that doesn't like this, so
let's just get rid of it.
2016-09-06 22:42:57 +02:00
Yorick Peterse 7a8220ae78
Remove unnecessary use of Object#send 2016-09-06 22:37:30 +02:00
Yorick Peterse a6cd19933d
Fixed some YARD markup in XML::Generator 2016-09-06 22:32:10 +02:00
Yorick Peterse dd554f31e7
Release 2.5 2016-09-06 22:30:50 +02:00
Yorick Peterse 68f1f9f660
Relax parsing of XML doctypes
This allows the parser to parse doctypes that contain a mixture of
names, public IDs, inline rules, etc.

Fixes #159
2016-09-06 22:25:22 +02:00
299 changed files with 3168 additions and 2469 deletions

.gitlab-ci.yml Normal file

@ -0,0 +1,41 @@
---
.defaults: &defaults
  before_script:
    - apk add --update ragel build-base
    - if [ "$INSTALL_OPENJDK" == "true" ]; then apk add openjdk8; fi
    - gem install bundler --no-document
    - ruby --version
    - gem --version
    - bundle --version
    - bundle install --path vendor --retry=3
  script:
    - bundle exec rake
  cache:
    paths:
      - vendor/ruby

Ruby 2.3:
  image: "ruby:2.3-alpine"
  <<: *defaults

Ruby 2.4:
  image: "ruby:2.4-alpine"
  <<: *defaults

Ruby 2.5:
  image: "ruby:2.5-alpine"
  <<: *defaults

Ruby 2.6:
  image: "ruby:2.6-alpine"
  <<: *defaults

Ruby 2.7:
  image: "ruby:2.7-alpine"
  <<: *defaults

JRuby 9.1:
  image: "jruby:9.1-alpine"
  variables:
    INSTALL_OPENJDK: "true"
  <<: *defaults


@ -1,54 +0,0 @@
---
language: ruby
script: bundle exec rake
sudo: false
addons:
  apt:
    packages:
      - ragel
before_install:
  - if [ "$TRAVIS_OS_NAME" == "osx" ]; then brew update; fi
  - if [ "$TRAVIS_OS_NAME" == "osx" ]; then brew install ragel; fi
install:
  - bundle install --retry=3
rvm:
  - jruby
  - 1.9.3
  - 2.0.0
  - 2.1
  - 2.2
  - 2.3.1
  - rbx
matrix:
  allow_failures:
    - rvm: rbx
      os: osx
  exclude:
    # Binaries for these rubies aren't available on OS X :<
    - rvm: 2.2
      os: osx
    - rvm: 2.3.1
      os: osx
    - rvm: jruby
      os: osx
  fast_finish: true
notifications:
  email:
    recipients:
      - yorickpeterse@gmail.com
    on_success: change
    on_failure: change
cache: bundler
os:
  - linux
  - osx


@ -3,6 +3,165 @@
This document contains details of the various releases and their release dates.
Dates are in the format `yyyy-mm-dd`.
## 3.4 - 2022-08-02
This release includes a change that when setting the child nodes of a node A,
node A takes ownership over the entire new child tree. See merge request
https://gitlab.com/yorickpeterse/oga/-/merge_requests/194 for more details.
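
As a rough illustration of what "taking ownership" on assignment can mean, the sketch below re-parents every newly assigned child. `TreeNode` and its methods are hypothetical and much simpler than Oga's node and node set classes; see the merge request for the actual behaviour.

```ruby
# Hypothetical minimal tree node; not Oga's actual implementation.
class TreeNode
  attr_reader :name, :children
  attr_accessor :parent

  def initialize(name)
    @name = name
    @parent = nil
    @children = []
  end

  # Assigning children detaches the old ones and makes this node the
  # parent of every new child.
  def children=(nodes)
    @children.each { |child| child.parent = nil }
    @children = nodes.to_a
    @children.each { |child| child.parent = self }
  end
end

node_a = TreeNode.new('a')
node_b = TreeNode.new('b')
node_c = TreeNode.new('c')

node_a.children = [node_b, node_c]
node_a.children = [node_c]
```

After the second assignment `node_c` still points at `node_a`, while `node_b` no longer has a parent.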
## 3.3 - 2020-07-27
This release adds `to_s` as an alias for `to_xml`, thanks to Roy Zwambag. See
merge request https://gitlab.com/yorickpeterse/oga/-/merge_requests/192 for more
information.
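
A minimal sketch of the aliasing technique, using a hypothetical `Fragment` class rather than Oga's own nodes: once `to_s` is an alias for `to_xml`, string interpolation and `puts` produce the XML form.

```ruby
# Hypothetical class for illustration only.
class Fragment
  def initialize(name, text)
    @name = name
    @text = text
  end

  def to_xml
    "<#{@name}>#{@text}</#{@name}>"
  end

  # Make to_s return the same String as to_xml.
  alias_method :to_s, :to_xml
end

fragment = Fragment.new('p', 'hello')
interpolated = "#{fragment}" # => "<p>hello</p>"
```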
## 3.2 - 2020-01-10
This release fixes a few warnings that would show up when using Oga on Ruby
2.7.0. See https://gitlab.com/yorickpeterse/oga/merge_requests/190 for more
information.
## 3.1 - 2020-01-08
This release fixes a bug in the XML lexer that prevented the parsing of doctypes
using "public" or "system" instead of "PUBLIC"/"SYSTEM". See issue
<https://gitlab.com/yorickpeterse/oga/issues/199> for more information.
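
To illustrate the idea only: the sketch below matches the `PUBLIC`/`SYSTEM` keyword regardless of casing using a plain Ruby regular expression. `DOCTYPE_KIND` and `doctype_kind` are hypothetical names and this is not how Oga's lexer is implemented.

```ruby
# Only the keyword group is case-insensitive, via the inline (?i:...) flag.
DOCTYPE_KIND = /\A<!DOCTYPE\s+\S+\s+((?i:PUBLIC|SYSTEM))\b/

# Returns "PUBLIC", "SYSTEM", or nil when neither keyword is present.
def doctype_kind(input)
  match = DOCTYPE_KIND.match(input)
  match && match[1].upcase
end

doctype_kind('<!DOCTYPE html PuBlIc "-//W3C//DTD HTML 4.01//EN">') # => "PUBLIC"
doctype_kind('<!DOCTYPE note system "note.dtd">')                  # => "SYSTEM"
doctype_kind('<!DOCTYPE html>')                                    # => nil
```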
## 3.0 - 2019-12-03
This release bumps the Ruby version requirement to Ruby 2.3.0, as we haven't
supported older versions for several years now. We also no longer officially
support Rubinius.
## 2.17 - 2019-12-02
Elements using the default XML namespace can now be queried using XPath queries,
which was broken for quite a while.
See commit <https://gitlab.com/yorickpeterse/oga/commit/95da93949bf613612981f5cd7decc0d2c2a60e15>
for more information.
## 2.16 - 2019-11-29
* XPath namespace aliases can now be used when querying elements using XPath
expressions.
* Several RDoc and RubyGems deprecation warnings have been resolved.
See the following commits for more information:
* <https://gitlab.com/yorickpeterse/oga/commit/d9e7346b60c3afa2b3e83a240f9807c6bb819d48>
* <https://gitlab.com/yorickpeterse/oga/commit/da9721cb34f91527e72b096c6bd6a128e37b1992>
* <https://gitlab.com/yorickpeterse/oga/commit/977bd594c8bfd1a29aeba9d3a4ab7d0ebbc7d11a>
## 2.15 - 2018-04-11
The HTML parser now allows `th` elements to occur in `thead`, `tbody`, and
`tfoot` elements. See issue <https://gitlab.com/yorickpeterse/oga/issues/190>
for more information.
## 2.14 - 2018-01-30
Various methods that yield a block now return an Enumerator when no block is
provided. See merge request
<https://gitlab.com/yorickpeterse/oga/merge_requests/184> for more information.
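
The usual Ruby idiom for this is `to_enum` guarded by `block_given?`. The sketch below shows it on a hypothetical `NodeList` class; which Oga methods received this treatment is described in the merge request, not here.

```ruby
# Hypothetical collection class for illustration only.
class NodeList
  def initialize(*nodes)
    @nodes = nodes
  end

  def each
    # Without a block, hand back an Enumerator over this same method.
    return to_enum(:each) unless block_given?

    @nodes.each { |node| yield node }
    self
  end
end

list = NodeList.new('a', 'b', 'c')
enum = list.each
upcased = enum.map(&:upcase)        # => ["A", "B", "C"]
indexed = list.each.with_index.to_a # => [["a", 0], ["b", 1], ["c", 2]]
```

Returning an Enumerator lets callers chain `map`, `with_index`, and similar methods without the class having to include `Enumerable` itself.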
## 2.13 - 2018-01-05
Leading and trailing whitespace is now removed from CSS selectors. See
<https://gitlab.com/yorickpeterse/oga/merge_requests/183> for more information.
## 2.12 - 2017-12-29
Element start tags containing other start tags (e.g. `<script<script>`) are now
parsed correctly.
See f574197ea657cf09405336ca618a22e32c94d0d0 for more information.
## 2.11 - 2017-09-07
Various Ruby warnings have been resolved by Loic Nageleisen. See pull request
<https://gitlab.com/yorickpeterse/oga/pull/180> for more information.
## 2.10 - 2017-04-18
### Fix `Element#attribute` for HTML documents when using Symbol arguments
You can now pass a Symbol to `Oga::XML::Element#attribute` for both XML and HTML
documents, previously this only worked for XML documents. See
[PR #174](https://gitlab.com/yorickpeterse/oga/pull/174) for more information.
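
One common way to make a lookup accept both Strings and Symbols is to normalise the argument with `to_s` first. The sketch below assumes attributes stored under String keys in a hypothetical `SimpleElement` class; the actual fix is in the PR linked above.

```ruby
# Hypothetical element with String-keyed attributes; not Oga's class.
class SimpleElement
  def initialize(attributes)
    @attributes = attributes
  end

  # Accepts either a String or a Symbol name.
  def attribute(name)
    @attributes[name.to_s]
  end
end

element = SimpleElement.new('class' => 'header')
element.attribute('class') # => "header"
element.attribute(:class)  # => "header"
```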
## 2.9 - 2017-02-10
### Closing tags for HTML void elements
Certain HTML elements such as `<img>` and `<link>` (called "void elements" in
Oga) are now closed using a `>` tag instead of `/>`. In other words, instead of
outputting `<img src="..." />` Oga now outputs `<img src="...">`.
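
The difference between the two output styles can be sketched as below. `render_void_element` is a hypothetical helper, not part of Oga's API, and attribute values are not escaped here.

```ruby
# Hypothetical helper: renders an element that never has children.
# HTML output closes the tag with ">", XML output with " />".
def render_void_element(name, attributes, html:)
  attrs = attributes.map { |key, value| %( #{key}="#{value}") }.join

  html ? "<#{name}#{attrs}>" : "<#{name}#{attrs} />"
end

render_void_element('img', { 'src' => 'a.png' }, html: true)  # => '<img src="a.png">'
render_void_element('img', { 'src' => 'a.png' }, html: false) # => '<img src="a.png" />'
```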
### Doctypes are now Nodes
Each Doctype now inherits from `Oga::XML::Node`. This makes it possible to parse
documents where a doctype is located in a child node. However, in these cases
Oga will _not_ populate `Oga::XML::Document#doctype` as this can not be done in
an efficient way.
## 2.8 - 2017-01-04
Ruby 2.4 deprecates Fixnum in favour of Integer, producing warnings whenever
Fixnum is used. Oga 2.8 contains a fix contributed by Po Shan Cheah to remove
these deprecation warnings. See commit c75ca96d229a50b369e16057622255a674f2cabc
for more information.
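
A type check against `Integer` behaves the same before and after Ruby 2.4, because on older versions `Fixnum` and `Bignum` are subclasses of `Integer`. The sketch below shows such a check in a hypothetical `integer_to_float` helper; it is not the code from `oga/xpath/conversion.rb`.

```ruby
# Hypothetical helper: converts whole numbers to Floats and leaves every
# other value untouched. Checking Integer avoids referencing Fixnum.
def integer_to_float(value)
  value.is_a?(Integer) ? value.to_f : value
end

integer_to_float(3)     # => 3.0
integer_to_float(2**80) # a Float; large integers are covered as well
integer_to_float('3')   # => "3"
```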
## 2.7 - 2016-09-27
### Closing Elements When Generating XML
When generating XML Oga now properly closes elements with siblings but without
children. See commit e0e0687dc29427c854c9fa6d3c19cee1c04f92c7 for more
information.
### Newlines After Doctypes
When generating XML a newline would be inserted after a doctype. If another
newline would follow in a text node this would lead to multiple newlines being
present. Oga now ensures there is only 1 newline following a doctype. See commit
e0e0687dc29427c854c9fa6d3c19cee1c04f92c7 for more information.
### Processing Instructions With Namespace Prefixes
The XML lexer now supports processing instructions containing namespace prefixes
such as `<?xml:foo ?>`. See commit 01fa1513f4bd6f194bf6d1ca17e510003fa23312 for
more information.
### XML Declarations Are Now Processing Instructions
The class `Oga::XML::XmlDeclaration` now extends
`Oga::XML::ProcessingInstruction`. This allows documents to contain XML
declarations in nested elements, instead of only allowing this at the root of
the document. See commit 116b9b0ceb6feab4daa0bb417302590fba948bef for more
information.
### Aliases For Getting & Setting Attributes
The methods `Oga::XML::Element#get` and `Oga::XML::Element#set` are now aliased
as `#[]` and `#[]=` respectively. See d40baf0c724a3874f43100fbefa775cfb8dcacda
for more information and thanks to Scott Wheeler for contributing the patch.
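
The aliasing itself can be done with `alias_method`, as in this sketch. `AttributeBag` is a hypothetical class that only mirrors the `get`/`set` naming from the entry above.

```ruby
# Hypothetical class for illustration; not Oga::XML::Element.
class AttributeBag
  def initialize
    @attributes = {}
  end

  def get(name)
    @attributes[name.to_s]
  end

  def set(name, value)
    @attributes[name.to_s] = value
  end

  # bag[name] and bag[name] = value now call get and set.
  alias_method :[], :get
  alias_method :[]=, :set
end

bag = AttributeBag.new
bag['href'] = '/about'
bag[:href] # => "/about"
```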
## 2.6 - 2016-09-10
This release fixes a bug in the XML generation code that would cause it to get
stuck in the generation loop. See issue
<https://gitlab.com/yorickpeterse/oga/issues/161> and commit
38284278d542640c3d8300ef15890af93b6df779 for more information.
## 2.5 - 2016-09-06
This release fixes a bug in the XML parser that would prevent it from parsing
doctypes that contain a mixture of public/system IDs, a name, and inline rules.
See issue <https://gitlab.com/yorickpeterse/oga/issues/159> and commit
68f1f9f660b90a43d22c8514e8cbf53f7ca0097d for more information.
## 2.4 - 2016-09-04
### Serialising Large Documents
@ -11,7 +170,7 @@ Oga can now serialise large documents without causing the call stack to overflow
thanks to the new `Oga::XML::Generator` class. This class can generate XML
without using a stack at all.
-See issue <https://github.com/YorickPeterse/oga/issues/158> and commit
+See issue <https://gitlab.com/yorickpeterse/oga/issues/158> and commit
dd138981f68a606eff5d5a01e990f04398087dc4 for more information.
### Faster retrieval of previous/next nodes
@ -29,14 +188,14 @@ See commit 5a58b1413767fed4518e8a67c4eb432a31592660 for more information.
Thanks to various changes provided by Erik Michaels-Ober Oga can now be used to
parse XML input from a pipe (as returned by for example `IO.pipe`). See the
following pull request for more information:
-<https://github.com/YorickPeterse/oga/pull/154>.
+<https://gitlab.com/yorickpeterse/oga/pull/154>.
## 2.2 - 2016-02-23
### XPath support for nested pipe operators
Nested pipe operators such as `a | b | c` are now supported as XPath
-expressions. See issue <https://github.com/YorickPeterse/oga/issues/149> and
+expressions. See issue <https://gitlab.com/yorickpeterse/oga/issues/149> and
commit 6d3c5c2ce93cbce337338bdc1a4971da72517038 for more information.
## 2.1 - 2016-02-09
@ -45,7 +204,7 @@ commit 6d3c5c2ce93cbce337338bdc1a4971da72517038 for more information.
Decoding of invalid XML/HTML entities now results in these entities being
preserved as-is, instead of raising an EncodingError in certain places. See
-<https://github.com/YorickPeterse/oga/issues/143> and commit
+<https://gitlab.com/yorickpeterse/oga/issues/143> and commit
5bfc2d50f2a3d387cb9fc28826d1f3d5a2d9d224 for more information.
### New Versioning Format
@ -143,8 +302,8 @@ new compiler setup, how it works, how it performs, etc.
In the mean time, see the following issues/pull requests for more information:
-* <https://github.com/YorickPeterse/oga/issues/102>
-* <https://github.com/YorickPeterse/oga/pull/138>
+* <https://gitlab.com/yorickpeterse/oga/issues/102>
+* <https://gitlab.com/yorickpeterse/oga/pull/138>
### Escaping of characters in CSS expressions
@ -153,14 +312,14 @@ namespace. This can be done by escaping the dot using a backslash. For example:
Oga.parse_xml('<foo.bar />').css('foo\.bar') # => NodeSet(Element(name: "foo.bar"))
-See issue <https://github.com/YorickPeterse/oga/issues/124> for more
+See issue <https://gitlab.com/yorickpeterse/oga/issues/124> for more
information.
### Support for the CSS :not() pseudo class
CSS expressions can now use the `:not()` pseudo class.
-See issue <https://github.com/YorickPeterse/oga/issues/125> for more
+See issue <https://gitlab.com/yorickpeterse/oga/issues/125> for more
information.
### Improved parsing of CSS expressions
@ -170,8 +329,8 @@ these would result in parser errors.
See the following issues for more information:
-* <https://github.com/YorickPeterse/oga/issues/126>
-* <https://github.com/YorickPeterse/oga/issues/131>
+* <https://gitlab.com/yorickpeterse/oga/issues/126>
+* <https://gitlab.com/yorickpeterse/oga/issues/131>
### Unicode support for CSS/XPath
@ -179,7 +338,7 @@ CSS and XPath expressions can now contain Unicode characters, previously only
ASCII characters were allowed for identifiers (node tests, attribute names,
etc).
-See issue <https://github.com/YorickPeterse/oga/issues/140> for more
+See issue <https://gitlab.com/yorickpeterse/oga/issues/140> for more
information.
## 1.2.3 - 2015-08-19
@ -227,8 +386,8 @@ Jakub Pawlowicz improved the process of decoding XML/HTML entities so that it
handles unrecognized entities better. Previously Oga would raise an error when
trying to decode entities such as `&#TAB;` instead of just leaving them as-is.
-See issue <https://github.com/YorickPeterse/oga/issues/118> and pull request
-<https://github.com/YorickPeterse/oga/pull/122> for more information.
+See issue <https://gitlab.com/yorickpeterse/oga/issues/118> and pull request
+<https://gitlab.com/yorickpeterse/oga/pull/122> for more information.
## 1.2.0 - 2015-06-30
@ -287,7 +446,7 @@ replaced with a Text node). For example:
Thanks to Tero Tasanen for adding this.
See commit 0b4791b277abf492ae0feb1c467dfc03aef4f2ec and
-<https://github.com/YorickPeterse/oga/pull/116> for more information.
+<https://gitlab.com/yorickpeterse/oga/pull/116> for more information.
### Encoding quotes in attribute values
@ -442,8 +601,8 @@ See the following commits for more information:
The following issues are also worth checking out:
-* https://github.com/YorickPeterse/oga/issues/101
-* https://github.com/YorickPeterse/oga/issues/99
+* https://gitlab.com/yorickpeterse/oga/issues/101
+* https://gitlab.com/yorickpeterse/oga/issues/99
### Handling of invalid XML/HTML
@ -520,7 +679,7 @@ And so is this:
<a href=foo/bar>Foo/bar</a>
-See Github issue <https://github.com/YorickPeterse/oga/issues/94> and the
+See GitLab issue <https://gitlab.com/yorickpeterse/oga/issues/94> and the
following commits for more information:
* bc9b9bc9537d9dc614b47323e0a6727a4ec2dd04
@ -544,7 +703,7 @@ The XML lexer has been tweaked so it can handle multi-line CDATA tags, comments
and processing instructions, both when using a String and IO (or similar) as
input.
-See Github issue <https://github.com/YorickPeterse/oga/issues/93> and the
+See GitLab issue <https://gitlab.com/yorickpeterse/oga/issues/93> and the
following commits for more information:
* b2ea20ba615953254554565e0c8b11587ac4f59c
@ -660,7 +819,7 @@ like the other callbacks.
### Parser rewritten using ruby-ll
The XML, CSS and XPath parsers have been re-written using ruby-ll
-(<https://github.com/yorickpeterse/ruby-ll>). While Racc served its purpose
+(<https://gitlab.com/yorickpeterse/ruby-ll>). While Racc served its purpose
(until now) it has three main problems:
1. Performance is not as good as it should be.
@ -673,7 +832,7 @@ ruby-ll parsers. These parsers are LL(1) parsers which makes them a lot easier
to debug. Performance is currently a tiny bit faster than the old Racc parsers,
but this will be improved in the coming releases of both Oga and ruby-ll.
-See pull request <https://github.com/YorickPeterse/oga/pull/78> for more
+See pull request <https://gitlab.com/yorickpeterse/oga/pull/78> for more
information.
### Lazy decoding of XML/HTML entities
@ -719,7 +878,7 @@ documents _don't_ have their contents converted, ensuring proper Javascript
syntax upon output.
See commit 874d7124af540f0bc78e6c586868bbffb4310c5d and issue
-<https://github.com/YorickPeterse/oga/issues/79> for more information.
+<https://gitlab.com/yorickpeterse/oga/issues/79> for more information.
### Proper lexing support for script tags
@ -727,7 +886,7 @@ When lexing HTML documents the XML lexer is now capable of lexing the contents
of `<script>` tags properly. Previously input such as `<script>x >y</script>`
would result in incorrect tokens being emitted. See commit
ba2177e2cfda958ea12c5b04dbf60907aaa8816d and issue
-<https://github.com/YorickPeterse/oga/issues/70> for more information.
+<https://gitlab.com/yorickpeterse/oga/issues/70> for more information.
### Element Inner Text
@ -735,7 +894,7 @@ When setting the inner text of an element using `Oga::XML::Element#inner_text=`
_all_ child nodes of the element are now removed first, instead of only text
nodes being removed.
-See <https://github.com/YorickPeterse/oga/issues/64> for more information.
+See <https://gitlab.com/yorickpeterse/oga/issues/64> for more information.
### Support for extra XML entities
@ -793,14 +952,14 @@ perhaps other libraries) the parser _does not_ output XPath expressions as a
String or a CSS specific AST. Instead it directly emits an XPath AST. This
allows the resulting AST to be directly evaluated by `Oga::XPath::Evaluator`. allows the resulting AST to be directly evaluated by `Oga::XPath::Evaluator`.
See <https://github.com/YorickPeterse/oga/issues/11> for more information. See <https://gitlab.com/yorickpeterse/oga/issues/11> for more information.
### Mutli-line Attribute Support ### Mutli-line Attribute Support
Oga can now lex/parse elements that have attributes with newlines in them. Oga can now lex/parse elements that have attributes with newlines in them.
Previously this would trigger memory allocation errors. Previously this would trigger memory allocation errors.
See <https://github.com/YorickPeterse/oga/issues/58> for more information. See <https://gitlab.com/yorickpeterse/oga/issues/58> for more information.
### SAX after_element ### SAX after_element
@ -808,7 +967,7 @@ The `after_element` method in the SAX parsing API now always takes two
arguments: the namespace name and element name. Previously this method would arguments: the namespace name and element name. Previously this method would
always receive a single nil value as its argument, which is rather pointless. always receive a single nil value as its argument, which is rather pointless.
See <https://github.com/YorickPeterse/oga/issues/54> for more information. See <https://gitlab.com/yorickpeterse/oga/issues/54> for more information.
### XPath Grouping ### XPath Grouping
@ -828,7 +987,7 @@ This can be used to download and parse XML files on the fly. For example:
document = Oga.parse_xml(enum) document = Oga.parse_xml(enum)
See <https://github.com/YorickPeterse/oga/issues/48> for more information. See <https://gitlab.com/yorickpeterse/oga/issues/48> for more information.
### Removing Attributes ### Removing Attributes
@ -862,7 +1021,7 @@ the usage of the default `Object#==` method.
XML entities such as `&amp;` and `&lt;` are now encoded/decoded by the lexer, XML entities such as `&amp;` and `&lt;` are now encoded/decoded by the lexer,
string and text nodes. string and text nodes.
See <https://github.com/YorickPeterse/oga/issues/49> for more information. See <https://gitlab.com/yorickpeterse/oga/issues/49> for more information.
### General ### General
@ -885,7 +1044,7 @@ improved by removing String allocations that were not needed.
## 0.1.3 - 2014-09-24 ## 0.1.3 - 2014-09-24
This release fixes a problem with serializing attributes using the namespace This release fixes a problem with serializing attributes using the namespace
prefix "xmlns". See <https://github.com/YorickPeterse/oga/issues/47> for more prefix "xmlns". See <https://gitlab.com/yorickpeterse/oga/issues/47> for more
information. information.
## 0.1.2 - 2014-09-23 ## 0.1.2 - 2014-09-23


@ -6,9 +6,9 @@ one should follow.
## Code of Conduct ## Code of Conduct
The code of conduct ("CoC") can be found in the file "COC.md". Everybody The code of conduct ("CoC") can be found in the file "CODE_OF_CONDUCT.md".
participating in this project must adhere to the rules and guidelines stated in Everybody participating in this project must adhere to the rules and guidelines
this CoC. stated in this CoC.
## General ## General
@ -20,7 +20,7 @@ this CoC.
## Submitting Changes ## Submitting Changes
Before making any big changes it's best to open a Github issue to discuss the Before making any big changes it's best to open a GitLab issue to discuss the
matter, this saves you from potentially spending hours on something that might matter, this saves you from potentially spending hours on something that might
ultimately be rejected. ultimately be rejected.
@ -28,7 +28,7 @@ When making changes please stick to the existing style and patterns as this
keeps the codebase consistent. If a certain pattern or style is getting in your keeps the codebase consistent. If a certain pattern or style is getting in your
way please open a separate issue about this so it can be discussed. way please open a separate issue about this so it can be discussed.
Every commit and every pull request made is carefully reviewed. Chances are I'll Every commit and every merge request made is carefully reviewed. Chances are I'll
spend more time reviewing it than the time an author spent on their changes. spend more time reviewing it than the time an author spent on their changes.
This should ensure that Oga's codebase is stable, of high quality and easy to This should ensure that Oga's codebase is stable, of high quality and easy to
maintain. As such _please_ take my feedback into consideration (or discuss it in maintain. As such _please_ take my feedback into consideration (or discuss it in
@ -36,18 +36,18 @@ a civilized manner) instead of just dismissing it with comments such as "But I
fixed the problem so your feedback is irrelevant" or "This is my way of doing fixed the problem so your feedback is irrelevant" or "This is my way of doing
things". things".
Finally, and this will sound harsh: I will _not_ merge pull requests if the Finally, and this will sound harsh: I will _not_ merge merge requests if the
author(s) simply disregard the feedback I've given them or if there are other author(s) simply disregard the feedback I've given them or if there are other
problems with the pull request. Do not expect me to just blindly accept whatever problems with the merge request. Do not expect me to just blindly accept whatever
changes are submitted. changes are submitted.
Some examples of good pull request: Some examples of good merge requests:
* https://github.com/YorickPeterse/oga/pull/96 * https://gitlab.com/yorickpeterse/oga/-/merge_requests/96
* https://github.com/YorickPeterse/oga/pull/67 * https://gitlab.com/yorickpeterse/oga/-/merge_requests/67
* https://github.com/YorickPeterse/ffi-aspell/pull/21 * https://gitlab.com/yorickpeterse/ffi-aspell/-/merge_requests/21
* https://github.com/YorickPeterse/ffi-aspell/pull/20 * https://gitlab.com/yorickpeterse/ffi-aspell/-/merge_requests/20
* https://github.com/YorickPeterse/ruby-ll/pull/16 * https://gitlab.com/yorickpeterse/ruby-ll/-/merge_requests/16
## Git ## Git
@ -75,36 +75,29 @@ Use spaces for indentation, tabs are not accepted. The usage of spaces ensures
the indentation is identical no matter what program or system is used to view the indentation is identical no matter what program or system is used to view
the source code. the source code.
Hard wrap lines at 80 characters per line. Most modern editors can easily handle Hard wrap lines at roughly 80 characters per line. Most modern editors can
this, if not you should get a better editor. For example, in Vim you can select easily handle this. For example, in Vim you can select text in visual mode
text in visual mode (using `v`) and press `gq` to automatically re-wrap the (using `v`) and press `gq` to automatically re-wrap the selected text.
selected text.
It's OK if a line is a few characters longer than 80 but _please_ keep it as It's OK if a line is a few characters longer than 80 but _please_ keep it as
close to 80 characters as possible. Typically I do this when wrapping the line close to 80 characters as possible. Typically I do this when wrapping the line
results in several extra lines without it being much more readable. results in several extra lines without it being much more readable.
I often have multiple windows vertically next to each other and 80 characters
per line is the only setup that lets me do so, even on smaller screen
resolutions. For example, my typical setup is 1 file browser and two vertical
windows. Using 80 characters per line ensures all code fits in that space along
with some slight padding to make reading more pleasant.
To make this process easier Oga comes with an [EditorConfig][editorconfig] To make this process easier Oga comes with an [EditorConfig][editorconfig]
configuration file. If your editor supports this it will automatically apply configuration file. If your editor supports this it will automatically apply
the required settings for you. various settings for you.
## Hacking on Oga ## Hacking on Oga
Before you start hacking on Oga make sure the following libraries/tools are Before you start hacking on Oga make sure the following libraries/tools are
installed: installed:
* Ragel 6.x (6.9 recommended) * Ragel 6.x (6.10 recommended), Ragel 7.x is not supported
* gunzip (to unpack the fixtures) * gunzip (to unpack the fixtures)
* javac (only when using JRuby) * javac (only when using JRuby)
Assuming you have the above tools installed and a local Git clone of Oga, lets Assuming you have the above tools installed and a local Git clone of Oga, first
install the required Gems: you'll need to install the required Gems:
bundle install bundle install
@ -112,7 +105,8 @@ Next up, compile the required files and run the tests:
rake rake
You can compile the various parsers/extensions by running: If you just want to generate various files (e.g. the C extension), run the
following instead:
rake generate rake generate
@ -128,19 +122,10 @@ benchmark is just a matter of running a Ruby script, for example:
## Tests ## Tests
Tests are written using RSpec and use the "should" syntax instead of the Tests are written using RSpec and use the "expect" syntax. Specification blocks
"expect" syntax (for as long as RSpec keeps supporting this). This means that should be written using `it`, grouping should be done using `describe`.
assertions are written as following: Specification descriptions should be meaningful and human-friendly English. For
example:
some_object.should == some_value
instead of this:
expect(some_object).to eq(some_value)
Specification blocks should be written using `it`, grouping should be done using
`describe`. Specification descriptions should be meaningful and human friendly
English. For example:
describe Oga::XML::Entities do describe Oga::XML::Entities do
describe 'decode' do describe 'decode' do
@ -154,40 +139,15 @@ Typically the top-level `describe` block is used to describe a method name. In
such a case use `describe 'foo'` for class methods and `describe '#foo'` for such a case use `describe 'foo'` for class methods and `describe '#foo'` for
instance methods. instance methods.
Do not use `let` for creating data re-used between specifications, instead use Whenever adding new specifications please keep them in the existing style. If
a `before` block that sets an instance variable. In other words, use this: the style is problematic you can open a separate merge request to address it. If
you expect this to be a lot of work you should open an issue first to discuss
before do things.
@number = 10
end
instead of this:
let(:number) { 10 }
Instance variables stand out much more and they don't require one to also
understand what exactly `let` does which in turn simplifies the process of
reading and writing specifications.
Whenever adding new specifications please keep them in the existing style unless
I indicate otherwise. There's nothing more annoying than inconsistent
specifications.
If you insist on changing the structure/style of specifications please open an
issue and ask about it _before_ making any changes. I am very picky about how I
want things and it would be a shame for somebody to spend hours working on
something that isn't going to be merged in any way.
## Continuous Integration ## Continuous Integration
Two continuous integration services are used to ensure the tests of Oga pass Oga is tested using GitLab CI. Merge requests require that all tests pass before
at all times: they can be merged.
* Travis CI: <https://travis-ci.org/YorickPeterse/oga>
* AppVeyor (Windows): <https://ci.appveyor.com/project/YorickPeterse/oga>
Please note that I will not accept patches that break any tests unless stated
otherwise.
## Extension Setup ## Extension Setup
@ -254,7 +214,7 @@ modify `$LOAD_PATH`, instead run any scripts using `ruby -I lib`.
In case you have any further questions or would like to receive feedback before In case you have any further questions or would like to receive feedback before
submitting a change, feel free to contact me. You can either open an issue, submitting a change, feel free to contact me. You can either open an issue,
send a tweet to [@yorickpeterse][twitter] or send an Email to send a tweet to [@yorickpeterse][twitter] or send an Email to
<yorickpeterse@gmail.com>. <yorick@yorickpeterse.com>.
[editorconfig]:http://editorconfig.org/ [editorconfig]:http://editorconfig.org/
[twitter]: https://twitter.com/yorickpeterse [twitter]: https://twitter.com/yorickpeterse


@ -2,10 +2,6 @@ source 'https://rubygems.org'
gemspec gemspec
platforms :mingw_19, :ruby_19 do
gem 'json', '~> 1.8'
end
group :benchmarking do group :benchmarking do
gem 'ox', :platforms => [:mri, :rbx] gem 'ox', :platforms => [:mri, :rbx]
gem 'nokogiri' gem 'nokogiri'


@ -1,5 +1,9 @@
# Oga # Oga
**NOTE:** my spare time is limited which means I am unable to dedicate a lot of
time on Oga. If you're interested in contributing to FOSS, please take a look at
the open issues and submit a pull request to address them where possible.
Oga is an XML/HTML parser written in Ruby. It provides an easy to use API for Oga is an XML/HTML parser written in Ruby. It provides an easy to use API for
parsing, modifying and querying documents (using XPath expressions). Oga does parsing, modifying and querying documents (using XPath expressions). Oga does
not require system libraries such as libxml, making it easier and faster to not require system libraries such as libxml, making it easier and faster to
@ -169,8 +173,8 @@ Querying a document using a namespace:
| Ruby | Required | Recommended | | Ruby | Required | Recommended |
|:---------|:--------------|:------------| |:---------|:--------------|:------------|
| MRI | >= 1.9.3 | >= 2.1.2 | | MRI | >= 1.9.3 | >= 2.1.2 |
| Rubinius | >= 2.2 | >= 2.2.10 |
| JRuby | >= 1.7 | >= 1.7.12 | | JRuby | >= 1.7 | >= 1.7.12 |
| Rubinius | Not supported | |
| Maglev | Not supported | | | Maglev | Not supported | |
| Topaz | Not supported | | | Topaz | Not supported | |
| mruby | Not supported | | | mruby | Not supported | |
@ -223,15 +227,14 @@ And if you want to specify an explicit namespace URI, you can use this:
descendant::*[local-name() = "bar" and namespace-uri() = "http://example.com"] descendant::*[local-name() = "bar" and namespace-uri() = "http://example.com"]
Unlike Nokogiri, Oga does _not_ provide a way to create "dynamic" namespaces. Like Nokogiri, Oga provides a way to create "dynamic" namespaces.
That is, Nokogiri allows one to query the above document as following: That is, Oga allows one to query the above document as following:
document = Nokogiri::XML('<root xmlns="http://example.com"><bar>bar</bar></root>') document = Oga.parse_xml('<root xmlns="http://example.com"><bar>bar</bar></root>')
document.xpath('x:root/x:bar', :x => 'http://example.com') document.xpath('x:root/x:bar', namespaces: {'x' => 'http://example.com'})
Oga does have a small trick you can use to cut down the size of your XPath Moreover, because Oga assigns the name "xmlns" to default namespaces you can use
queries. Because Oga assigns the name "xmlns" to default namespaces you can use
this in your XPath queries: this in your XPath queries:
document = Oga.parse_xml('<root xmlns="http://example.com"><bar>bar</bar></root>') document = Oga.parse_xml('<root xmlns="http://example.com"><bar>bar</bar></root>')
@ -242,9 +245,6 @@ When using this you can still restrict the query to the correct namespace URI:
document.xpath('xmlns:root[namespace-uri() = "http://example.com"]/xmlns:bar') document.xpath('xmlns:root[namespace-uri() = "http://example.com"]/xmlns:bar')
In the future I might add an API to ease this process, although at this time I
have little interest in providing an API similar to Nokogiri.
## HTML5 Support ## HTML5 Support
Oga fully supports HTML5 including the omission of certain tags. For example, Oga fully supports HTML5 including the omission of certain tags. For example,
@ -266,7 +266,7 @@ well as complicating the parsing internals of Oga. As a result I have decided
that Oga _does not_ insert these tags when left out. that Oga _does not_ insert these tags when left out.
A more in depth explanation can be found here: A more in depth explanation can be found here:
<https://github.com/YorickPeterse/oga/issues/98#issuecomment-96833066>. <https://gitlab.com/yorickpeterse/oga/issues/98#note_45443992>
## Documentation ## Documentation
@ -287,7 +287,7 @@ Currently there are a few existing parser out there, the most famous one being
The sad truth is that these existing libraries are problematic in their own The sad truth is that these existing libraries are problematic in their own
ways. Nokogiri for example is extremely unstable on Rubinius. On MRI it works ways. Nokogiri for example is extremely unstable on Rubinius. On MRI it works
because of the non conccurent nature of MRI, on JRuby it works because it's because of the non concurrent nature of MRI, on JRuby it works because it's
implemented as Java. Nokogiri also uses libxml2 which is a massive beast of a implemented as Java. Nokogiri also uses libxml2 which is a massive beast of a
library, is not thread-safe and problematic to install on certain platforms library, is not thread-safe and problematic to install on certain platforms
(apparently). I don't want to compile libxml2 every time I install Nokogiri (apparently). I don't want to compile libxml2 every time I install Nokogiri


@ -1,38 +1,32 @@
--- ---
image: Visual Studio 2017
version: "{build}" version: "{build}"
install: install:
# Binary taken from http://w858rkbfg.homepage.t-online.de/index.php/software/ragel-windows/, - C:\msys64\usr\bin\bash -lc "pacman --noconfirm -S mingw-w64-x86_64-ragel"
# rehosted on AWS so it doesn't randomly vanish. - SET PATH=C:\msys64\mingw64\bin;%PATH%
- appveyor DownloadFile http://downloads.yorickpeterse.com/files/ragel-68-visualstudio2012.7z -FileName C:\ragel.7z
- 7z e C:\ragel.7z -oC:\ragel -y > nul
- SET PATH=C:\Ruby%ruby_version%\bin;%PATH% - SET PATH=C:\Ruby%ruby_version%\bin;%PATH%
- SET PATH=C:\ragel;%PATH%
- ruby --version
- gem --version
- appveyor DownloadFile https://rubygems.org/downloads/bundler-1.9.0.gem -FileName bundler-1.9.0.gem
- gem install bundler-1.9.0.gem --local --quiet --no-ri --no-rdoc
- bundle install --retry 3 - bundle install --retry 3
build: off build: off
before_test:
- ragel --version
- ruby --version
- gem --version
- bundle --version
test_script: test_script:
- rake - rake
environment: environment:
matrix: matrix:
- ruby_version: "193"
- ruby_version: "200"
- ruby_version: "21"
- ruby_version: "21-x64"
- ruby_version: "22" - ruby_version: "22"
- ruby_version: "22-x64" - ruby_version: "22-x64"
- ruby_version: "23"
- ruby_version: "23-x64"
- ruby_version: "24"
- ruby_version: "24-x64"
skip_tags: true skip_tags: true
notifications:
-
provider: Email
on_build_success: false
on_build_failure: false
on_build_status_changed: true


@ -0,0 +1 @@
bac38b7526a1c7460175933942f9c6e7655807b5250c7032b2be8633f1d616a9285084a57e1ee3a4b6a05d15ae1490b09313a827f85fd8ebcb46a7c65d2175c0


@ -0,0 +1 @@
84f6fe917bf8e335b391f0817cab9fb07b8c9ad2c3618ce530b3443d0c6e8e31f47cccc039f376a1b3d329a5cc0d3a84664f3e72a7c85b31b5b02b6db5c8360f


@ -0,0 +1 @@
e89ef454241cad99049fa3767eb1c76ab7e1e10a0943e6448ffc7cfc70a251f7a72a3cd61f0664c35a54791752960b6432ddd857d56d11594b6272bff44d405b


@ -0,0 +1 @@
d6d3f1f133c9f93bd9cc36b7748c5b8a223d53daacb0ed0c9784fb0aac08d483c558208ddda1fc6b7b544a638a4cd89f791d397c712f622b7096d0b99ab7ec79


@ -0,0 +1 @@
5389228f4c92851e74396e507c37b0fbe8659522fad0932954a710640268cd51926185b0d4c7887b6d5c13f11af0dbde45fb43d0dd1adce5d8688f050d2de1ed


@ -0,0 +1 @@
ab7a64c63bf2f03ecfe42f45199c943d60240eed0ee57f71e1d05fe8d29264c942f12bba8c34d3e3c22f4cb502611d3c65bda17a8c4afec9a593599537c143b6


@ -0,0 +1 @@
a5492d3bfb21fad9060412e4189b2f08c179840eae5bacee2580795e5ce6bf9aa10d7e95c1e7729259c708f25051cf4df7d48261a05fbda8c39c5879656135c9


@ -0,0 +1 @@
5acda689eff5c8f6ac03f97189f9906305576ab213276f317e5136476751905a0748be7058eca808d80bbdbe50475024c40400c135946c56fd2e0b958a8b4037


@ -0,0 +1 @@
565cc42a3d776ead3bae1423af8b24279ef53f4ebdee7812bd2c842cf5660f0f6a8ac0054ed9dee37493f42335e0ab529c5aae965e218e49201c9408358ddb4c


@ -0,0 +1 @@
05cb13515c8b4e99b0c59fb7d362b7245ad97eb4a9243c6a4c9eb8c89bd2a1db19b6951a7cd6e7484493df31aa49b8e0ce64c1aa15ecd3dafd4aa6fc5578918b


@ -0,0 +1 @@
beaa3a5d8c5b64288f83f141daece073b3601ff631dc33a4a7d268fcf38b5a667c44c3ed67964baa6e250501b3b965df3b6ecfeee9a407a063981e91474baef9


@ -0,0 +1 @@
b5071a0cd5d2d1bf79577ed1235ac9233de5cbbffc5c47a5ba375152dbf11fc6f1c8c66253d986288817862915df350aad7a7760ec8c536b479c9d5850a34cbf


@ -0,0 +1 @@
241e4861fb8cdb8576b72672a2ad1d59e0f72333eb203d19b8922e15091a3470d0150d417f78d2394e2c9140fa7c9d87508acc51907537f57813b8c23272922e


@ -0,0 +1 @@
5a2abc35e0696adf408f1d517865e49d511b26e39c0fe6a1f299baf77563327661498f3e1d70e20feb118810eb6457649706dc4fc3e8c45868d4b3d0ef56bfc8


@ -0,0 +1 @@
4e35c653ef64ebfbbee7a933923e9cf53e988028e53cb1535127962d249501881ce35e5c7375b98e43e220b7561961a9d35fe15caf20b263b20367660a59b3eb


@ -0,0 +1 @@
9cbb14e1abea3ebec3b7e9051bff5cae466cc4e608df6aa7826add38bcdb5b406cc8090405e63128a6902b24a64082ef5b9d1a36970c399ef4c941f63f2ee305


@ -0,0 +1 @@
885d4a6155a93d50bff733c778886feb1a514ccf4bd937eabd022439622c75b83b00bd9994c974c9733e97f38b6dd02d882dc994868ca7bb90f03133cad70255


@ -0,0 +1 @@
c129448eeabb128fd223833197efeb057fb03ccd33434c897bb5e27ceceee0a32cfd6f8abe2a7ab8512c0edadad99cedc994332262cf569a623eea9fc0287473


@ -0,0 +1 @@
b107dec6aff28640a307ca02661abd5a29a5f0c369422c116bad7a198b11eb9aa9cab90bfe874a0c9fd4391343d4b5589fcbf5b5a0ff557ab520fd83725de9e8


@ -0,0 +1 @@
8c888218a3da5a639df19e042a3e66bea2d5eec5c660bb4f13982eaff0626f80ab2840d0f58bba2a1fe5dce12563b5f73a9f7b15f23b05ff5c87fe4b7f064621


@ -0,0 +1 @@
9bfe28b9b7b971fd80f405ad12c7c58cc94152b85ed29d4dc1e50525d220e1bc0a42fa76083fc0637a8c799f31c2c3599664df68af3dd2ed2c8ed3ae20b5b6a2


@ -0,0 +1 @@
1a00cd24864ce2777a02f2c1964ca636654d860e61a0be7437bdcc8bf1afe393138ced1913117eb58ac10bd25f306fda16e0ae2f737486414522102642ac03e5


@ -0,0 +1 @@
e948ef42c95ca8c0a8b1fe7f765c75b2be048ccd154cdafd49610e7c80c44a7592152bdc02cb37f466b62be99299b20879392518c98d4f94a0bad121d5844c44


@ -0,0 +1 @@
59f3805163a97f3b06c13c9c46691bb34c106a4cdac3d04f57aaf93c64de74016b9485eff81d9461f21162887d1f114e6fa22bcfc0c36eb57d4ca93716ff001f


@ -0,0 +1 @@
2839f9bf005a0875d87d0b6796aab4c4e6591ff1a02e8032ce9bc06d340db9fde3ac32d79f8ae19c9fd7cac8a4b3c50a9a5840b72e7912f205dfd12320466c41


@ -0,0 +1 @@
c60b30f7b83b21916ed54ce37d5bed734f7e5807576ce2fb94e5ac577a479fa5a9f19594d44587b635727c6c9f88d0fc75c945d22f32342df6e159878b2c950d


@ -0,0 +1 @@
b64906a38edefb346c2ba9770336cb69f424e0776690932fea524f014dda00d8fe1b13b69fff1f01ca75c8f5107b92056c25f8ae7ee4aeb83770dc03b5d482c0


@ -0,0 +1 @@
56eea6f76968afb2916e73d729a9c94dcacfb1cccc6fa0ef27888e6e8006a80cda9279db4b040be81b33ee354916e49e437ce0189ce79bdbab7c0f54203b9f2e


@ -0,0 +1 @@
a2246547f87d1901e280d9df915bf41a6b78ac14805c5c0f471f5dd1cf617f1e5b3e4aa05e58ae5aa816e188456ccd3638e44a7aba41ed2dfb942f509b2093af


@ -0,0 +1 @@
deb03862d5263b2cb47169267aa37c41be82fcb01d048d840506d2c8924fb0bba6a4407401052e8a5f42d053c2bb3410701316fe21570b35a2981850dc05a481


@ -0,0 +1 @@
b0c3740f08d33f5b9a76c6532de749b8004fb591e3bd3b745e8c57a3bbc5b3d6be4c9a02432fc32f4ae3ca53b488dc409243af91e43646ba3f486fec4738911e


@ -0,0 +1 @@
27f941862134b9e5fc46b33d8dd642a7c816f8b0bb6448789c5f23e9da42abdc72b19713985cf81257074931a4f51e014aa551e09bcb7858cde9987ea17aca75


@ -0,0 +1 @@
c5521d5bc9e025fadfb4e0719c8ff0fa103dd7c184cdf4c60154a1bb5f7d71c9d807a84e76a21b13665de1bbe54f9ba23f6e479650b1bd497302f86ff2af8bbf


@ -0,0 +1 @@
be44f4fb2f5f821306556b965a928e42753a57e489516654bdda74662058510cdc9885b50f6170f23762309f3a7c94791d8db71180272961341c917ffc3560e4


@ -0,0 +1 @@
eef134163a86451be4a5ec72b262fec6a1dad10613e0d4002142b09e02cb444cc25ce018cdd62a870b266fc8dd390ba4fe110e07a2c41f50a3d8abdcc69b5dec


@ -0,0 +1 @@
2ba0fdbfa3fa15b8d1ce5df4df4cfb3813f34399c93517f98ec8da1e82ff3cdc6e3543bf017b8246daa8b2521a64af92a9438a404b69da0f319272c510961314


@ -2,8 +2,6 @@ body
{ {
font-size: 14px; font-size: 14px;
line-height: 1.6; line-height: 1.6;
margin: 0 auto;
max-width: 960px;
} }
p code, dd code, li code p code, dd code, li code


@ -1,7 +1,7 @@
require 'mkmf' require 'mkmf'
if RbConfig::CONFIG['CC'] =~ /clang|gcc/ if RbConfig::CONFIG['CC'] =~ /clang|gcc/
$CFLAGS << ' -pedantic' $CFLAGS << ' -pedantic -Wno-implicit-fallthrough'
end end
if ENV['DEBUG'] if ENV['DEBUG']


@ -161,7 +161,7 @@
# instruction. # instruction.
# #
proc_ins_start = '<?' identifier; proc_ins_start = '<?' identifier (':' identifier)?;
proc_ins_end = '?>'; proc_ins_end = '?>';
# Everything except "?" OR a single "?" # Everything except "?" OR a single "?"
@ -289,7 +289,7 @@
# Machine for processing doctypes. Doctype values such as the public # Machine for processing doctypes. Doctype values such as the public
# and system IDs are treated as T_STRING tokens. # and system IDs are treated as T_STRING tokens.
doctype := |* doctype := |*
'PUBLIC' | 'SYSTEM' => { 'PUBLIC'i | 'SYSTEM'i => {
callback(id_on_doctype_type, data, encoding, ts, te); callback(id_on_doctype_type, data, encoding, ts, te);
}; };
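The `i` suffix on the Ragel string literals above makes the doctype keywords match regardless of casing. As a rough Ruby analogue (a sketch only; `DOCTYPE_KEYWORD` and `doctype_keyword` are hypothetical names that do not exist in Oga), the same idea can be expressed with a case-insensitive regular expression:

```ruby
# Hypothetical helper, not part of Oga: reports which doctype keyword a
# string starts with, ignoring the casing used in the input.
DOCTYPE_KEYWORD = /\A(PUBLIC|SYSTEM)\b/i

def doctype_keyword(input)
  match = DOCTYPE_KEYWORD.match(input)

  # Normalise the keyword so callers only ever see "PUBLIC" or "SYSTEM".
  match ? match[1].upcase : nil
end

mixed_case = doctype_keyword('PuBlIc "-//W3C//DTD HTML 4.01//EN"')
lower_case = doctype_keyword('system "about:legacy-compat"')
no_keyword = doctype_keyword('html')
```

The lexer itself emits the matched text through a callback rather than returning a normalised keyword; the normalisation here is only for illustration.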
@ -389,6 +389,7 @@
element_start = '<' ident_char; element_start = '<' ident_char;
element_end = '</'; element_end = '</';
element_start_pattern = '<' identifier (':' identifier)?;
# Machine used for lexing the name/namespace of an element. # Machine used for lexing the name/namespace of an element.
element_name := |* element_name := |*
@ -551,6 +552,7 @@
# Machine used for processing the contents of an XML element's starting tag. # Machine used for processing the contents of an XML element's starting tag.
element_head := |* element_head := |*
newline => advance_newline; newline => advance_newline;
element_start_pattern;
# Attribute names and namespaces. # Attribute names and namespaces.
identifier ':' => { identifier ':' => {
@ -578,6 +580,7 @@
# tag. # tag.
html_element_head := |* html_element_head := |*
newline => advance_newline; newline => advance_newline;
element_start_pattern;
html_identifier => { html_identifier => {
callback(id_on_attribute, data, encoding, ts, te); callback(id_on_attribute, data, encoding, ts, te);


@ -35,8 +35,8 @@ require 'oga/xml/character_node'
require 'oga/xml/text' require 'oga/xml/text'
require 'oga/xml/comment' require 'oga/xml/comment'
require 'oga/xml/cdata' require 'oga/xml/cdata'
require 'oga/xml/xml_declaration'
require 'oga/xml/processing_instruction' require 'oga/xml/processing_instruction'
require 'oga/xml/xml_declaration'
require 'oga/xml/doctype' require 'oga/xml/doctype'
require 'oga/xml/namespace' require 'oga/xml/namespace'
require 'oga/xml/default_namespace' require 'oga/xml/default_namespace'


@ -23,7 +23,7 @@ module Oga
# @param [String] data The data to lex. # @param [String] data The data to lex.
def initialize(data) def initialize(data)
@data = data @data = data.strip
end end
# Gathers all the tokens for the input and returns them as an Array. # Gathers all the tokens for the input and returns them as an Array.


@ -1,3 +1,3 @@
module Oga module Oga
VERSION = '2.4' VERSION = '3.4'
end # Oga end # Oga


@ -34,10 +34,11 @@ module Oga
# @option options [String] :value # @option options [String] :value
# @option options [Oga::XML::Element] :element # @option options [Oga::XML::Element] :element
def initialize(options = {}) def initialize(options = {})
@name = options[:name] @name = options[:name]
@value = options[:value] @value = options[:value]
@element = options[:element] @element = options[:element]
@decoded = false
@namespace = nil
@namespace_name = options[:namespace_name] @namespace_name = options[:namespace_name]
end end
@ -98,12 +99,14 @@ module Oga
end end
# @see [Oga::XML::Node#each_ancestor] # @see [Oga::XML::Node#each_ancestor]
def each_ancestor(&block) def each_ancestor
return to_enum(:each_ancestor) unless block_given?
return unless element return unless element
yield element yield element
element.each_ancestor(&block) element.each_ancestor { |ancestor| yield ancestor }
end end
private private
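The `each_ancestor` change above swaps the explicit `&block` parameter for `yield` and returns an Enumerator when no block is given. A minimal, self-contained sketch of that pattern follows; `SketchNode` is a hypothetical stand-in with a plain `parent` link, not Oga's real node class:

```ruby
# Hypothetical stand-in for a node with a parent link; not Oga's real class.
class SketchNode
  attr_reader :name, :parent

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
  end

  # Yields each ancestor, nearest first. Without a block an Enumerator is
  # returned, so callers can chain methods such as `map` or `first`.
  def each_ancestor
    return to_enum(:each_ancestor) unless block_given?
    return unless parent

    yield parent
    parent.each_ancestor { |ancestor| yield ancestor }
  end
end

root  = SketchNode.new('root')
child = SketchNode.new('child', root)
leaf  = SketchNode.new('leaf', child)

ancestor_names = leaf.each_ancestor.map(&:name)
```

Re-yielding inside a block, as the new side of the diff does, avoids capturing the caller's block as a `Proc` object.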


@ -1,9 +1,7 @@
module Oga module Oga
module XML module XML
# Class used for storing information about Doctypes. # Class used for storing information about Doctypes.
class Doctype class Doctype < Node
include ToXML
# The name of the doctype (e.g. "HTML"). # The name of the doctype (e.g. "HTML").
# @return [String] # @return [String]
attr_accessor :name attr_accessor :name


@ -7,6 +7,11 @@ module Oga
include Traversal include Traversal
include ToXML include ToXML
# The doctype of the document.
#
# When parsing a document this attribute will be set automatically if a
# doctype resides at the root of the document.
#
# @return [Oga::XML::Doctype] # @return [Oga::XML::Doctype]
attr_accessor :doctype attr_accessor :doctype
@ -41,6 +46,8 @@ module Oga
# @param [Oga::XML::NodeSet|Array] nodes # @param [Oga::XML::NodeSet|Array] nodes
def children=(nodes) def children=(nodes)
if nodes.is_a?(NodeSet) if nodes.is_a?(NodeSet)
nodes.owner = self
nodes.take_ownership_on_nodes
@children = nodes @children = nodes
else else
@children = NodeSet.new(nodes, self) @children = NodeSet.new(nodes, self)
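The `children=` change above makes an already-built node set adopt the document as its owner, so the nodes in it resolve their parent correctly. The sketch below models that idea under simplifying assumptions: `SketchSet`, `SketchChild` and `SketchParent` are hypothetical classes, and the way a node derives its parent from its set's owner is an assumption about the design rather than a copy of Oga's code.

```ruby
# Hypothetical node set that tracks an owner, loosely modelled on the diff.
class SketchSet
  attr_accessor :owner
  attr_reader :nodes

  def initialize(nodes = [], owner = nil)
    @nodes = nodes
    @owner = owner
    take_ownership_on_nodes
  end

  # Points every node back at this set, but only once the set has an owner.
  def take_ownership_on_nodes
    @nodes.each { |node| node.node_set = self } if @owner
  end
end

# Hypothetical node: its parent is whoever owns the set it belongs to.
SketchChild = Struct.new(:name, :node_set) do
  def parent
    node_set ? node_set.owner : nil
  end
end

class SketchParent
  attr_reader :children

  # Mirrors the new branch in the diff: claim the set, then its nodes.
  def children=(set)
    set.owner = self
    set.take_ownership_on_nodes
    @children = set
  end
end

orphan_set = SketchSet.new([SketchChild.new('a'), SketchChild.new('b')])
orphans_before = orphan_set.nodes.map(&:parent)

parent = SketchParent.new
parent.children = orphan_set
parents_after = orphan_set.nodes.map(&:parent)
```

Without the two added calls, a set created with no owner would be stored as-is and its nodes would keep reporting no parent.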


@ -34,10 +34,11 @@ module Oga
def initialize(options = {}) def initialize(options = {})
super super
@name = options[:name] @name = options[:name]
@namespace_name = options[:namespace_name] @namespace_name = options[:namespace_name]
@attributes = options[:attributes] || [] @attributes = options[:attributes] || []
@namespaces = options[:namespaces] || {} @namespaces = options[:namespaces] || {}
@available_namespaces = nil
link_attributes link_attributes
register_namespaces_from_attributes register_namespaces_from_attributes
@ -64,14 +65,14 @@ module Oga
# #
# @return [Oga::XML::Attribute] # @return [Oga::XML::Attribute]
def attribute(name) def attribute(name)
if html? name_str, ns = if html?
ns = nil [name.to_s, nil]
else else
name, ns = split_name(name) split_name(name)
end end
attributes.each do |attr| attributes.each do |attr|
return attr if attribute_matches?(attr, ns, name) return attr if attribute_matches?(attr, ns, name_str)
end end
return return
@ -91,6 +92,8 @@ module Oga
found ? found.value : nil found ? found.value : nil
end end
alias_method :[], :get
# Adds a new attribute to the element. # Adds a new attribute to the element.
# #
# @param [Oga::XML::Attribute] attribute # @param [Oga::XML::Attribute] attribute
@ -113,14 +116,10 @@ module Oga
if found if found
found.value = value found.value = value
else else
if name.include?(':') name_str, ns = split_name(name)
ns, name = name.split(':')
else
ns = nil
end
attr = Attribute.new( attr = Attribute.new(
:name => name, :name => name_str,
:namespace_name => ns, :namespace_name => ns,
:value => value :value => value
) )
@ -129,6 +128,8 @@ module Oga
end end
end end
alias_method :[]=, :set
# Removes an attribute from the element. # Removes an attribute from the element.
# #
# @param [String] name The name (optionally including namespace prefix) # @param [String] name The name (optionally including namespace prefix)


@ -13,12 +13,12 @@ module Oga
# #
# @private # @private
class Generator class Generator
# @param [Oga::XML::Document|Oga::XML::Node] start The node to serialise. # @param [Oga::XML::Document|Oga::XML::Node] root The node to serialise.
def initialize(root) def initialize(root)
@start = root @start = root
if @start.respond_to?(:root_node) if @start.respond_to?(:html?)
@html_mode = @start.root_node.html? @html_mode = @start.html?
else else
@html_mode = false @html_mode = false
end end
@ -48,12 +48,14 @@ module Oga
callback = :on_comment callback = :on_comment
when Oga::XML::Attribute when Oga::XML::Attribute
callback = :on_attribute callback = :on_attribute
when Oga::XML::XmlDeclaration
# This must come before ProcessingInstruction since XmlDeclaration
# extends ProcessingInstruction.
callback = :on_xml_declaration
when Oga::XML::ProcessingInstruction when Oga::XML::ProcessingInstruction
callback = :on_processing_instruction callback = :on_processing_instruction
when Oga::XML::Doctype when Oga::XML::Doctype
callback = :on_doctype callback = :on_doctype
when Oga::XML::XmlDeclaration
callback = :on_xml_declaration
when Oga::XML::Document when Oga::XML::Document
callback = :on_document callback = :on_document
children = true children = true
@ -65,13 +67,24 @@ module Oga
if child_node = children && current.children[0] if child_node = children && current.children[0]
current = child_node current = child_node
elsif current == @start
# When we have reached the root node we should not process
# any of its siblings. If we did we'd include XML in the
# output from elements that are not part of the root node.
after_element(current, output) if current.is_a?(Element)
break
else else
# Make sure to always close the current element before
# moving to any siblings.
after_element(current, output) if current.is_a?(Element)
until next_node = current.is_a?(Node) && current.next until next_node = current.is_a?(Node) && current.next
if current.is_a?(Node) && current != @start if current.is_a?(Node) && current != @start
current = current.parent current = current.parent
end end
send(:after_element, current, output) if current.is_a?(Element) after_element(current, output) if current.is_a?(Element)
break if current == @start break if current == @start
end end
@ -112,7 +125,7 @@ module Oga
end end
# @param [Oga::XML::Element] element # @param [Oga::XML::Element] element
# @param [String] body The content of the element. # @param [String] output The content of the element.
def on_element(element, output) def on_element(element, output)
name = element.expanded_name name = element.expanded_name
attrs = '' attrs = ''
@ -123,7 +136,9 @@ module Oga
end end
if self_closing?(element) if self_closing?(element)
output << "<#{name}#{attrs} />" closing_tag = html_void_element?(element) ? '>' : ' />'
output << "<#{name}#{attrs}#{closing_tag}"
else else
output << "<#{name}#{attrs}>" output << "<#{name}#{attrs}>"
end end
@ -156,7 +171,7 @@ module Oga
output << '>' output << '>'
end end
# @param [Oga::XML::Document] node # @param [Oga::XML::Document] doc
# @param [String] output # @param [String] output
def on_document(doc, output) def on_document(doc, output)
if doc.xml_declaration if doc.xml_declaration
@ -168,6 +183,14 @@ module Oga
on_doctype(doc.doctype, output) on_doctype(doc.doctype, output)
output << "\n" output << "\n"
end end
first_child = doc.children[0]
# Prevent excessive newlines in case the next node is a newline text
# node.
if first_child.is_a?(Text) && first_child.text.start_with?("\r\n", "\n")
output.chomp!
end
end end
# @param [Oga::XML::XmlDeclaration] node # @param [Oga::XML::XmlDeclaration] node
@ -193,6 +216,10 @@ module Oga
element.children.empty? element.children.empty?
end end
end end
def html_void_element?(element)
@html_mode && HTML_VOID_ELEMENTS.allow?(element.name)
end
end end
end end
end end
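
The reordering of the `when` branches in the generator matters because `case` tests each branch with `===` and takes the first match, so a subclass has to be checked before its superclass. A small sketch with hypothetical stand-in classes (these are not Oga's classes):

```ruby
# Stand-ins for ProcessingInstruction and its new subclass XmlDeclaration.
class Instruction; end
class Declaration < Instruction; end

# Subclass first: a Declaration reaches its own branch.
def callback_for(node)
  case node
  when Declaration then :on_xml_declaration
  when Instruction then :on_processing_instruction
  else :unknown
  end
end

# Superclass first: the Declaration branch can never be reached.
def misordered_callback_for(node)
  case node
  when Instruction then :on_processing_instruction
  when Declaration then :on_xml_declaration
  else :unknown
  end
end
```

With the superclass listed first, a declaration would be serialised as a plain processing instruction, which is presumably what the added comment in the hunk guards against.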


@ -58,7 +58,11 @@ module Oga
HTML_SCRIPT_ELEMENTS = Whitelist.new(%w{script template}) HTML_SCRIPT_ELEMENTS = Whitelist.new(%w{script template})
HTML_TABLE_ROW_ELEMENTS = Whitelist.new(%w{tr}) + HTML_SCRIPT_ELEMENTS # The elements that may occur in a thead, tbody, or tfoot.
#
# Technically "th" is not allowed per the HTML5 spec, but it's so commonly
# used in these elements that we allow it anyway.
HTML_TABLE_ROW_ELEMENTS = Whitelist.new(%w{tr th}) + HTML_SCRIPT_ELEMENTS
# Elements that should be closed automatically before a new opening tag is # Elements that should be closed automatically before a new opening tag is
# processed. # processed.


@ -49,6 +49,8 @@ module Oga
# @param [Oga::XML::NodeSet|Array] nodes # @param [Oga::XML::NodeSet|Array] nodes
def children=(nodes) def children=(nodes)
if nodes.is_a?(NodeSet) if nodes.is_a?(NodeSet)
nodes.owner = self
nodes.take_ownership_on_nodes
@children = nodes @children = nodes
else else
@children = NodeSet.new(nodes, self) @children = NodeSet.new(nodes, self)
@ -180,6 +182,8 @@ module Oga
# #
# @yieldparam [Oga::XML::Node] # @yieldparam [Oga::XML::Node]
def each_ancestor def each_ancestor
return to_enum(:each_ancestor) unless block_given?
node = parent node = parent
while node.is_a?(XML::Element) while node.is_a?(XML::Element)


@ -42,17 +42,15 @@ module Oga
@owner = owner @owner = owner
@existing = {} @existing = {}
@nodes.each_with_index do |node, index| take_ownership_on_nodes
mark_existing(node)
take_ownership(node, index) if @owner
end
end end
# Yields the supplied block for every node. # Yields the supplied block for every node.
# #
# @yieldparam [Oga::XML::Node] # @yieldparam [Oga::XML::Node]
def each def each
return to_enum(:each) unless block_given?
@nodes.each { |node| yield node } @nodes.each { |node| yield node }
end end
@ -287,6 +285,14 @@ module Oga
"NodeSet(#{values})" "NodeSet(#{values})"
end end
def take_ownership_on_nodes
@nodes.each_with_index do |node, index|
mark_existing(node)
take_ownership(node, index) if @owner
end
end
private private
# Takes ownership of the given node. This only occurs when the current # Takes ownership of the given node. This only occurs when the current


@ -60,35 +60,20 @@ expression
# <!DOCTYPE html [ ... ]> # <!DOCTYPE html [ ... ]>
doctype doctype
= T_DOCTYPE_START T_DOCTYPE_NAME doctype_follow = T_DOCTYPE_START T_DOCTYPE_NAME T_DOCTYPE_TYPE? string? string? doctype_inline T_DOCTYPE_END
{ {
name = val[1]
follow = val[2]
on_doctype( on_doctype(
:name => name, :name => val[1],
:type => follow[0], :type => val[2],
:public_id => follow[1], :public_id => val[3],
:system_id => follow[2], :system_id => val[4],
:inline_rules => follow[3] :inline_rules => val[5]
) )
} }
; ;
# Returns: [T_DOCTYPE_TYPE, string, string, doctype_inline]
doctype_follow
= T_DOCTYPE_END { [] }
| T_DOCTYPE_TYPE doctype_types { [val[0], *val[1]] }
| doctype_inline T_DOCTYPE_END { [nil, nil, nil, val[0]] }
;
doctype_inline doctype_inline
= T_DOCTYPE_INLINE+ { val[0].inject(:+) } = T_DOCTYPE_INLINE* { val[0].inject(:+) }
;
doctype_types
= string string? T_DOCTYPE_END { [val[0], val[1]] }
| T_DOCTYPE_END { nil }
; ;
# CDATA tags # CDATA tags


@ -10,6 +10,7 @@ module Oga
# document = Oga.parse_xml <<-EOF # document = Oga.parse_xml <<-EOF
# <people> # <people>
# <person age="25">Alice</person> # <person age="25">Alice</person>
# <ns:person xmlns:ns="http://example.net">Bob</ns:person>
# </people> # </people>
# EOF # EOF
# #
@ -25,15 +26,23 @@ module Oga
# #
# document.xpath('people/person[@age = $age]', 'age' => 25) # document.xpath('people/person[@age = $age]', 'age' => 25)
# #
# Using namespace aliases:
#
# namespaces = {'example' => 'http://example.net'}
# document.xpath('people/example:person', namespaces: namespaces)
#
# @param [String] expression The XPath expression to run. # @param [String] expression The XPath expression to run.
# #
# @param [Hash] variables Variables to bind. The keys of this Hash should # @param [Hash] variables Variables to bind. The keys of this Hash should
# be String values. # be String values.
# #
# @param [Hash] namespaces Namespace aliases. The keys of this Hash should
# be String values.
#
# @return [Oga::XML::NodeSet] # @return [Oga::XML::NodeSet]
def xpath(expression, variables = {}) def xpath(expression, variables = {}, namespaces: nil)
ast = XPath::Parser.parse_with_cache(expression) ast = XPath::Parser.parse_with_cache(expression)
block = XPath::Compiler.compile_with_cache(ast) block = XPath::Compiler.compile_with_cache(ast, namespaces: namespaces)
block.call(self, variables) block.call(self, variables)
end end
@ -54,8 +63,8 @@ module Oga
# #
# @see [#xpath] # @see [#xpath]
# @return [Oga::XML::Node|Oga::XML::Attribute] # @return [Oga::XML::Node|Oga::XML::Attribute]
def at_xpath(*args) def at_xpath(*args, namespaces: nil)
result = xpath(*args) result = xpath(*args, namespaces: namespaces)
result.is_a?(XML::NodeSet) ? result.first : result result.is_a?(XML::NodeSet) ? result.first : result
end end


@ -74,18 +74,7 @@ module Oga
super(*args) super(*args)
end end
# Delegate all callbacks to the handler object. # Manually define `on_element` so we can ensure that `after_element`
instance_methods.grep(/^(on_|after_)/).each do |method|
eval <<-EOF, nil, __FILE__, __LINE__ + 1
def #{method}(*args)
run_callback(:#{method}, *args)
return
end
EOF
end
# Manually overwrite `on_element` so we can ensure that `after_element`
# always receives the namespace and name. # always receives the namespace and name.
# #
# @see [Oga::XML::Parser#on_element] # @see [Oga::XML::Parser#on_element]
@ -96,7 +85,7 @@ module Oga
[namespace, name] [namespace, name]
end end
# Manually overwrite `after_element` so it can take a namespace and name. # Manually define `after_element` so it can take a namespace and name.
# This differs a bit from the regular `after_element` which only takes an # This differs a bit from the regular `after_element` which only takes an
# {Oga::XML::Element} instance. # {Oga::XML::Element} instance.
# #
@ -107,7 +96,7 @@ module Oga
return return
end end
# Manually overwrite this method since for this one we _do_ want the # Manually define this method since for this one we _do_ want the
# return value so it can be passed to `on_element`. # return value so it can be passed to `on_element`.
# #
# @see [Oga::XML::Parser#on_attribute] # @see [Oga::XML::Parser#on_attribute]
@ -157,6 +146,21 @@ module Oga
return return
end end
# Delegate remaining callbacks to the handler object.
existing_methods = instance_methods(false)
instance_methods.grep(/^(on_|after_)/).each do |method|
next if existing_methods.include?(method)
eval <<-EOF, nil, __FILE__, __LINE__ + 1
def #{method}(*args)
run_callback(:#{method}, *args)
return
end
EOF
end
private private
# @return [TrueClass|FalseClass] # @return [TrueClass|FalseClass]


@ -7,6 +7,8 @@ module Oga
def to_xml def to_xml
Generator.new(self).to_xml Generator.new(self).to_xml
end end
alias_method :to_s, :to_xml
end end
end end
end end
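
Aliasing `to_s` to `to_xml` inside the mixin means string interpolation and `puts` produce serialised markup instead of the default `#<Object ...>` form. A self-contained sketch with a hypothetical `Serialisable` module and `Tag` class standing in for `ToXML` and Oga's nodes:

```ruby
module Serialisable
  def to_xml
    "<#{name} />"
  end

  # Same technique as the diff: to_s becomes another name for to_xml.
  alias_method :to_s, :to_xml
end

class Tag
  include Serialisable

  attr_reader :name

  def initialize(name)
    @name = name
  end
end

tag = Tag.new('br')
text = "tag: #{tag}"
```

One thing to keep in mind: `alias_method` binds to the method body that exists at the time of the call, so a class that later overrides `to_xml` would need its own `to_s` alias to stay in sync.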


@ -27,6 +27,8 @@ module Oga
# #
# @yieldparam [Oga::XML::Node] The current node. # @yieldparam [Oga::XML::Node] The current node.
def each_node def each_node
return to_enum(:each_node) unless block_given?
visit = children.to_a.reverse visit = children.to_a.reverse
until visit.empty? until visit.empty?


@ -1,9 +1,7 @@
module Oga module Oga
module XML module XML
# Class containing information about an XML declaration tag. # Class containing information about an XML declaration tag.
class XmlDeclaration class XmlDeclaration < ProcessingInstruction
include ToXML
# @return [String] # @return [String]
attr_accessor :version attr_accessor :version
@ -20,9 +18,12 @@ module Oga
# @option options [String] :encoding # @option options [String] :encoding
# @option options [String] :standalone # @option options [String] :standalone
def initialize(options = {}) def initialize(options = {})
super
@version = options[:version] || '1.0' @version = options[:version] || '1.0'
@encoding = options[:encoding] || 'UTF-8' @encoding = options[:encoding] || 'UTF-8'
@standalone = options[:standalone] @standalone = options[:standalone]
@name = 'xml'
end end
# @return [String] # @return [String]


@ -42,12 +42,16 @@ module Oga
# Compiles and caches an AST. # Compiles and caches an AST.
# #
# @see [#compile] # @see [#compile]
def self.compile_with_cache(ast) def self.compile_with_cache(ast, namespaces: nil)
CACHE.get_or_set(ast) { new.compile(ast) } cache_key = namespaces ? [ast, namespaces] : ast
CACHE.get_or_set(cache_key) { new(namespaces: namespaces).compile(ast) }
end end
def initialize # @param [Hash] namespaces
def initialize(namespaces: nil)
reset reset
@namespaces = namespaces
end end
# Resets the internal state. # Resets the internal state.
@ -1385,7 +1389,23 @@ module Oga
end end
if ns and ns != STAR if ns and ns != STAR
ns_match = input.namespace_name.eq(string(ns)) if @namespaces
ns_uri = @namespaces[ns]
ns_match =
if ns_uri
input.namespace.and(input.namespace.uri.eq(string(ns_uri)))
else
self.false
end
else
ns_match =
if ns == XML::Element::XMLNS_PREFIX
input
else
input.namespace_name.eq(string(ns))
end
end
condition = condition ? condition.and(ns_match) : ns_match condition = condition ? condition.and(ns_match) : ns_match
end end
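
The change to `compile_with_cache` folds the namespace mapping into the cache key, since the same AST now compiles to different code depending on the aliases supplied. A sketch of why the key needs both parts, using a hypothetical `TinyCompiler` with a plain Hash as the cache (the real cache class and compiled output differ):

```ruby
class TinyCompiler
  CACHE = {}

  # Stand-in for compilation: the result depends on ast AND namespaces.
  def self.compile(ast, namespaces)
    "compiled #{ast} with #{namespaces.inspect}"
  end

  def self.compile_with_cache(ast, namespaces: nil)
    # Same key shape as the diff: the bare AST when no namespaces are given.
    cache_key = namespaces ? [ast, namespaces] : ast

    CACHE.fetch(cache_key) { CACHE[cache_key] = compile(ast, namespaces) }
  end
end

a = TinyCompiler.compile_with_cache('x:y', namespaces: { 'x' => 'http://a' })
b = TinyCompiler.compile_with_cache('x:y', namespaces: { 'x' => 'http://b' })
plain = TinyCompiler.compile_with_cache('x:y')
```

Keying on the AST alone would hand the second caller the block compiled for the first caller's namespace URI.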


@ -76,7 +76,7 @@ module Oga
if value.is_a?(Float) if value.is_a?(Float)
bool = !value.nan? && !value.zero? bool = !value.nan? && !value.zero?
elsif value.is_a?(Fixnum) elsif value.is_a?(Integer)
bool = !value.zero? bool = !value.zero?
elsif value.respond_to?(:empty?) elsif value.respond_to?(:empty?)
bool = !value.empty? bool = !value.empty?
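
Replacing `Fixnum` with `Integer` keeps this check working on current Rubies: Ruby 2.4 unified `Fixnum` and `Bignum` into `Integer`, and the old constants were removed in Ruby 3.2. On older versions both were subclasses of `Integer`, so the new check still matches there. A sketch of the conversion as a standalone method; the final `else` branch is an assumption, since the diff does not show it:

```ruby
def to_boolean(value)
  if value.is_a?(Float)
    !value.nan? && !value.zero?
  elsif value.is_a?(Integer)
    # Covers small and arbitrarily large integers alike.
    !value.zero?
  elsif value.respond_to?(:empty?)
    !value.empty?
  else
    # Assumed fallback, not taken from the diff.
    value ? true : false
  end
end
```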

BIN
oga-3.4.gem Normal file

Binary file not shown.


@ -6,7 +6,7 @@ Gem::Specification.new do |s|
s.authors = ['Yorick Peterse'] s.authors = ['Yorick Peterse']
s.email = 'yorickpeterse@gmail.com' s.email = 'yorickpeterse@gmail.com'
s.summary = 'Oga is an XML/HTML parser written in Ruby.' s.summary = 'Oga is an XML/HTML parser written in Ruby.'
s.homepage = 'https://github.com/yorickpeterse/oga/' s.homepage = 'https://gitlab.com/yorickpeterse/oga/'
s.description = s.summary s.description = s.summary
s.license = 'MPL-2.0' s.license = 'MPL-2.0'
@ -29,7 +29,6 @@ Gem::Specification.new do |s|
s.extensions = ['ext/c/extconf.rb'] s.extensions = ['ext/c/extconf.rb']
end end
s.has_rdoc = 'yard'
s.required_ruby_version = '>= 1.9.3' s.required_ruby_version = '>= 1.9.3'
s.add_dependency 'ast' s.add_dependency 'ast'


@ -14,15 +14,15 @@ describe Oga::Blacklist do
it 'returns true for a name not in the list' do it 'returns true for a name not in the list' do
list = described_class.new(%w{foo}) list = described_class.new(%w{foo})
list.allow?('bar').should == true expect(list.allow?('bar')).to eq(true)
list.allow?('BAR').should == true expect(list.allow?('BAR')).to eq(true)
end end
it 'returns false for a name in the list' do it 'returns false for a name in the list' do
list = described_class.new(%w{foo}) list = described_class.new(%w{foo})
list.allow?('foo').should == false expect(list.allow?('foo')).to eq(false)
list.allow?('FOO').should == false expect(list.allow?('FOO')).to eq(false)
end end
end end
@ -32,8 +32,8 @@ describe Oga::Blacklist do
list2 = described_class.new(%w{bar}) list2 = described_class.new(%w{bar})
list3 = list1 + list2 list3 = list1 + list2
list3.should be_an_instance_of(described_class) expect(list3).to be_an_instance_of(described_class)
list3.names.to_a.should == %w{foo FOO bar BAR} expect(list3.names.to_a).to eq(%w{foo FOO bar BAR})
end end
end end
end end


@ -10,15 +10,15 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing direct child nodes' do it 'returns a node set containing direct child nodes' do
evaluate_css(@document, 'root > a').should == node_set(@a1) expect(evaluate_css(@document, 'root > a')).to eq(node_set(@a1))
end end
it 'returns a node set containing direct child nodes relative to a node' do it 'returns a node set containing direct child nodes relative to a node' do
evaluate_css(@a1, '> a').should == @a1.children expect(evaluate_css(@a1, '> a')).to eq(@a1.children)
end end
it 'returns an empty node set for non matching child nodes' do it 'returns an empty node set for non matching child nodes' do
evaluate_css(@document, '> a').should == node_set expect(evaluate_css(@document, '> a')).to eq(node_set)
end end
end end
@ -31,15 +31,15 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing following siblings' do it 'returns a node set containing following siblings' do
evaluate_css(@document, 'root a + b').should == node_set(@b1) expect(evaluate_css(@document, 'root a + b')).to eq(node_set(@b1))
end end
it 'returns a node set containing following siblings relative to a node' do it 'returns a node set containing following siblings relative to a node' do
evaluate_css(@b1, '+ b').should == node_set(@b2) expect(evaluate_css(@b1, '+ b')).to eq(node_set(@b2))
end end
it 'returns an empty node set for non matching following siblings' do it 'returns an empty node set for non matching following siblings' do
evaluate_css(@document, 'root a + c').should == node_set expect(evaluate_css(@document, 'root a + c')).to eq(node_set)
end end
end end
@ -52,15 +52,15 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing following siblings' do it 'returns a node set containing following siblings' do
evaluate_css(@document, 'root a ~ b').should == node_set(@b1, @b2) expect(evaluate_css(@document, 'root a ~ b')).to eq(node_set(@b1, @b2))
end end
it 'returns a node set containing following siblings relative to a node' do it 'returns a node set containing following siblings relative to a node' do
evaluate_css(@b1, '~ b').should == node_set(@b2) expect(evaluate_css(@b1, '~ b')).to eq(node_set(@b2))
end end
it 'returns an empty node set for non matching following siblings' do it 'returns an empty node set for non matching following siblings' do
evaluate_css(@document, 'root a ~ c').should == node_set expect(evaluate_css(@document, 'root a ~ c')).to eq(node_set)
end end
end end
end end


@ -5,25 +5,25 @@ describe 'CSS selector evaluation' do
it 'returns a node set containing a node with a single class' do it 'returns a node set containing a node with a single class' do
document = parse('<x class="foo" />') document = parse('<x class="foo" />')
evaluate_css(document, '.foo').should == document.children expect(evaluate_css(document, '.foo')).to eq(document.children)
end end
it 'returns a node set containing a node having one of two classes' do it 'returns a node set containing a node having one of two classes' do
document = parse('<x class="foo bar" />') document = parse('<x class="foo bar" />')
evaluate_css(document, '.foo').should == document.children expect(evaluate_css(document, '.foo')).to eq(document.children)
end end
it 'returns a node set containing a node having both classes' do it 'returns a node set containing a node having both classes' do
document = parse('<x class="foo bar" />') document = parse('<x class="foo bar" />')
evaluate_css(document, '.foo.bar').should == document.children expect(evaluate_css(document, '.foo.bar')).to eq(document.children)
end end
it 'returns an empty node set for non matching classes' do it 'returns an empty node set for non matching classes' do
document = parse('<x class="bar" />') document = parse('<x class="bar" />')
evaluate_css(document, '.foo').should == node_set expect(evaluate_css(document, '.foo')).to eq(node_set)
end end
end end
end end


@ -7,11 +7,11 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing a node with a single ID' do it 'returns a node set containing a node with a single ID' do
evaluate_css(@document, '#foo').should == @document.children expect(evaluate_css(@document, '#foo')).to eq(@document.children)
end end
it 'returns an empty node set for non matching IDs' do it 'returns an empty node set for non matching IDs' do
evaluate_css(@document, '#bar').should == node_set expect(evaluate_css(@document, '#bar')).to eq(node_set)
end end
end end
end end


@ -8,11 +8,11 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing nodes with matching attributes' do it 'returns a node set containing nodes with matching attributes' do
evaluate_css(@document, 'x[a = "b"]').should == @document.children expect(evaluate_css(@document, 'x[a = "b"]')).to eq(@document.children)
end end
it 'returns an empty node set for non matching attribute values' do it 'returns an empty node set for non matching attribute values' do
evaluate_css(@document, 'x[a = "c"]').should == node_set expect(evaluate_css(@document, 'x[a = "c"]')).to eq(node_set)
end end
end end
@ -20,19 +20,19 @@ describe 'CSS selector evaluation' do
it 'returns a node set containing nodes with matching attributes' do it 'returns a node set containing nodes with matching attributes' do
document = parse('<x a="1 2 3" />') document = parse('<x a="1 2 3" />')
evaluate_css(document, 'x[a ~= "2"]').should == document.children expect(evaluate_css(document, 'x[a ~= "2"]')).to eq(document.children)
end end
it 'returns a node set containing nodes with single attribute values' do it 'returns a node set containing nodes with single attribute values' do
document = parse('<x a="1" />') document = parse('<x a="1" />')
evaluate_css(document, 'x[a ~= "1"]').should == document.children expect(evaluate_css(document, 'x[a ~= "1"]')).to eq(document.children)
end end
it 'returns an empty node set for non matching attributes' do it 'returns an empty node set for non matching attributes' do
document = parse('<x a="1 2 3" />') document = parse('<x a="1 2 3" />')
evaluate_css(document, 'x[a ~= "4"]').should == node_set expect(evaluate_css(document, 'x[a ~= "4"]')).to eq(node_set)
end end
end end
@ -42,11 +42,11 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing nodes with matching attributes' do it 'returns a node set containing nodes with matching attributes' do
evaluate_css(@document, 'x[a ^= "fo"]').should == @document.children expect(evaluate_css(@document, 'x[a ^= "fo"]')).to eq(@document.children)
end end
it 'returns an empty node set for non matching attributes' do it 'returns an empty node set for non matching attributes' do
evaluate_css(@document, 'x[a ^= "bar"]').should == node_set expect(evaluate_css(@document, 'x[a ^= "bar"]')).to eq(node_set)
end end
end end
@ -56,11 +56,11 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing nodes with matching attributes' do it 'returns a node set containing nodes with matching attributes' do
evaluate_css(@document, 'x[a $= "oo"]').should == @document.children expect(evaluate_css(@document, 'x[a $= "oo"]')).to eq(@document.children)
end end
it 'returns an empty node set for non matching attributes' do it 'returns an empty node set for non matching attributes' do
evaluate_css(@document, 'x[a $= "x"]').should == node_set expect(evaluate_css(@document, 'x[a $= "x"]')).to eq(node_set)
end end
end end
@ -70,11 +70,11 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing nodes with matching attributes' do it 'returns a node set containing nodes with matching attributes' do
evaluate_css(@document, 'x[a *= "o"]').should == @document.children expect(evaluate_css(@document, 'x[a *= "o"]')).to eq(@document.children)
end end
it 'returns an empty node set for non matching attributes' do it 'returns an empty node set for non matching attributes' do
evaluate_css(@document, 'x[a *= "x"]').should == node_set expect(evaluate_css(@document, 'x[a *= "x"]')).to eq(node_set)
end end
end end
@ -82,19 +82,19 @@ describe 'CSS selector evaluation' do
it 'returns a node set containing nodes with matching attributes' do it 'returns a node set containing nodes with matching attributes' do
document = parse('<x a="foo-bar" />') document = parse('<x a="foo-bar" />')
evaluate_css(document, 'x[a |= "foo"]').should == document.children expect(evaluate_css(document, 'x[a |= "foo"]')).to eq(document.children)
end end
it 'returns a node set containing nodes with single attribute values' do it 'returns a node set containing nodes with single attribute values' do
document = parse('<x a="foo" />') document = parse('<x a="foo" />')
evaluate_css(document, 'x[a |= "foo"]').should == document.children expect(evaluate_css(document, 'x[a |= "foo"]')).to eq(document.children)
end end
it 'returns an empty node set for non matching attributes' do it 'returns an empty node set for non matching attributes' do
document = parse('<x a="bar" />') document = parse('<x a="bar" />')
evaluate_css(document, 'x[a |= "foo"]').should == node_set expect(evaluate_css(document, 'x[a |= "foo"]')).to eq(node_set)
end end
end end
end end


@ -12,23 +12,23 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing the root node' do it 'returns a node set containing the root node' do
evaluate_css(@document, 'a').should == node_set(@a1) expect(evaluate_css(@document, 'a')).to eq(node_set(@a1))
end end
it 'returns a node set containing nested nodes' do it 'returns a node set containing nested nodes' do
evaluate_css(@document, 'a b').should == node_set(@b1, @b2) expect(evaluate_css(@document, 'a b')).to eq(node_set(@b1, @b2))
end end
it 'returns a node set containing the union of multiple paths' do it 'returns a node set containing the union of multiple paths' do
evaluate_css(@document, 'b, ns1|c').should == node_set(@b1, @b2, @c1) expect(evaluate_css(@document, 'b, ns1|c')).to eq(node_set(@b1, @b2, @c1))
end end
it 'returns a node set containing namespaced nodes' do it 'returns a node set containing namespaced nodes' do
evaluate_css(@document, 'a ns1|c').should == node_set(@c1) expect(evaluate_css(@document, 'a ns1|c')).to eq(node_set(@c1))
end end
it 'returns a node set containing wildcard nodes' do it 'returns a node set containing wildcard nodes' do
evaluate_css(@document, 'a *').should == node_set(@b1, @b2, @c1) expect(evaluate_css(@document, 'a *')).to eq(node_set(@b1, @b2, @c1))
end end
it 'returns a node set containing nodes with namespace wildcards' do it 'returns a node set containing nodes with namespace wildcards' do
@ -36,11 +36,11 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing nodes with a namespace name and name wildcard' do it 'returns a node set containing nodes with a namespace name and name wildcard' do
evaluate_css(@document, 'a ns1|*').should == node_set(@c1) expect(evaluate_css(@document, 'a ns1|*')).to eq(node_set(@c1))
end end
it 'returns a node set containing nodes using full wildcards' do it 'returns a node set containing nodes using full wildcards' do
evaluate_css(@document, 'a *|*').should == node_set(@b1, @b2, @c1) expect(evaluate_css(@document, 'a *|*')).to eq(node_set(@b1, @b2, @c1))
end end
end end
end end


@ -9,11 +9,11 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing nodes with an attribute' do it 'returns a node set containing nodes with an attribute' do
evaluate_css(@document, 'root a[class]').should == node_set(@a1) expect(evaluate_css(@document, 'root a[class]')).to eq(node_set(@a1))
end end
it 'returns a node set containing nodes with a matching attribute value' do it 'returns a node set containing nodes with a matching attribute value' do
evaluate_css(@document, 'root a[class="foo"]').should == node_set(@a1) expect(evaluate_css(@document, 'root a[class="foo"]')).to eq(node_set(@a1))
end end
end end
end end


@ -10,15 +10,15 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing empty nodes' do it 'returns a node set containing empty nodes' do
evaluate_css(@document, 'root :empty').should == node_set(@a1) expect(evaluate_css(@document, 'root :empty')).to eq(node_set(@a1))
end end
it 'returns a node set containing empty nodes with a node test' do it 'returns a node set containing empty nodes with a node test' do
evaluate_css(@document, 'root a:empty').should == node_set(@a1) expect(evaluate_css(@document, 'root a:empty')).to eq(node_set(@a1))
end end
it 'returns an empty node set containing non empty nodes' do it 'returns an empty node set containing non empty nodes' do
evaluate_css(@document, 'root b:empty').should == node_set expect(evaluate_css(@document, 'root b:empty')).to eq(node_set)
end end
end end
end end


@ -10,15 +10,15 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing the first child node' do it 'returns a node set containing the first child node' do
evaluate_css(@document, 'root :first-child').should == node_set(@a1) expect(evaluate_css(@document, 'root :first-child')).to eq(node_set(@a1))
end end
it 'returns a node set containing the first child node with a node test' do it 'returns a node set containing the first child node with a node test' do
evaluate_css(@document, 'root a:first-child').should == node_set(@a1) expect(evaluate_css(@document, 'root a:first-child')).to eq(node_set(@a1))
end end
it 'returns an empty node set for non first-child nodes' do it 'returns an empty node set for non first-child nodes' do
evaluate_css(@document, 'root b:first-child').should == node_set expect(evaluate_css(@document, 'root b:first-child')).to eq(node_set)
end end
end end
end end


@ -18,8 +18,8 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing all first <a> nodes' do it 'returns a node set containing all first <a> nodes' do
evaluate_css(@document, 'root a:first-of-type') expect(evaluate_css(@document, 'root a:first-of-type'))
.should == node_set(@a1, @a3) .to eq(node_set(@a1, @a3))
end end
end end
end end


@ -10,15 +10,15 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing the last child node' do it 'returns a node set containing the last child node' do
evaluate_css(@document, 'root :last-child').should == node_set(@b1) expect(evaluate_css(@document, 'root :last-child')).to eq(node_set(@b1))
end end
it 'returns a node set containing the last child node with a node test' do it 'returns a node set containing the last child node with a node test' do
evaluate_css(@document, 'root b:last-child').should == node_set(@b1) expect(evaluate_css(@document, 'root b:last-child')).to eq(node_set(@b1))
end end
it 'returns an empty node set for non last-child nodes' do it 'returns an empty node set for non last-child nodes' do
evaluate_css(@document, 'root a:last-child').should == node_set expect(evaluate_css(@document, 'root a:last-child')).to eq(node_set)
end end
end end
end end

@ -18,8 +18,8 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing all last <a> nodes' do it 'returns a node set containing all last <a> nodes' do
evaluate_css(@document, 'root a:last-of-type') expect(evaluate_css(@document, 'root a:last-of-type'))
.should == node_set(@a2, @a4) .to eq(node_set(@a2, @a4))
end end
end end
end end

@ -13,36 +13,36 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing the first child node' do it 'returns a node set containing the first child node' do
evaluate_css(@document, 'root :nth-child(1)').should == node_set(@a1) expect(evaluate_css(@document, 'root :nth-child(1)')).to eq(node_set(@a1))
end end
it 'returns a node set containing even nodes' do it 'returns a node set containing even nodes' do
evaluate_css(@document, 'root :nth-child(even)') expect(evaluate_css(@document, 'root :nth-child(even)'))
.should == node_set(@a2, @a4) .to eq(node_set(@a2, @a4))
end end
it 'returns a node set containing odd nodes' do it 'returns a node set containing odd nodes' do
evaluate_css(@document, 'root :nth-child(odd)') expect(evaluate_css(@document, 'root :nth-child(odd)'))
.should == node_set(@a1, @a3) .to eq(node_set(@a1, @a3))
end end
it 'returns a node set containing every 2 nodes starting at node 2' do it 'returns a node set containing every 2 nodes starting at node 2' do
evaluate_css(@document, 'root :nth-child(2n+2)') expect(evaluate_css(@document, 'root :nth-child(2n+2)'))
.should == node_set(@a2, @a4) .to eq(node_set(@a2, @a4))
end end
it 'returns a node set containing all nodes' do it 'returns a node set containing all nodes' do
evaluate_css(@document, 'root :nth-child(n)').should == @root.children expect(evaluate_css(@document, 'root :nth-child(n)')).to eq(@root.children)
end end
it 'returns a node set containing the first two nodes' do it 'returns a node set containing the first two nodes' do
evaluate_css(@document, 'root :nth-child(-n+2)') expect(evaluate_css(@document, 'root :nth-child(-n+2)'))
.should == node_set(@a1, @a2) .to eq(node_set(@a1, @a2))
end end
it 'returns a node set containing all nodes starting at node 2' do it 'returns a node set containing all nodes starting at node 2' do
evaluate_css(@document, 'root :nth-child(n+2)') expect(evaluate_css(@document, 'root :nth-child(n+2)'))
.should == node_set(@a2, @a3, @a4) .to eq(node_set(@a2, @a3, @a4))
end end
end end
end end
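
The `:nth-child` expectations above follow the CSS `an+b` rule: a node at 1-based position `p` matches when `p = a*n + b` for some integer `n >= 0`. The sketch below checks that rule in plain Ruby against the four-sibling layout these specs assume. The helper names (`nth_match?`, `matching_positions`, `matching_positions_from_end`) are hypothetical and are not taken from Oga, which compiles selectors to XPath instead.

```ruby
# Hypothetical helper, not part of Oga: true when `position` (1-based)
# equals a*n + b for some integer n >= 0.
def nth_match?(position, a, b)
  diff = position - b
  return diff.zero? if a.zero?

  (diff % a).zero? && (diff / a) >= 0
end

# Positions of the matching nodes among `count` siblings.
def matching_positions(count, a, b)
  (1..count).select { |position| nth_match?(position, a, b) }
end

# Same rule, but counting from the last sibling, as :nth-last-child does.
def matching_positions_from_end(count, a, b)
  (1..count).select { |position| nth_match?(count - position + 1, a, b) }
end

matching_positions(4, 2, 0)   # "even"
matching_positions(4, -1, 2)  # "-n+2"
```

With four siblings, `even` (a = 2, b = 0) selects positions 2 and 4 and `-n+2` (a = -1, b = 2) selects positions 1 and 2, which lines up with the `@a2, @a4` and `@a1, @a2` node sets expected in the spec.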

@ -13,37 +13,37 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing the last child node' do it 'returns a node set containing the last child node' do
evaluate_css(@document, 'root :nth-last-child(1)').should == node_set(@a4) expect(evaluate_css(@document, 'root :nth-last-child(1)')).to eq(node_set(@a4))
end end
it 'returns a node set containing even nodes' do it 'returns a node set containing even nodes' do
evaluate_css(@document, 'root :nth-last-child(even)') expect(evaluate_css(@document, 'root :nth-last-child(even)'))
.should == node_set(@a1, @a3) .to eq(node_set(@a1, @a3))
end end
it 'returns a node set containing odd nodes' do it 'returns a node set containing odd nodes' do
evaluate_css(@document, 'root :nth-last-child(odd)') expect(evaluate_css(@document, 'root :nth-last-child(odd)'))
.should == node_set(@a2, @a4) .to eq(node_set(@a2, @a4))
end end
it 'returns a node set containing every 2 nodes starting at node 3' do it 'returns a node set containing every 2 nodes starting at node 3' do
evaluate_css(@document, 'root :nth-last-child(2n+2)') expect(evaluate_css(@document, 'root :nth-last-child(2n+2)'))
.should == node_set(@a1, @a3) .to eq(node_set(@a1, @a3))
end end
it 'returns a node set containing all nodes' do it 'returns a node set containing all nodes' do
evaluate_css(@document, 'root :nth-last-child(n)') expect(evaluate_css(@document, 'root :nth-last-child(n)'))
.should == @root.children .to eq(@root.children)
end end
it 'returns a node set containing the first two nodes' do it 'returns a node set containing the first two nodes' do
evaluate_css(@document, 'root :nth-last-child(-n+2)') expect(evaluate_css(@document, 'root :nth-last-child(-n+2)'))
.should == node_set(@a3, @a4) .to eq(node_set(@a3, @a4))
end end
it 'returns a node set containing all nodes starting at node 2' do it 'returns a node set containing all nodes starting at node 2' do
evaluate_css(@document, 'root :nth-last-child(n+2)') expect(evaluate_css(@document, 'root :nth-last-child(n+2)'))
.should == node_set(@a1, @a2, @a3) .to eq(node_set(@a1, @a2, @a3))
end end
end end
end end

@ -22,38 +22,38 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing the first child node' do it 'returns a node set containing the first child node' do
evaluate_css(@document, 'root a:nth-last-of-type(1)') expect(evaluate_css(@document, 'root a:nth-last-of-type(1)'))
.should == node_set(@a3, @a4) .to eq(node_set(@a3, @a4))
end end
it 'returns a node set containing even nodes' do it 'returns a node set containing even nodes' do
evaluate_css(@document, 'root a:nth-last-of-type(even)') expect(evaluate_css(@document, 'root a:nth-last-of-type(even)'))
.should == node_set(@a2) .to eq(node_set(@a2))
end end
it 'returns a node set containing odd nodes' do it 'returns a node set containing odd nodes' do
evaluate_css(@document, 'root a:nth-last-of-type(odd)') expect(evaluate_css(@document, 'root a:nth-last-of-type(odd)'))
.should == node_set(@a1, @a3, @a4) .to eq(node_set(@a1, @a3, @a4))
end end
it 'returns a node set containing every 2 nodes starting at node 2' do it 'returns a node set containing every 2 nodes starting at node 2' do
evaluate_css(@document, 'root a:nth-last-of-type(2n+2)') expect(evaluate_css(@document, 'root a:nth-last-of-type(2n+2)'))
.should == node_set(@a2) .to eq(node_set(@a2))
end end
it 'returns a node set containing all nodes' do it 'returns a node set containing all nodes' do
evaluate_css(@document, 'root a:nth-last-of-type(n)') expect(evaluate_css(@document, 'root a:nth-last-of-type(n)'))
.should == node_set(@a1, @a2, @a3, @a4) .to eq(node_set(@a1, @a2, @a3, @a4))
end end
it 'returns a node set containing the first two nodes' do it 'returns a node set containing the first two nodes' do
evaluate_css(@document, 'root a:nth-last-of-type(-n+2)') expect(evaluate_css(@document, 'root a:nth-last-of-type(-n+2)'))
.should == node_set(@a2, @a3, @a4) .to eq(node_set(@a2, @a3, @a4))
end end
it 'returns a node set containing all nodes starting at node 2' do it 'returns a node set containing all nodes starting at node 2' do
evaluate_css(@document, 'root a:nth-last-of-type(n+2)') expect(evaluate_css(@document, 'root a:nth-last-of-type(n+2)'))
.should == node_set(@a1, @a2) .to eq(node_set(@a1, @a2))
end end
end end
end end

@ -22,38 +22,38 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing the first child node' do it 'returns a node set containing the first child node' do
evaluate_css(@document, 'root a:nth-of-type(1)') expect(evaluate_css(@document, 'root a:nth-of-type(1)'))
.should == node_set(@a1, @a4) .to eq(node_set(@a1, @a4))
end end
it 'returns a node set containing even nodes' do it 'returns a node set containing even nodes' do
evaluate_css(@document, 'root a:nth-of-type(even)') expect(evaluate_css(@document, 'root a:nth-of-type(even)'))
.should == node_set(@a2) .to eq(node_set(@a2))
end end
it 'returns a node set containing odd nodes' do it 'returns a node set containing odd nodes' do
evaluate_css(@document, 'root a:nth-of-type(odd)') expect(evaluate_css(@document, 'root a:nth-of-type(odd)'))
.should == node_set(@a1, @a3, @a4) .to eq(node_set(@a1, @a3, @a4))
end end
it 'returns a node set containing every 2 nodes starting at node 2' do it 'returns a node set containing every 2 nodes starting at node 2' do
evaluate_css(@document, 'root a:nth-of-type(2n+2)') expect(evaluate_css(@document, 'root a:nth-of-type(2n+2)'))
.should == node_set(@a2) .to eq(node_set(@a2))
end end
it 'returns a node set containing all nodes' do it 'returns a node set containing all nodes' do
evaluate_css(@document, 'root a:nth-of-type(n)') expect(evaluate_css(@document, 'root a:nth-of-type(n)'))
.should == node_set(@a1, @a2, @a3, @a4) .to eq(node_set(@a1, @a2, @a3, @a4))
end end
it 'returns a node set containing the first two nodes' do it 'returns a node set containing the first two nodes' do
evaluate_css(@document, 'root a:nth-of-type(-n+2)') expect(evaluate_css(@document, 'root a:nth-of-type(-n+2)'))
.should == node_set(@a1, @a2, @a4) .to eq(node_set(@a1, @a2, @a4))
end end
it 'returns a node set containing all nodes starting at node 2' do it 'returns a node set containing all nodes starting at node 2' do
evaluate_css(@document, 'root a:nth-of-type(n+2)') expect(evaluate_css(@document, 'root a:nth-of-type(n+2)'))
.should == node_set(@a2, @a3) .to eq(node_set(@a2, @a3))
end end
end end
end end

@ -11,11 +11,11 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing the first <a> node' do it 'returns a node set containing the first <a> node' do
evaluate_css(@document, 'root a:nth(1)').should == node_set(@a1) expect(evaluate_css(@document, 'root a:nth(1)')).to eq(node_set(@a1))
end end
it 'returns a node set containing the second <a> node' do it 'returns a node set containing the second <a> node' do
evaluate_css(@document, 'root a:nth(2)').should == node_set(@a2) expect(evaluate_css(@document, 'root a:nth(2)')).to eq(node_set(@a2))
end end
end end
end end

@ -11,7 +11,7 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing <c> nodes' do it 'returns a node set containing <c> nodes' do
evaluate_css(@document, 'root :only-child').should == node_set(@c1, @c2) expect(evaluate_css(@document, 'root :only-child')).to eq(node_set(@c1, @c2))
end end
end end
end end

@ -11,7 +11,7 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing <c> nodes' do it 'returns a node set containing <c> nodes' do
evaluate_css(@document, 'root a :only-of-type').should == node_set(@c1) expect(evaluate_css(@document, 'root a :only-of-type')).to eq(node_set(@c1))
end end
end end
end end

@ -7,7 +7,7 @@ describe 'CSS selector evaluation' do
end end
it 'returns a node set containing the root node' do it 'returns a node set containing the root node' do
evaluate_css(@document, ':root').should == @document.children expect(evaluate_css(@document, ':root')).to eq(@document.children)
end end
end end
end end

@ -3,63 +3,63 @@ require 'spec_helper'
describe Oga::CSS::Lexer do describe Oga::CSS::Lexer do
describe 'axes' do describe 'axes' do
it 'lexes the > axis' do it 'lexes the > axis' do
lex_css('>').should == [[:T_GREATER, nil]] expect(lex_css('>')).to eq([[:T_GREATER, nil]])
end end
it 'lexes the expression "> y"' do it 'lexes the expression "> y"' do
lex_css('> y').should == [[:T_GREATER, nil], [:T_IDENT, 'y']] expect(lex_css('> y')).to eq([[:T_GREATER, nil], [:T_IDENT, 'y']])
end end
it 'lexes the expression "x > y"' do it 'lexes the expression "x > y"' do
lex_css('x > y').should == [ expect(lex_css('x > y')).to eq([
[:T_IDENT, 'x'], [:T_IDENT, 'x'],
[:T_GREATER, nil], [:T_GREATER, nil],
[:T_IDENT, 'y'] [:T_IDENT, 'y']
] ])
end end
it 'lexes the expression "x>y"' do it 'lexes the expression "x>y"' do
lex_css('x>y').should == lex_css('x > y') expect(lex_css('x>y')).to eq(lex_css('x > y'))
end end
it 'lexes the + axis' do it 'lexes the + axis' do
lex_css('+').should == [[:T_PLUS, nil]] expect(lex_css('+')).to eq([[:T_PLUS, nil]])
end end
it 'lexes the expression "+ y"' do it 'lexes the expression "+ y"' do
lex_css('+ y').should == [[:T_PLUS, nil], [:T_IDENT, 'y']] expect(lex_css('+ y')).to eq([[:T_PLUS, nil], [:T_IDENT, 'y']])
end end
it 'lexes the expression "x + y"' do it 'lexes the expression "x + y"' do
lex_css('x + y').should == [ expect(lex_css('x + y')).to eq([
[:T_IDENT, 'x'], [:T_IDENT, 'x'],
[:T_PLUS, nil], [:T_PLUS, nil],
[:T_IDENT, 'y'] [:T_IDENT, 'y']
] ])
end end
it 'lexes the expression "x+y"' do it 'lexes the expression "x+y"' do
lex_css('x+y').should == lex_css('x + y') expect(lex_css('x+y')).to eq(lex_css('x + y'))
end end
it 'lexes the ~ axis' do it 'lexes the ~ axis' do
lex_css('~').should == [[:T_TILDE, nil]] expect(lex_css('~')).to eq([[:T_TILDE, nil]])
end end
it 'lexes the expression "~ y"' do it 'lexes the expression "~ y"' do
lex_css('~ y').should == [[:T_TILDE, nil], [:T_IDENT, 'y']] expect(lex_css('~ y')).to eq([[:T_TILDE, nil], [:T_IDENT, 'y']])
end end
it 'lexes the expression "x ~ y"' do it 'lexes the expression "x ~ y"' do
lex_css('x ~ y').should == [ expect(lex_css('x ~ y')).to eq([
[:T_IDENT, 'x'], [:T_IDENT, 'x'],
[:T_TILDE, nil], [:T_TILDE, nil],
[:T_IDENT, 'y'] [:T_IDENT, 'y']
] ])
end end
it 'lexes the expression "x~y"' do it 'lexes the expression "x~y"' do
lex_css('x~y').should == lex_css('x ~ y') expect(lex_css('x~y')).to eq(lex_css('x ~ y'))
end end
end end
end end

@ -3,19 +3,19 @@ require 'spec_helper'
describe Oga::CSS::Lexer do describe Oga::CSS::Lexer do
describe 'namespaces' do describe 'namespaces' do
it 'lexes a path containing a namespace name' do it 'lexes a path containing a namespace name' do
lex_css('foo|bar').should == [ expect(lex_css('foo|bar')).to eq([
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_PIPE, nil], [:T_PIPE, nil],
[:T_IDENT, 'bar'] [:T_IDENT, 'bar']
] ])
end end
it 'lexes a path containing a namespace wildcard' do it 'lexes a path containing a namespace wildcard' do
lex_css('*|foo').should == [ expect(lex_css('*|foo')).to eq([
[:T_IDENT, '*'], [:T_IDENT, '*'],
[:T_PIPE, nil], [:T_PIPE, nil],
[:T_IDENT, 'foo'] [:T_IDENT, 'foo']
] ])
end end
end end
end end

@ -3,87 +3,87 @@ require 'spec_helper'
describe Oga::CSS::Lexer do describe Oga::CSS::Lexer do
describe 'operators' do describe 'operators' do
it 'lexes the = operator' do it 'lexes the = operator' do
lex_css('[foo="bar"]').should == [ expect(lex_css('[foo="bar"]')).to eq([
[:T_LBRACK, nil], [:T_LBRACK, nil],
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_EQ, nil], [:T_EQ, nil],
[:T_STRING, 'bar'], [:T_STRING, 'bar'],
[:T_RBRACK, nil] [:T_RBRACK, nil]
] ])
end end
it 'lexes the ~= operator' do it 'lexes the ~= operator' do
lex_css('[foo~="bar"]').should == [ expect(lex_css('[foo~="bar"]')).to eq([
[:T_LBRACK, nil], [:T_LBRACK, nil],
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_SPACE_IN, nil], [:T_SPACE_IN, nil],
[:T_STRING, 'bar'], [:T_STRING, 'bar'],
[:T_RBRACK, nil] [:T_RBRACK, nil]
] ])
end end
it 'lexes the ^= operator' do it 'lexes the ^= operator' do
lex_css('[foo^="bar"]').should == [ expect(lex_css('[foo^="bar"]')).to eq([
[:T_LBRACK, nil], [:T_LBRACK, nil],
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_STARTS_WITH, nil], [:T_STARTS_WITH, nil],
[:T_STRING, 'bar'], [:T_STRING, 'bar'],
[:T_RBRACK, nil] [:T_RBRACK, nil]
] ])
end end
it 'lexes the $= operator' do it 'lexes the $= operator' do
lex_css('[foo$="bar"]').should == [ expect(lex_css('[foo$="bar"]')).to eq([
[:T_LBRACK, nil], [:T_LBRACK, nil],
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_ENDS_WITH, nil], [:T_ENDS_WITH, nil],
[:T_STRING, 'bar'], [:T_STRING, 'bar'],
[:T_RBRACK, nil], [:T_RBRACK, nil],
] ])
end end
it 'lexes the *= operator' do it 'lexes the *= operator' do
lex_css('[foo*="bar"]').should == [ expect(lex_css('[foo*="bar"]')).to eq([
[:T_LBRACK, nil], [:T_LBRACK, nil],
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_IN, nil], [:T_IN, nil],
[:T_STRING, 'bar'], [:T_STRING, 'bar'],
[:T_RBRACK, nil] [:T_RBRACK, nil]
] ])
end end
it 'lexes the |= operator' do it 'lexes the |= operator' do
lex_css('[foo|="bar"]').should == [ expect(lex_css('[foo|="bar"]')).to eq([
[:T_LBRACK, nil], [:T_LBRACK, nil],
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_HYPHEN_IN, nil], [:T_HYPHEN_IN, nil],
[:T_STRING, 'bar'], [:T_STRING, 'bar'],
[:T_RBRACK, nil] [:T_RBRACK, nil]
] ])
end end
it 'lexes the = operator surrounded by whitespace' do it 'lexes the = operator surrounded by whitespace' do
lex_css('[foo = "bar"]').should == lex_css('[foo="bar"]') expect(lex_css('[foo = "bar"]')).to eq(lex_css('[foo="bar"]'))
end end
it 'lexes the ~= operator surrounded by whitespace' do it 'lexes the ~= operator surrounded by whitespace' do
lex_css('[foo ~= "bar"]').should == lex_css('[foo~="bar"]') expect(lex_css('[foo ~= "bar"]')).to eq(lex_css('[foo~="bar"]'))
end end
it 'lexes the ^= operator surrounded by whitespace' do it 'lexes the ^= operator surrounded by whitespace' do
lex_css('[foo ^= "bar"]').should == lex_css('[foo^="bar"]') expect(lex_css('[foo ^= "bar"]')).to eq(lex_css('[foo^="bar"]'))
end end
it 'lexes the $= operator surrounded by whitespace' do it 'lexes the $= operator surrounded by whitespace' do
lex_css('[foo $= "bar"]').should == lex_css('[foo$="bar"]') expect(lex_css('[foo $= "bar"]')).to eq(lex_css('[foo$="bar"]'))
end end
it 'lexes the *= operator surrounded by whitespace' do it 'lexes the *= operator surrounded by whitespace' do
lex_css('[foo *= "bar"]').should == lex_css('[foo*="bar"]') expect(lex_css('[foo *= "bar"]')).to eq(lex_css('[foo*="bar"]'))
end end
it 'lexes the |= operator surrounded by whitespace' do it 'lexes the |= operator surrounded by whitespace' do
lex_css('[foo |= "bar"]').should == lex_css('[foo|="bar"]') expect(lex_css('[foo |= "bar"]')).to eq(lex_css('[foo|="bar"]'))
end end
end end
end end
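
The operator specs above pin down two things: each attribute operator maps to its own token type, and whitespace around the operator does not change the token stream. The sketch below reproduces that behaviour for attribute selectors only, using `StringScanner` from the standard library. It is an illustration under those assumptions; `sketch_lex_attribute` and `OPERATOR_TOKENS` are hypothetical names and the code is not Oga's lexer. Only the token names are taken from the specs.

```ruby
require 'strscan'

# Token names as they appear in the specs; the table itself is hypothetical.
OPERATOR_TOKENS = {
  '='  => :T_EQ,
  '~=' => :T_SPACE_IN,
  '^=' => :T_STARTS_WITH,
  '$=' => :T_ENDS_WITH,
  '*=' => :T_IN,
  '|=' => :T_HYPHEN_IN
}.freeze

# Hypothetical sketch: lexes an attribute selector such as [foo~="bar"]
# into [type, value] pairs, skipping whitespace between tokens.
def sketch_lex_attribute(input)
  scanner = StringScanner.new(input)
  tokens  = []

  until scanner.eos?
    if scanner.skip(/\s+/)
      next
    elsif scanner.scan(/\[/)
      tokens << [:T_LBRACK, nil]
    elsif scanner.scan(/\]/)
      tokens << [:T_RBRACK, nil]
    elsif scanner.scan(/[~^$*|]?=/)
      tokens << [OPERATOR_TOKENS.fetch(scanner.matched), nil]
    elsif scanner.scan(/"([^"]*)"|'([^']*)'/)
      # Group 1 holds a double-quoted body, group 2 a single-quoted one.
      tokens << [:T_STRING, scanner[1] || scanner[2]]
    elsif scanner.scan(/[\w\-]+/)
      tokens << [:T_IDENT, scanner.matched]
    else
      raise ArgumentError, "unexpected input at offset #{scanner.pos}"
    end
  end

  tokens
end

sketch_lex_attribute('[foo ~= "bar"]')
```

Because whitespace is skipped before every token, `[foo ~= "bar"]` and `[foo~="bar"]` produce identical output, which is the property the "surrounded by whitespace" examples assert.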

@ -5,73 +5,73 @@ require 'spec_helper'
describe Oga::CSS::Lexer do describe Oga::CSS::Lexer do
describe 'paths' do describe 'paths' do
it 'lexes a simple path' do it 'lexes a simple path' do
lex_css('h3').should == [[:T_IDENT, 'h3']] expect(lex_css('h3')).to eq([[:T_IDENT, 'h3']])
end end
it 'lexes a path with Unicode characters' do it 'lexes a path with Unicode characters' do
lex_css('áâã').should == [[:T_IDENT, 'áâã']] expect(lex_css('áâã')).to eq([[:T_IDENT, 'áâã']])
end end
it 'lexes a path with Unicode and ASCII characters' do it 'lexes a path with Unicode and ASCII characters' do
lex_css('áâãfoo').should == [[:T_IDENT, 'áâãfoo']] expect(lex_css('áâãfoo')).to eq([[:T_IDENT, 'áâãfoo']])
end end
it 'lexes a simple path starting with an underscore' do it 'lexes a simple path starting with an underscore' do
lex_css('_h3').should == [[:T_IDENT, '_h3']] expect(lex_css('_h3')).to eq([[:T_IDENT, '_h3']])
end end
it 'lexes a path with an escaped identifier' do it 'lexes a path with an escaped identifier' do
lex_css('foo\.bar\.baz').should == [[:T_IDENT, 'foo.bar.baz']] expect(lex_css('foo\.bar\.baz')).to eq([[:T_IDENT, 'foo.bar.baz']])
end end
it 'lexes a path with an escaped identifier followed by another identifier' do it 'lexes a path with an escaped identifier followed by another identifier' do
lex_css('foo\.bar baz').should == [ expect(lex_css('foo\.bar baz')).to eq([
[:T_IDENT, 'foo.bar'], [:T_IDENT, 'foo.bar'],
[:T_SPACE, nil], [:T_SPACE, nil],
[:T_IDENT, 'baz'] [:T_IDENT, 'baz']
] ])
end end
it 'lexes a path with two members' do it 'lexes a path with two members' do
lex_css('div h3').should == [ expect(lex_css('div h3')).to eq([
[:T_IDENT, 'div'], [:T_IDENT, 'div'],
[:T_SPACE, nil], [:T_SPACE, nil],
[:T_IDENT, 'h3'] [:T_IDENT, 'h3']
] ])
end end
it 'lexes a path with two members separated by multiple spaces' do it 'lexes a path with two members separated by multiple spaces' do
lex_css('div  h3').should == [ expect(lex_css('div  h3')).to eq([
[:T_IDENT, 'div'], [:T_IDENT, 'div'],
[:T_SPACE, nil], [:T_SPACE, nil],
[:T_IDENT, 'h3'] [:T_IDENT, 'h3']
] ])
end end
it 'lexes two paths' do it 'lexes two paths' do
lex_css('foo, bar').should == [ expect(lex_css('foo, bar')).to eq([
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_COMMA, nil], [:T_COMMA, nil],
[:T_IDENT, 'bar'] [:T_IDENT, 'bar']
] ])
end end
it 'lexes a path selecting an ID' do it 'lexes a path selecting an ID' do
lex_css('#foo').should == [ expect(lex_css('#foo')).to eq([
[:T_HASH, nil], [:T_HASH, nil],
[:T_IDENT, 'foo'] [:T_IDENT, 'foo']
] ])
end end
it 'lexes a path selecting a class' do it 'lexes a path selecting a class' do
lex_css('.foo').should == [ expect(lex_css('.foo')).to eq([
[:T_DOT, nil], [:T_DOT, nil],
[:T_IDENT, 'foo'] [:T_IDENT, 'foo']
] ])
end end
it 'lexes a wildcard path' do it 'lexes a wildcard path' do
lex_css('*').should == [[:T_IDENT, '*']] expect(lex_css('*')).to eq([[:T_IDENT, '*']])
end end
end end
end end

@ -3,12 +3,12 @@ require 'spec_helper'
describe Oga::CSS::Lexer do describe Oga::CSS::Lexer do
describe 'predicates' do describe 'predicates' do
it 'lexes a path containing a simple predicate' do it 'lexes a path containing a simple predicate' do
lex_css('foo[bar]').should == [ expect(lex_css('foo[bar]')).to eq([
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_LBRACK, nil], [:T_LBRACK, nil],
[:T_IDENT, 'bar'], [:T_IDENT, 'bar'],
[:T_RBRACK, nil] [:T_RBRACK, nil]
] ])
end end
end end
end end

@ -3,86 +3,86 @@ require 'spec_helper'
describe Oga::CSS::Lexer do describe Oga::CSS::Lexer do
describe 'pseudo classes' do describe 'pseudo classes' do
it 'lexes the :root pseudo class' do it 'lexes the :root pseudo class' do
lex_css(':root').should == [ expect(lex_css(':root')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'root'] [:T_IDENT, 'root']
] ])
end end
it 'lexes the :nth-child pseudo class' do it 'lexes the :nth-child pseudo class' do
lex_css(':nth-child(1)').should == [ expect(lex_css(':nth-child(1)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
[:T_INT, 1], [:T_INT, 1],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :nth-child pseudo class with extra whitespace' do it 'lexes the :nth-child pseudo class with extra whitespace' do
lex_css(':nth-child( 1)').should == [ expect(lex_css(':nth-child( 1)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
[:T_INT, 1], [:T_INT, 1],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :nth-child(odd) pseudo class' do it 'lexes the :nth-child(odd) pseudo class' do
lex_css(':nth-child(odd)').should == [ expect(lex_css(':nth-child(odd)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
[:T_ODD, nil], [:T_ODD, nil],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :nth-child(even) pseudo class' do it 'lexes the :nth-child(even) pseudo class' do
lex_css(':nth-child(even)').should == [ expect(lex_css(':nth-child(even)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
[:T_EVEN, nil], [:T_EVEN, nil],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :nth-child(n) pseudo class' do it 'lexes the :nth-child(n) pseudo class' do
lex_css(':nth-child(n)').should == [ expect(lex_css(':nth-child(n)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
[:T_NTH, nil], [:T_NTH, nil],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :nth-child(-n) pseudo class' do it 'lexes the :nth-child(-n) pseudo class' do
lex_css(':nth-child(-n)').should == [ expect(lex_css(':nth-child(-n)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
[:T_MINUS, nil], [:T_MINUS, nil],
[:T_NTH, nil], [:T_NTH, nil],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :nth-child(2n) pseudo class' do it 'lexes the :nth-child(2n) pseudo class' do
lex_css(':nth-child(2n)').should == [ expect(lex_css(':nth-child(2n)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
[:T_INT, 2], [:T_INT, 2],
[:T_NTH, nil], [:T_NTH, nil],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :nth-child(2n+1) pseudo class' do it 'lexes the :nth-child(2n+1) pseudo class' do
lex_css(':nth-child(2n+1)').should == [ expect(lex_css(':nth-child(2n+1)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
@ -90,11 +90,11 @@ describe Oga::CSS::Lexer do
[:T_NTH, nil], [:T_NTH, nil],
[:T_INT, 1], [:T_INT, 1],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :nth-child(2n-1) pseudo class' do it 'lexes the :nth-child(2n-1) pseudo class' do
lex_css(':nth-child(2n-1)').should == [ expect(lex_css(':nth-child(2n-1)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
@ -102,11 +102,11 @@ describe Oga::CSS::Lexer do
[:T_NTH, nil], [:T_NTH, nil],
[:T_INT, -1], [:T_INT, -1],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :nth-child(-2n-1) pseudo class' do it 'lexes the :nth-child(-2n-1) pseudo class' do
lex_css(':nth-child(-2n-1)').should == [ expect(lex_css(':nth-child(-2n-1)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'nth-child'], [:T_IDENT, 'nth-child'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
@ -114,28 +114,28 @@ describe Oga::CSS::Lexer do
[:T_NTH, nil], [:T_NTH, nil],
[:T_INT, -1], [:T_INT, -1],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :lang(fr) pseudo class' do it 'lexes the :lang(fr) pseudo class' do
lex_css(':lang(fr)').should == [ expect(lex_css(':lang(fr)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'lang'], [:T_IDENT, 'lang'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
[:T_IDENT, 'fr'], [:T_IDENT, 'fr'],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
it 'lexes the :not(#foo) pseudo class' do it 'lexes the :not(#foo) pseudo class' do
lex_css(':not(#foo)').should == [ expect(lex_css(':not(#foo)')).to eq([
[:T_COLON, nil], [:T_COLON, nil],
[:T_IDENT, 'not'], [:T_IDENT, 'not'],
[:T_LPAREN, nil], [:T_LPAREN, nil],
[:T_HASH, nil], [:T_HASH, nil],
[:T_IDENT, 'foo'], [:T_IDENT, 'foo'],
[:T_RPAREN, nil] [:T_RPAREN, nil]
] ])
end end
end end
end end

@ -3,19 +3,19 @@ require 'spec_helper'
describe Oga::CSS::Lexer do describe Oga::CSS::Lexer do
describe 'strings' do describe 'strings' do
it 'lexes a single quoted string' do it 'lexes a single quoted string' do
lex_css("['foo']").should == [ expect(lex_css("['foo']")).to eq([
[:T_LBRACK, nil], [:T_LBRACK, nil],
[:T_STRING, 'foo'], [:T_STRING, 'foo'],
[:T_RBRACK, nil] [:T_RBRACK, nil]
] ])
end end
it 'lexes a double quoted string' do it 'lexes a double quoted string' do
lex_css('["foo"]').should == [ expect(lex_css('["foo"]')).to eq([
[:T_LBRACK, nil], [:T_LBRACK, nil],
[:T_STRING, 'foo'], [:T_STRING, 'foo'],
[:T_RBRACK, nil] [:T_RBRACK, nil]
] ])
end end
end end
end end

@ -0,0 +1,7 @@
require 'spec_helper'
describe Oga::CSS::Lexer do
it 'ignores leading and trailing whitespace' do
expect(lex_css(' foo ')).to eq([[:T_IDENT, 'foo']])
end
end

@ -3,72 +3,72 @@ require 'spec_helper'
describe Oga::CSS::Parser do describe Oga::CSS::Parser do
describe 'axes' do describe 'axes' do
it 'parses the > axis' do it 'parses the > axis' do
parse_css('x > y').should == parse_xpath('descendant::x/y') expect(parse_css('x > y')).to eq(parse_xpath('descendant::x/y'))
end end
it 'parses the > axis without whitespace' do it 'parses the > axis without whitespace' do
parse_css('x>y').should == parse_css('x > y') expect(parse_css('x>y')).to eq(parse_css('x > y'))
end end
it 'parses the > axis called on another > axis' do it 'parses the > axis called on another > axis' do
parse_css('a > b > c').should == parse_xpath('descendant::a/b/c') expect(parse_css('a > b > c')).to eq(parse_xpath('descendant::a/b/c'))
end end
it 'parses an > axis followed by an element with an ID' do it 'parses an > axis followed by an element with an ID' do
parse_css('x > foo#bar').should == parse_xpath( expect(parse_css('x > foo#bar')).to eq(parse_xpath(
'descendant::x/foo[@id="bar"]' 'descendant::x/foo[@id="bar"]'
) ))
end end
it 'parses an > axis followed by an element with a class' do it 'parses an > axis followed by an element with a class' do
parse_css('x > foo.bar').should == parse_xpath( expect(parse_css('x > foo.bar')).to eq(parse_xpath(
'descendant::x/foo[contains(concat(" ", @class, " "), " bar ")]' 'descendant::x/foo[contains(concat(" ", @class, " "), " bar ")]'
) ))
end end
it 'parses the + axis' do it 'parses the + axis' do
parse_css('x + y').should == parse_xpath( expect(parse_css('x + y')).to eq(parse_xpath(
'descendant::x/following-sibling::*[1]/self::y' 'descendant::x/following-sibling::*[1]/self::y'
) ))
end end
it 'parses the + axis without whitespace' do it 'parses the + axis without whitespace' do
parse_css('x+y').should == parse_css('x + y') expect(parse_css('x+y')).to eq(parse_css('x + y'))
end end
it 'parses the + axis with an identifier only at the right-hand side' do it 'parses the + axis with an identifier only at the right-hand side' do
parse_css('+ y').should == parse_xpath( expect(parse_css('+ y')).to eq(parse_xpath(
'following-sibling::*[1]/self::y' 'following-sibling::*[1]/self::y'
) ))
end end
it 'parses the + axis called on another + axis' do it 'parses the + axis called on another + axis' do
parse_css('a + b + c').should == parse_xpath( expect(parse_css('a + b + c')).to eq(parse_xpath(
'descendant::a/following-sibling::*[1]/self::b/' \ 'descendant::a/following-sibling::*[1]/self::b/' \
'following-sibling::*[1]/self::c' 'following-sibling::*[1]/self::c'
) ))
end end
it 'parses the ~ axis' do it 'parses the ~ axis' do
parse_css('x ~ y').should == parse_xpath( expect(parse_css('x ~ y')).to eq(parse_xpath(
'descendant::x/following-sibling::y' 'descendant::x/following-sibling::y'
) ))
end end
it 'parses the ~ axis without whitespace' do it 'parses the ~ axis without whitespace' do
parse_css('x~y').should == parse_css('x ~ y') expect(parse_css('x~y')).to eq(parse_css('x ~ y'))
end end
it 'parses the ~ axis followed by another node test' do it 'parses the ~ axis followed by another node test' do
parse_css('x ~ y z').should == parse_xpath( expect(parse_css('x ~ y z')).to eq(parse_xpath(
'descendant::x/following-sibling::y/descendant::z' 'descendant::x/following-sibling::y/descendant::z'
) ))
end end
it 'parses the ~ axis called on another ~ axis' do it 'parses the ~ axis called on another ~ axis' do
parse_css('a ~ b ~ c').should == parse_xpath( expect(parse_css('a ~ b ~ c')).to eq(parse_xpath(
'descendant::a/following-sibling::b/following-sibling::c' 'descendant::a/following-sibling::b/following-sibling::c'
) ))
end end
end end
end end
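
The axis specs above compare the parsed CSS against the parse of an equivalent XPath expression, so each CSS combinator corresponds to a fixed XPath step. The sketch below spells out that mapping as a string translation for bare element names. It is a simplified illustration: the specs show the real parser producing an AST rather than a string, and `sketch_css_to_xpath` is a hypothetical name.

```ruby
# Hypothetical sketch: translates element names joined by the CSS
# combinators (descendant whitespace, >, + and ~) into an XPath string.
# Handles bare identifiers only, with no classes, IDs or predicates.
def sketch_css_to_xpath(selector)
  # A whitespace-only gap becomes '' after strip and means "descendant".
  tokens     = selector.strip.scan(/\s*[>+~]\s*|\s+|[^\s>+~]+/).map(&:strip)
  steps      = []
  combinator = :start

  tokens.each do |token|
    case token
    when '>', '+', '~' then combinator = token
    when ''            then combinator = ' '
    else
      steps << case combinator
               when :start, ' ' then "descendant::#{token}"
               when '>'         then token
               when '+'         then "following-sibling::*[1]/self::#{token}"
               when '~'         then "following-sibling::#{token}"
               end
    end
  end

  steps.join('/')
end

sketch_css_to_xpath('x ~ y z')
```

Note how `+` needs two steps: `following-sibling::*[1]` picks the immediately following sibling of any name, and `self::y` then filters it by name, whereas `~` can test the name directly on the `following-sibling` axis.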

@ -7,21 +7,21 @@ describe Oga::CSS::Parser do
end end
it 'parses an expression' do it 'parses an expression' do
described_class.parse_with_cache('foo') expect(described_class.parse_with_cache('foo'))
.should == s(:axis, 'descendant', s(:test, nil, 'foo')) .to eq(s(:axis, 'descendant', s(:test, nil, 'foo')))
end end
it 'caches an expression after parsing it' do it 'caches an expression after parsing it' do
described_class.any_instance expect_any_instance_of(described_class)
.should_receive(:parse) .to receive(:parse)
.once .once
.and_call_original .and_call_original
described_class.parse_with_cache('foo') expect(described_class.parse_with_cache('foo'))
.should == s(:axis, 'descendant', s(:test, nil, 'foo')) .to eq(s(:axis, 'descendant', s(:test, nil, 'foo')))
described_class.parse_with_cache('foo') expect(described_class.parse_with_cache('foo'))
.should == s(:axis, 'descendant', s(:test, nil, 'foo')) .to eq(s(:axis, 'descendant', s(:test, nil, 'foo')))
end end
end end
end end
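
The caching spec above asserts that two `parse_with_cache` calls with the same expression trigger `parse` only once. A minimal sketch of that behaviour, assuming a plain Hash as the cache (the cache actually used by the library is not visible in this diff), could look like the following. `SketchParser` and its `parse_calls` counter are hypothetical and exist only to make the single parse observable without a mocking library.

```ruby
# Hypothetical stand-in for a parser class with a class-level cache.
class SketchParser
  @cache       = {}
  @parse_calls = 0

  class << self
    attr_reader :parse_calls

    # Returns the cached result for `expression`, parsing it on first use.
    def parse_with_cache(expression)
      @cache.fetch(expression) do
        @parse_calls += 1
        @cache[expression] = new(expression).parse
      end
    end
  end

  def initialize(expression)
    @expression = expression
  end

  # Stand-in for real parsing: wraps the input in a small array "AST"
  # shaped like the s(:axis, 'descendant', s(:test, nil, 'foo')) node above.
  def parse
    [:axis, 'descendant', [:test, nil, @expression]]
  end
end

SketchParser.parse_with_cache('foo')
SketchParser.parse_with_cache('foo')
```

After both calls the counter reads 1, which is the same condition the spec expresses with `.to receive(:parse).once`.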

@@ -3,29 +3,29 @@ require 'spec_helper'
 describe Oga::CSS::Parser do
   describe 'classes' do
     it 'parses a class selector' do
-      parse_css('.foo').should == parse_xpath(
+      expect(parse_css('.foo')).to eq(parse_xpath(
         'descendant::*[contains(concat(" ", @class, " "), " foo ")]'
-      )
+      ))
     end
     it 'parses a selector for an element with a class' do
-      parse_css('foo.bar').should == parse_xpath(
+      expect(parse_css('foo.bar')).to eq(parse_xpath(
         'descendant::foo[contains(concat(" ", @class, " "), " bar ")]'
-      )
+      ))
     end
     it 'parses a selector using multiple classes' do
-      parse_css('.foo.bar').should == parse_xpath(
+      expect(parse_css('.foo.bar')).to eq(parse_xpath(
         'descendant::*[contains(concat(" ", @class, " "), " foo ") ' \
         'and contains(concat(" ", @class, " "), " bar ")]'
-      )
+      ))
     end
     it 'parses a selector using a class and an ID' do
-      parse_css('#foo.bar').should == parse_xpath(
+      expect(parse_css('#foo.bar')).to eq(parse_xpath(
         'descendant::*[@id="foo" and ' \
         'contains(concat(" ", @class, " "), " bar ")]'
-      )
+      ))
     end
   end
 end
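
Note on the class specs above: they expect `.foo` to compile to the predicate `contains(concat(" ", @class, " "), " foo ")`. A minimal sketch in plain Ruby of why both sides are padded with spaces. `class_match?` is a hypothetical helper written for illustration and is not part of Oga.

```ruby
# Hypothetical helper mirroring the XPath predicate
#   contains(concat(" ", @class, " "), " foo ")
# Illustration only; this is not Oga's implementation or API.
def class_match?(class_attr, name)
  # Pad the attribute value so every class name is surrounded by spaces,
  # then search for the space-delimited name as a plain substring.
  " #{class_attr} ".include?(" #{name} ")
end

class_match?('foo bar', 'foo') # matches: "foo" is a whole token
class_match?('foobar', 'foo')  # no match: "foo" is only a prefix here
```

Without the padding, a plain substring test would also accept `foobar`. As written, the predicate only treats literal spaces as separators, so class names separated by tabs or newlines would likely not match.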

@@ -3,24 +3,24 @@ require 'spec_helper'
 describe Oga::CSS::Parser do
   describe 'IDs' do
     it 'parses an ID selector' do
-      parse_css('#foo').should == parse_xpath('descendant::*[@id="foo"]')
+      expect(parse_css('#foo')).to eq(parse_xpath('descendant::*[@id="foo"]'))
     end
     it 'parses a selector for an element with an ID' do
-      parse_css('foo#bar').should == parse_xpath('descendant::foo[@id="bar"]')
+      expect(parse_css('foo#bar')).to eq(parse_xpath('descendant::foo[@id="bar"]'))
     end
     it 'parses a selector using multiple IDs' do
-      parse_css('#foo#bar').should == parse_xpath(
+      expect(parse_css('#foo#bar')).to eq(parse_xpath(
         'descendant::*[@id="foo" and @id="bar"]'
-      )
+      ))
     end
     it 'parses a selector using an ID and a class' do
-      parse_css('.foo#bar').should == parse_xpath(
+      expect(parse_css('.foo#bar')).to eq(parse_xpath(
         'descendant::*[contains(concat(" ", @class, " "), " foo ") ' \
         'and @id="bar"]'
-      )
+      ))
     end
   end
 end

Some files were not shown because too many files have changed in this diff.