From c4a5e8d4b4d71fef5e71919395c72a0e58b1988e Mon Sep 17 00:00:00 2001
From: Yorick Peterse
Date: Mon, 17 Nov 2014 23:23:13 +0100
Subject: [PATCH] Updated the changelog.

---
 doc/changelog.md | 114 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 111 insertions(+), 3 deletions(-)

diff --git a/doc/changelog.md b/doc/changelog.md
index 52d5166..fb3ee06 100644
--- a/doc/changelog.md
+++ b/doc/changelog.md
@@ -3,21 +3,129 @@
 This document contains details of the various releases and their release dates.
 Dates are in the format `yyyy-mm-dd`.
 
-## 0.2.0 - Unreleased
+## 0.2.0 - 2014-11-17
+
+### CSS Selector Support
+
+Probably the biggest feature of this release: support for querying documents
+using CSS selectors. Oga supports a subset of the CSS3 selector specification,
+in particular the following selectors are supported:
+
+* Element, class and ID selectors
+* Attribute selectors (e.g. `foo[x ~= "y"]`)
+
+The following pseudo classes are supported:
+
+* `:root`
+* `:nth-child(n)`
+* `:nth-last-child(n)`
+* `:nth-of-type(n)`
+* `:nth-last-of-type(n)`
+* `:first-child`
+* `:last-child`
+* `:first-of-type`
+* `:last-of-type`
+* `:only-child`
+* `:only-of-type`
+* `:empty`
+
+You can use CSS selectors using the methods `css` and `at_css` on an instance of
+`Oga::XML::Document` or `Oga::XML::Element`. For example:
+
+    document = Oga.parse_xml('<people><person>Alice</person></people>')
+
+    document.css('people person') # => NodeSet(Element(name: "person" ...))
+
+The architecture behind this is quite similar to parsing XPath. There's a lexer
+(`Oga::CSS::Lexer`) and a parser (`Oga::CSS::Parser`). Unlike Nokogiri (and
+perhaps other libraries) the parser _does not_ output XPath expressions as a
+String or a CSS specific AST. Instead it directly emits an XPath AST. This
+allows the resulting AST to be directly evaluated by `Oga::XPath::Evaluator`.
+
+### Multi-line Attribute Support
+
+Oga can now lex/parse elements that have attributes with newlines in them.
+Previously this would trigger memory allocation errors.
+
+See for more information.
+
+### SAX after_element
+
+The `after_element` method in the SAX parsing API now always takes two
+arguments: the namespace name and element name. Previously this method would
+always receive a single nil value as its argument, which is rather pointless.
+
+See for more information.
+
+### XPath Grouping
+
+XPath expressions can now be grouped together using parentheses. This allows one
+to specify a custom operator precedence.
+
+### Enumerator Parsing Input
+
+Enumerator instances can now be used as input for `Oga.parse_xml` and friends.
+This can be used to download and parse XML files on the fly. For example:
+
+    enum = Enumerator.new do |yielder|
+      HTTPClient.get('http://some-website.com/some-big-file.xml') do |chunk|
+        yielder << chunk
+      end
+    end
+
+    document = Oga.parse_xml(enum)
+
+See for more information.
+
+### Removing Attributes
+
+Element attributes can now be removed using `Oga::XML::Element#unset`:
+
+    element = Oga::XML::Element.new(:name => 'foo')
+
+    element.set('class', 'foo')
+    element.unset('class')
+
+### XPath Attributes
+
+XPath predicates are now evaluated for every context node as opposed to being
+evaluated once for the entire context. This ensures that expressions such as
+`descendant-or-self::node()/foo[1]` are evaluated correctly.
+
+### Available Namespaces
+
+When calling `Oga::XML::Element#available_namespaces` the Hash returned by
+`Oga::XML::Element#namespaces` would be modified in place. This was a bug that
+has been fixed in this release.
+
+### NodeSets
+
+NodeSet instances can now be compared with each other using `==`. Previously
+this would always consider two instances to be different from each other due to
+the usage of the default `Object#==` method.
+
+### XML Entities
 
 XML entities such as `&amp;` and `&lt;` are now encoded/decoded by the lexer,
-string and text nodes. See for
-more information.
+string and text nodes.
+
+See for more information.
+
+### General
 
 Source lines are no longer included in error messages generated by the XML
 parser. This simplifies the code and removes the need of re-reading the input
 (in case of IO/Enumerable inputs).
 
+### XML Lexer Newlines
+
 Newlines in the XML lexer are now counted in native code (C/Java). On MRI and
 JRuby the improvement is quite small, but on Rubinius it's a massive
 improvement. See commit `8db77c0a09bf6c996dd2856a6dbe1ad076b1d30a` for more
 information.
 
+### HTML Void Element Performance
+
 Performance for detecting HTML void elements (e.g. `<br>` and `<link>`) has been
 improved by removing String allocations that were not needed.