# Changelog
This document contains details of the various releases and their release dates.
Dates are in the format `yyyy-mm-dd`.
## 0.2.0 - Unreleased
XML entities such as `&amp;` and `&lt;` are now encoded/decoded by the lexer,
string and text nodes. See <https://github.com/YorickPeterse/oga/issues/49> for
more information.

Source lines are no longer included in error messages generated by the XML
parser. This simplifies the code and removes the need to re-read the input
(in case of IO/Enumerable inputs).

Newlines in the XML lexer are now counted in native code (C/Java). On MRI and
JRuby the improvement is quite small, but on Rubinius it's a massive
improvement. See commit `8db77c0a09bf6c996dd2856a6dbe1ad076b1d30a` for more
information.

Performance for detecting HTML void elements (e.g. `<br>` and `<link>`) has been
improved by removing String allocations that were not needed.
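
One way to detect void elements without allocating intermediate strings is to
look names up in a precomputed Set. The sketch below is illustrative only and
is not taken from Oga's source: `VOID_ELEMENTS` and `void_element?` are
hypothetical names, and the element list is a subset.

```ruby
require 'set'

# Hypothetical constant: a subset of the HTML void elements. Both lower-case
# and upper-case spellings are stored up front so that a lookup never has to
# call String#downcase, which would allocate a new String on every check.
names = %w[area base br col embed hr img input link meta param source track wbr]
VOID_ELEMENTS = Set.new(names + names.map(&:upcase)).freeze

# Hypothetical helper: returns true if `name` is a known void element.
def void_element?(name)
  VOID_ELEMENTS.include?(name)
end

void_element?('br')     # => true
void_element?('LINK')   # => true
void_element?('script') # => false
```

The trade-off in this sketch is a slightly larger Set in exchange for a lookup
that creates no new objects. Mixed-case names such as `Br` are not covered.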
## 0.1.3 - 2014-09-24
This release fixes a problem with serializing attributes using the namespace
prefix "xmlns". See <https://github.com/YorickPeterse/oga/issues/47> for more
information.
## 0.1.2 - 2014-09-23
### SAX API
A SAX parser/API has been added. This API is useful when even the memory
overhead of the pull parser is too high. Example:

    class ElementNames
      attr_reader :names

      def initialize
        @names = []
      end

      def on_element(namespace, name, attrs = {})
        @names << name
      end
    end

    handler = ElementNames.new

    Oga.sax_parse_xml(handler, '<foo><bar></bar></foo>')

    handler.names # => ["foo", "bar"]
### Racc Gem
Oga will now always use the Racc gem instead of the version shipped with the
Ruby standard library.
### Error Reporting
XML parser errors are now somewhat more user-friendly, though they can still be
quite cryptic.
### Serializing Elements
Elements serialized to XML/HTML will use self-closing tags whenever possible.
When parsing HTML documents only HTML void elements will use self-closing tags
(e.g. `<link>` tags). Example:

    Oga.parse_xml('<foo></foo>').to_xml # => "<foo />"
    Oga.parse_html('<script></script>').to_xml # => "<script></script>"
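
The rule above can be sketched in plain Ruby. This is an illustration under
assumptions, not Oga's implementation: `serialize`, its `html:` keyword and the
`HTML_VOID` list are hypothetical, and attributes are ignored.

```ruby
# Hypothetical subset of the HTML void elements.
HTML_VOID = %w[br hr img input link meta].freeze

# Hypothetical helper: serializes an element given its name and its already
# serialized body. Attributes are left out to keep the sketch short.
def serialize(name, body = '', html: false)
  # In HTML mode only void elements may self-close; in XML mode any element
  # without content may.
  self_closing = html ? HTML_VOID.include?(name) : body.empty?

  self_closing ? "<#{name} />" : "<#{name}>#{body}</#{name}>"
end

serialize('foo')                # => "<foo />"
serialize('script', html: true) # => "<script></script>"
serialize('link', html: true)   # => "<link />"
```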
### Default Namespaces
Namespaces are no longer removed from the attributes list when an element is
created.
Default XML namespaces can now be registered using `xmlns="..."`. Previously
this would be ignored. Example:

    document = Oga.parse_xml('<root xmlns="baz"></root>')
    root = document.children[0]

    root.namespace # => Namespace(name: "xmlns" uri: "baz")
### Lexing Incomplete Input
Oga can now lex input such as `</` without entering an infinite loop. Example:

    Oga.parse_xml('</') # => Document(children: NodeSet(Text("</")))
### Absolute XPath Paths
Oga can now parse and evaluate the XPath expression "/" (that is, just "/").
This will return the root node (usually a Document instance). Example:

    document = Oga.parse_xml('<root></root>')

    document.xpath('/') # => NodeSet(Document(children: NodeSet(Element(name: "root"))))
### Namespace Ordering
Namespaces available to an element are now returned in the correct order.
Previously outer namespaces would take precedence over inner namespaces, instead
of it being the other way around. Example:

    document = Oga.parse_xml <<-EOF
    <root xmlns:foo="bar">
      <container xmlns:foo="baz">
        <foo:text>Text!</foo:text>
      </container>
    </root>
    EOF

    foo = document.at_xpath('root/container/foo:text')

    foo.namespace # => Namespace(name: "foo" uri: "baz")
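
The lookup order can be illustrated with plain hashes. This is a sketch, not
Oga's code: `available_namespaces` is a hypothetical helper that takes the
namespace declarations of each ancestor, ordered from the outermost element to
the innermost.

```ruby
# Hypothetical helper: merges prefix => URI declarations from the outermost
# element to the innermost one. Hash#merge lets later (inner) declarations
# override earlier (outer) ones that use the same prefix.
def available_namespaces(scopes)
  scopes.reduce({}) { |merged, scope| merged.merge(scope) }
end

scopes = [
  { 'foo' => 'bar' }, # declared on <root>
  { 'foo' => 'baz' }  # declared on <container>
]

available_namespaces(scopes)['foo'] # => "baz"
```

Merging in the opposite order would reproduce the old behaviour, where the
outer declaration wins.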
### Parsing Capitalized HTML Void Elements
Oga is now capable of parsing capitalized HTML void elements (e.g. `<BR>`).
Previously it could only parse lower-cased void elements. Thanks to Tero Tasanen
for fixing this. Example:

    Oga.parse_html('<BR>') # => Document(children: NodeSet(Element(name: "BR")))
### Node Type Method Removed
The `node_type` method has been removed and its purpose has been moved into
the `XML::PullParser` class itself. This method was solely used by the pull
parser to provide shorthands for node classes. As such it doesn't make sense to
expose it as a public method.
## 0.1.1 - 2014-09-13
This release fixes a problem where element attributes were not separated by
spaces. Thanks to Jonathan Rochkind for reporting it and Bill Dueber for
providing an initial patch for this problem.
## 0.1.0 - 2014-09-12
The first public release of Oga. This release contains support for parsing XML,
basic support for parsing HTML, support for querying documents using XPath and
more.