# Migrating From Nokogiri

If you're parsing XML/HTML documents using Ruby, chances are you're using
[Nokogiri][nokogiri] for this. This guide aims to make it easier to switch from
Nokogiri to Oga.

## Parsing Documents

In Nokogiri there are two defacto ways of parsing documents:

* `Nokogiri.XML()` for XML documents
* `Nokogiri.HTML()` for HTML documents

For example, to parse an XML document you'd use the following:

    Nokogiri::XML('<root>foo</root>')

Oga instead uses the following two methods:

* `Oga.parse_xml`
* `Oga.parse_html`

Their usage is similar:

    Oga.parse_xml('<root>foo</root>')

Nokogiri returns two distinctive document classes based on what method was used
to parse a document:

* `Nokogiri::XML::Document` for XML documents
* `Nokogiri::HTML::Document` for HTML documents

Oga on the other hand always returns `Oga::XML::Document` instance, Oga
currently makes no distinction between XML and HTML documents other than on
lexer level. This might change in the future if deemed required.

## Querying Documents

Nokogiri allows one to query documents/elements using both XPath expressions and
CSS selectors. In Nokogiri one queries a document as following:

    document = Nokogiri::XML('<root><foo>bar</foo></root>')

    document.xpath('root/foo')
    document.css('root foo')

Oga currently only supports XPath expressions, CSS selectors will be added in
the near future. Querying documents works similar to Nokogiri:

    document = Oga.parse_xml('<root><foo>bar</foo></root>')

    document.xpath('root/foo')

Nokogiri also allows you to query a document and return the first match, opposed
to an entire node set, using the method `at`. In Nokogiri this method can be
used for both XPath expression and CSS selectors. Oga has no such method,
instead it provides the following more dedicated methods:

* `at_xpath`: returns the first node of an XPath expression

For example:

    document = Oga.parse_xml('<root><foo>bar</foo></root>')

    document.at_xpath('root/foo')

By using a dedicated method Oga doesn't have to try and guess what type of
expression you're using (XPath or CSS), meaning it can never make any mistakes.

## Retrieving Attribute Values

Nokogiri provides two methods for retrieving attributes and attribute values:

* `Nokogiri::XML::Node#attribute`
* `Nokogiri::XML::Node#attr`

The first method always returns an instance of `Nokogiri::XML::Attribute`, the
second method returns the attribute value as a `String`. This behaviour,
especially due to the names used, is extremely confusing.

Oga on the other hand provides the following two methods:

* `Oga::XML::Element#attribute` (aliased as `attr`)
* `Oga::XML::Element#get`

The first method always returns a `Oga::XML::Attribute` instance, the second
returns the attribute value as a `String`. I deliberately chose `get` for
getting a value to remove the confusion of `attribute` vs `attr`. This also
allows for `attr` to simply be an alias of `attribute`.

As an example, this is how you'd get the value of a `class` attribute in
Nokogiri:

    document = Nokogiri::XML('<root class="foo"></root>')

    document.xpath('root').first.attr('class') # => "foo"

This is how you'd get the same value in Oga:

    document = Oga.parse_xml('<root class="foo"></root>')

    document.xpath('root').first.get('class') # => "foo"

## Modifying Documents

Modifying documents in Nokogiri is not as convenient as it perhaps could be. For
example, adding an element to a document is done as following:

    document = Nokogiri::XML('<root></root>')
    root     = document.xpath('root').first

    name = Nokogiri::XML::Element.new('name', document)

    name.inner_html = 'Alice'

    root.add_child(name)

The annoying part here is that we have to pass a document into an Element's
constructor. As such, you can not create elements without first creating a
document. Another thing is that Nokogiri has no method called `inner_text=`,
instead you have to use the method `inner_html=`.

In Oga you'd use the following:

    document = Oga.parse_xml('<root></root>')
    root     = document.xpath('root').first

    name = Oga::XML::Element.new(:name => 'name')

    name.inner_text = 'Alice'

    root.children << name

Adding attributes works similar for both Nokogiri and Oga. For Nokogiri you'd
use the following:

    element.set_attribute('class', 'foo')

Alternatively you can do the following:

    element['class'] = 'foo'

In Oga you'd instead use the method `set`:

    element.set('class', 'foo')

This method automatically creates an attribute if it doesn't exist, including
the namespace if specified:

    element.set('foo:class', 'foo')

## Serializing Documents

Serializing the document back to XML works the same in both libraries, simply
call `to_xml` on a document or element and you'll get a String back containing
the XML. There is one key difference here though: Nokogiri does not return the
exact same output as it was given as input, for example it adds XML declaration
tags:

    Nokogiri::XML('<root></root>').to_xml # => "<?xml version=\"1.0\"?>\n<root/>\n"

Oga on the other hand does not do this:

    Oga.parse_xml('<root></root>').to_xml # => "<root></root>"

Oga also doesn't insert random newlines or other possibly unexpected (or
unwanted) data.

[nokogiri]: http://nokogiri.org/