oga/ext/c
Yorick Peterse 8acc7fc743 Lex CDATA tags in chunks
Instead of using a single token (T_CDATA) for a CDATA tag, the lexer now
uses 3 tokens:

1. T_CDATA_START
2. T_CDATA_BODY
3. T_CDATA_END

The T_CDATA_BODY token can occur multiple times and is turned into a
single value in the XML parser. This is similar to the way strings are
lexed.
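
As a rough illustration of that merging step, the sketch below folds a run of T_CDATA_BODY tokens into one value. It is not Oga's parser code: the `merge_cdata` helper, the `[type, value]` token layout and the `:cdata` result type are all assumptions made for this example.

```ruby
# Hypothetical helper, not part of Oga: collapses the three CDATA token
# types into a single [:cdata, value] pair. Tokens are assumed to be
# [type, value] arrays.
def merge_cdata(tokens)
  merged = []
  buffer = nil

  tokens.each do |type, value|
    case type
    when :T_CDATA_START
      # Start collecting body chunks.
      buffer = String.new
    when :T_CDATA_BODY
      # May occur any number of times between START and END.
      buffer << value
    when :T_CDATA_END
      merged << [:cdata, buffer]
      buffer = nil
    else
      # Any other token passes through untouched.
      merged << [type, value]
    end
  end

  merged
end

tokens = [
  [:T_CDATA_START, nil],
  [:T_CDATA_BODY, "\n"],
  [:T_CDATA_BODY, "foo"],
  [:T_CDATA_END, nil]
]

merge_cdata(tokens) # => [[:cdata, "\nfoo"]]
```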

By changing the way CDATA tags are lexed, Oga can now lex CDATA tags
containing newlines when using an IO as input. For example, this would
previously fail:

    Oga.parse_xml(StringIO.new("<![CDATA[\nfoo]]>"))

Because IO input is read line by line, the lexer would receive the input
in two separate chunks:

    "<![CDATA[\n"
    "foo]]>"
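
To make the chunked approach concrete, here is a minimal sketch of a lexer that keeps its "inside CDATA" state between chunks, so a CDATA tag may span several lines. Oga's real lexer is generated from lexer.rl with Ragel and works differently; the `ChunkedCdataLexer` class, its `lex_chunk` method and the `T_TEXT` token are hypothetical names used only here.

```ruby
require 'stringio'

# Toy lexer (an assumption for illustration, not Oga's implementation).
# It remembers whether it is inside a CDATA tag across calls, so each
# chunk can be lexed on its own.
class ChunkedCdataLexer
  CDATA_START = '<![CDATA['
  CDATA_END   = ']]>'

  def initialize
    @in_cdata = false
  end

  # Lexes one chunk (for example one line read from an IO) and returns
  # an Array of [type, value] tokens.
  #
  # Limitation of this sketch: a delimiter split across two chunks is
  # not detected. Line-based reads never split "<![CDATA[" or "]]>".
  def lex_chunk(chunk)
    tokens = []
    rest   = chunk

    until rest.empty?
      if @in_cdata
        body, delimiter, rest = rest.partition(CDATA_END)

        tokens << [:T_CDATA_BODY, body] unless body.empty?

        unless delimiter.empty?
          tokens << [:T_CDATA_END, nil]
          @in_cdata = false
        end
      else
        text, delimiter, rest = rest.partition(CDATA_START)

        tokens << [:T_TEXT, text] unless text.empty?

        unless delimiter.empty?
          tokens << [:T_CDATA_START, nil]
          @in_cdata = true
        end
      end
    end

    tokens
  end
end

io     = StringIO.new("<![CDATA[\nfoo]]>")
lexer  = ChunkedCdataLexer.new
tokens = io.each_line.flat_map { |line| lexer.lex_chunk(line) }
```

With the two-line input above, the first chunk yields the start token and a body token holding the newline, and the second chunk yields a body token holding "foo" followed by the end token.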

Related issues: #93
2015-04-14 22:45:55 +02:00
extconf.rb  Use RbConfig::CONFIG['CC'] vs 'cc'   2015-03-23 19:46:44 +01:00
lexer.h     Track XML C lexer state in C only.   2014-10-26 11:38:06 +01:00
lexer.rl    Lex CDATA tags in chunks             2015-04-14 22:45:55 +02:00
liboga.c    Don't redefine namespaces in C.      2014-05-07 10:52:06 +02:00
liboga.h    Removed stdioh. #include.            2014-05-11 21:06:55 +02:00