Commit Graph

11 Commits

Author SHA1 Message Date
Yorick Peterse 629dcd3fe6 Support for IO inputs in the lexer.
Using IO/StringIO objects one can parse large XML files without first having to
read the entire file into memory. This can potentially save a lot of memory at
the cost of a slightly slower runtime.

For IO like instances the lexer will consume the input line by line. If a
String is given it's consumed as a whole instead. A small side effect of
reading the input line by line is that text such as "foo\nbar" will be lexed as
two tokens instead of one.

Fixes #19.
2014-05-26 00:30:39 +02:00
Yorick Peterse 6b9d65923a Use a method for getting input in the XML lexer.
Instead of directly accessing the `data` instance variable the C/Java code now
uses the method `read_data`. This is part of one of the various steps required
to allow Oga to read data from IO like instances. It also means I can freely
change the name of the instance variable without also having to change the
C/Java code.
2014-05-21 00:27:23 +02:00
Yorick Peterse 4542f06d0f Replaced fcall/fret with fnext in the XML lexer.
With the rules being cleaned up/moved around a bit we can drop the use of
fcall/fret. This saves the need of having to maintain a stack (position).
2014-05-21 00:08:48 +02:00
Yorick Peterse 31ec76c90a Fixed guard in the lexer header. 2014-05-18 16:51:17 +02:00
Yorick Peterse ad67cd708f Only include debug info when DEBUG is set. 2014-05-15 20:43:48 +02:00
Yorick Peterse 1b58723e7d Removed stdioh. #include.
This header is also not needed.
2014-05-11 21:06:55 +02:00
Yorick Peterse e2b9fc75ca Removed #include for malloc.h
Apparently some OS' move this to malloc/malloc.h. Since it's not needed lets
just get rid of it.
2014-05-11 21:06:02 +02:00
Yorick Peterse ee78b2c382 Don't redefine namespaces in C.
The Oga::XML namespace should be set up by Ruby, not by C.
2014-05-07 10:52:06 +02:00
Yorick Peterse e271298984 Use macros in the C lexer. 2014-05-07 00:57:25 +02:00
Yorick Peterse f25f8a3d15 Break up the Ragel C grammar.
The grammar is now broken up in to a base lexer and a C lexer. This allows the
same grammar to also be used in the Java code.
2014-05-07 00:50:34 +02:00
Yorick Peterse 9abc5c1c92 Separated the Java and C ext codebases. 2014-05-07 00:29:10 +02:00