core/oga - oga

Commit Graph

Author	SHA1	Message	Date
Yorick Peterse	e11b9ed32c	Basic XPath parser setup.	2014-06-01 23:02:28 +02:00
Yorick Peterse	54de2df0c7	Support for lexing XPath wildcard expressions. To support this we need to require whitespace around the "*" operator. This is not ideal but it will do for now.	2014-06-01 23:01:24 +02:00
Yorick Peterse	48bf1a0628	Tweak Gemspec file list a bit. This ensures it also takes files such as "Rakefile" into account when needed.	2014-06-01 22:16:55 +02:00
Yorick Peterse	b7855c124f	XPath lexer test for predicates using axes.	2014-06-01 19:49:37 +02:00
Yorick Peterse	88b06e247d	Added a few more XPath lexer tests.	2014-06-01 19:43:36 +02:00
Yorick Peterse	8dd8d7a519	Basic working XPath lexer. This doesn't lex everything of the XPath specification just yet and needs more tests.	2014-06-01 19:24:35 +02:00
Yorick Peterse	a50b76a2d8	Cleaned up XPath lexer boilerplate a bit.	2014-05-29 19:25:49 +02:00
Yorick Peterse	e0b07332d9	Boilerplate for the XPath lexer.	2014-05-29 19:25:49 +02:00
Yorick Peterse	be3f8fb494	Removed the on_newline XML lexer callback.	2014-05-29 14:21:48 +02:00
Yorick Peterse	ead5c71fee	Cleaned up the XML parser grammar. This resolves all shift/reduce and reduce/reduce conflicts that were previously present.	2014-05-29 01:37:19 +02:00
Yorick Peterse	49780e2b04	Fix for useless XML parser rules. Something tells me that using : and | in your syntax might not be the best decision.	2014-05-28 21:36:06 +02:00
Yorick Peterse	28edc7726f	Rewind IO input upon resetting the lexer.	2014-05-26 00:33:20 +02:00
Yorick Peterse	c81c6db74e	Benchmarks/profilers for IO inputs in the lexer.	2014-05-26 00:31:15 +02:00
Yorick Peterse	629dcd3fe6	Support for IO inputs in the lexer. Using IO/StringIO objects one can parse large XML files without first having to read the entire file into memory. This can potentially save a lot of memory at the cost of a slightly slower runtime. For IO like instances the lexer will consume the input line by line. If a String is given it's consumed as a whole instead. A small side effect of reading the input line by line is that text such as "foo\nbar" will be lexed as two tokens instead of one. Fixes #19.	2014-05-26 00:30:39 +02:00
Yorick Peterse	6b9d65923a	Use a method for getting input in the XML lexer. Instead of directly accessing the `data` instance variable the C/Java code now uses the method `read_data`. This is part of one of the various steps required to allow Oga to read data from IO like instances. It also means I can freely change the name of the instance variable without also having to change the C/Java code.	2014-05-21 00:27:23 +02:00
Yorick Peterse	418b4ef498	Cleaned up documentation of the XML lexer.	2014-05-21 00:21:21 +02:00
Yorick Peterse	3a8582030d	Removed remaining fhold call in the XML lexer. There's no particular need any more for this fhold call so we're getting rid of it.	2014-05-21 00:11:39 +02:00
Yorick Peterse	4542f06d0f	Replaced fcall/fret with fnext in the XML lexer. With the rules being cleaned up/moved around a bit we can drop the use of fcall/fret. This saves the need of having to maintain a stack (position).	2014-05-21 00:08:48 +02:00
Yorick Peterse	c56b0395e4	Moved various rules around for the XML lexer. This moves the element related rules to the element_head machine (where they belong). This in turn makes it possible to lex ">" as a text node, previously this was impossible.	2014-05-21 00:04:53 +02:00
Yorick Peterse	feaf28d423	Remove dedicated string machine in the XML lexer. This removes the need for another fcall/fret combination.	2014-05-19 20:26:07 +02:00
Yorick Peterse	93b9718406	Cleaned up the XML lexer documentation.	2014-05-19 09:39:35 +02:00
Yorick Peterse	cd0f3380c4	Merge multiple CDATA tokens into a single token. The tokens T_CDATA_START, T_TEXT and T_CDATA_END have been merged together into T_CDATA.	2014-05-19 09:36:19 +02:00
Yorick Peterse	a4fb5c1299	Merge multiple comment tokens into a single one. The tokens T_COMMENT_START, T_TEXT and T_COMMENT_END have been merged into a single token: T_COMMENT. This simplifies both the lexer and the parser.	2014-05-19 09:30:30 +02:00
Yorick Peterse	c891dd88cb	Removed useless code from the XML parser.	2014-05-18 23:30:26 +02:00
Yorick Peterse	31ec76c90a	Fixed guard in the lexer header.	2014-05-18 16:51:17 +02:00
Yorick Peterse	81a81f0ab0	Don't create Arrays when not needed.	2014-05-16 17:05:42 +02:00
Yorick Peterse	854936f30b	Added average benchmarks for the parser.	2014-05-16 16:38:27 +02:00
Yorick Peterse	ad67cd708f	Only include debug info when DEBUG is set.	2014-05-15 20:43:48 +02:00
Yorick Peterse	fd2f727183	Only set explicit ivars in the lexer.	2014-05-15 19:48:18 +02:00
Yorick Peterse	44bf1dd1ca	Split up handling of element names/namespaces. This is now split up on Ragel level, simplifying the corresponding Ruby code.	2014-05-15 10:22:05 +02:00
Yorick Peterse	723a273e4f	Enforce symbols for element attributes. This comes with a little bit of memory overhead but this should be minor in most cases.	2014-05-15 01:04:26 +02:00
Yorick Peterse	f4b9bbd4ac	Removed lazy way of setting instance variables. This process is quite a bit slower compared to setting instance variables directly.	2014-05-15 00:43:13 +02:00
Yorick Peterse	043ea9a366	Fall back to ps in the profiler. If the /proc filesystem doesn't exist we'll fall back to using the `ps` shell command.	2014-05-11 21:15:33 +02:00
Yorick Peterse	1b58723e7d	Removed stdio.h #include. This header is also not needed.	2014-05-11 21:06:55 +02:00
Yorick Peterse	e2b9fc75ca	Removed #include for malloc.h. Apparently some OS' move this to malloc/malloc.h. Since it's not needed let's just get rid of it.	2014-05-11 21:06:02 +02:00
Yorick Peterse	ba3d96c819	Re-build lexers when base_lexer.rl changes. Thanks to @avdi for bringing up on how to do this when using rule() blocks.	2014-05-10 00:28:23 +02:00
Yorick Peterse	19f04f98f7	Support for lexing/parsing inline doctypes.	2014-05-10 00:28:11 +02:00
Yorick Peterse	a92023fe94	Removed outdated paragraph from the README. Ironically Oga now uses native extensions for the lexer.	2014-05-09 00:34:25 +02:00
Yorick Peterse	a8bf6be00e	Added a contributing guide.	2014-05-09 00:32:44 +02:00
Yorick Peterse	2dd5d996c4	Travis: don't notify for every failure.	2014-05-08 10:20:35 +02:00
Yorick Peterse	c472ceac6f	Docs for the shared Ragel grammar.	2014-05-08 00:21:23 +02:00
Yorick Peterse	98db796205	Updated editor configuration.	2014-05-08 00:17:12 +02:00
Yorick Peterse	51c1f3c32d	Updated the README.	2014-05-08 00:15:54 +02:00
Yorick Peterse	fe74d60138	Manually bootstrap JRuby after all. After discussing this with @headius I've decided to do this the manual way anyway. Apparently the basic load service stuff is deprecated and not very reliable.	2014-05-07 22:32:34 +02:00
Yorick Peterse	90fabe3f21	Compile when running `rake generate`.	2014-05-07 20:07:31 +02:00
Yorick Peterse	3c621bf22e	Removed the manifest file + task. Using a Dir.glob() is much easier when dealing with a bunch of generated files.	2014-05-07 11:11:29 +02:00
Yorick Peterse	ee78b2c382	Don't redefine namespaces in C. The Oga::XML namespace should be set up by Ruby, not by C.	2014-05-07 10:52:06 +02:00
Yorick Peterse	bbdc7966db	Documentation for the JRuby extension.	2014-05-07 10:24:24 +02:00
Yorick Peterse	3afef5f7cc	Lexer support for JRuby. JRuby now passes all tests. Benchmark wise it completes the big XML benchmark in about 500-600 milliseconds.	2014-05-07 09:40:22 +02:00
Yorick Peterse	b9a4038e42	Callback boilerplate for the Java lexer.	2014-05-07 01:01:24 +02:00

1205 Commits