Roadmap for API Changes

overhaul serialize/pretty printing API

overhaul and optimize the SAX parsing

  • see fairy wing throwdown - SAX parsing is wicked slow.

Node should not be Enumerable, and it should have a better attributes API

improve CSS query parsing


Better Syntax for custom XPath function handler

Better Syntax around Node#xpath and NodeSet#xpath

  • look at those methods, and use of Node#extract_params in Node#{css,search}

  • we should standardize on a hash of options for these and other calls

  • what should NodeSet#xpath return?
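As a rough illustration of the "hash of options" idea above, here is a minimal plain-Ruby sketch. The helper name `normalize_xpath_options` and the option keys `:namespaces`, `:variables` and `:handler` are assumptions made up for this example; they are not an existing Nokogiri API.

```ruby
# Hypothetical helper, not part of Nokogiri: accept one options hash instead
# of guessing the meaning of positional arguments from their types.
def normalize_xpath_options(options = {})
  # Assumed option names, for illustration only.
  defaults = { namespaces: {}, variables: {}, handler: nil }

  # Reject unknown keys early so typos do not pass silently.
  unknown = options.keys - defaults.keys
  raise ArgumentError, "unknown options: #{unknown.inspect}" unless unknown.empty?

  defaults.merge(options)
end

opts = normalize_xpath_options(namespaces: { "atom" => "http://www.w3.org/2005/Atom" })
```

Node#xpath, NodeSet#xpath, Node#css and Node#search could then share one normalization step, which would keep their signatures consistent.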



We have a lot of issues open around encoding. How bad are things? Somebody who knows encoding well should head this up.


It's fundamentally broken, in that we can't stop people from crashing their application if they want to use object references unsafely.

Class methods that require Document

There are a few class methods that require a Document object.

We should probably add Document instance methods that wrap these calls, since requiring a Document is a non-obvious expectation and thus fails as a convention.

So, instead, let's make alternative methods like Nokogiri::XML::Document#new_comment, and recommend those as the proper convention.
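A minimal sketch of that convention follows, using stand-in classes under a `Sketch` module rather than Nokogiri itself. The `Sketch::Comment` constructor signature (document first, then content) is an assumption for this illustration, and `new_comment` is only the method proposed above, not a confirmed API.

```ruby
# Stand-in classes for illustration only; this is NOT Nokogiri's implementation.
module Sketch
  class Document
    # Proposed convention: create the comment through the document, so the
    # caller never has to know that the class method wants a Document.
    def new_comment(text)
      Comment.new(self, text)
    end
  end

  class Comment
    attr_reader :document, :content

    # Models the non-obvious expectation: a Document must be passed in.
    def initialize(document, content)
      unless document.is_a?(Document)
        raise ArgumentError, "first argument must be a Document"
      end
      @document = document
      @content = content
    end
  end
end

doc = Sketch::Document.new
comment = doc.new_comment("hello")
```

The class method stays available, but callers who use the instance method cannot get the argument order or type wrong.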

collect_namespaces is just broken

collect_namespaces returns a hash, so it can't return more than one namespace with the same prefix. See this issue for background:

Do we care? This seems like a useless method, but then again I hate XML, so what do I know?
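For what it's worth, the prefix collision can be shown in plain Ruby without touching Nokogiri. The declaration list and the `xmlns:`-style keys below are assumed sample data, and the grouped shape is just one possible alternative, not a decided design.

```ruby
# Sample namespace declarations, as they might appear at different depths of
# one document. The prefix "a" is bound to two different URIs.
declarations = [
  ["xmlns:a", "http://example.com/one"],
  ["xmlns:b", "http://example.com/two"],
  ["xmlns:a", "http://example.com/three"],
]

# Hash-style collection: a later declaration silently overwrites an earlier
# one with the same prefix, so one namespace is lost.
as_hash = declarations.each_with_object({}) do |(prefix, uri), result|
  result[prefix] = uri
end

# One possible alternative: group every URI under its prefix.
grouped = declarations
  .group_by(&:first)
  .transform_values { |pairs| pairs.map(&:last) }
```

Here `as_hash` keeps only two entries, while `grouped` retains both URIs bound to `xmlns:a`.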