I announced the upcoming release of the W3C Mozilla DOM Connector in one of my previous posts, and now it has finally arrived. You can view it at
http://svn.rubyrailways.com/W3CConnector/
or check it out with svn:
svn co http://svn.rubyrailways.com/W3CConnector/
For a description of the connector, please refer to my previous post. If you would like to try it out, here is how:
// This code snippet gives you a DOM document of the currently loaded page:
nsIWebBrowser brow = getWebBrowser();
nsIWebNavigation nav = (nsIWebNavigation) brow.queryInterface(nsIWebNavigation.NS_IWEBNAVIGATION_IID);
nsIDOMDocument doc = (nsIDOMDocument) nav.getDocument();
Document mozDoc = (Document) org.mozilla.dom.NodeFactory.getNodeInstance(doc);
From now on, you can use any existing Java library that works on W3C DOM documents, such as an XPath engine (Saxon, Xalan, or whatever you prefer), directly on Mozilla documents.
This gives you tremendous power compared to tools like RubyfulSoup or Mechanize, which are outstanding in their category but still limited. The difference comes from XPath, which lets you query the document tree with arbitrary expressions.
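To make the XPath point concrete, here is a minimal sketch that pulls all link targets out of a W3C DOM Document using the XPath API bundled with the JDK (javax.xml.xpath). The class name XPathOnDom and its helper methods are made up for illustration, and the document is parsed from a small well-formed string so the example runs on its own. With the connector you would presumably pass in mozDoc instead, though that has not been tested here, and element name case may differ on real Mozilla HTML documents.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathOnDom {

    // Stand-in for the connector (assumption): builds a W3C Document
    // from a well-formed markup string instead of a live Mozilla page.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Collects the value of every href attribute found on <a> elements.
    static List<String> hrefs(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("//a/@href", doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<html><body><a href='a.html'>A</a><a href='b.html'>B</a></body></html>");
        System.out.println(hrefs(doc));
    }
}
```

The same one-line expression would need a hand-written tree walk in a scraping library without XPath support, which is the gap the connector is meant to close.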
A simple example, dumping the DOM of the HTML document to stdout:
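import java.io.IOException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;

// Serializes the given node (and its subtree) to stdout as UTF-8.
public static void writeDOM(Node n) throws IOException {
    try {
        StreamResult sr = new StreamResult(System.out);
        TransformerFactory trf = TransformerFactory.newInstance();
        Transformer tr = trf.newTransformer();
        tr.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        tr.transform(new DOMSource(n), sr);
    } catch (TransformerException e) {
        // Keep the original exception as the cause instead of discarding it.
        throw new IOException(e);
    }
}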
Cool, isn’t it?
At the moment, I am discussing various integration issues with the Mozilla guys, since the connector should become part of Mozilla and the Eclipse editor in the future.