html5plus is a fork of html5lib that can also parse XML documents. For a pure HTML5 parser, please use html5lib instead.

Differences to html5lib

Basically, html5plus is almost exactly the same as html5lib, except that it can also parse simple XML documents:

  • As in XML, self-closing tags, such as <div/>, are handled as leaf nodes (this is the only reason this fork exists). For example, html5plus interprets <div/><span/> as two sibling elements: an empty <div> followed by an empty <span>. html5lib and many browsers, on the other hand, ignore the trailing slash and nest the <span> inside the <div>.
  • Supports processing instructions (a pull request was sent to html5lib).
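  • HtmlParser has an additional flag called cdataOK. It controls whether CDATA sections are always accepted, regardless of the namespace (HTML5 parsing otherwise recognizes CDATA sections only in foreign content such as SVG and MathML).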
  • Supports line number information (Node.lineNumber). Note that it is not available on Text nodes and that it breaks compatibility with dart:html.
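
The self-closing behaviour in the first bullet above can be illustrated with a small standalone sketch. It is a toy model and not html5plus or html5lib code: the `Elem` class, the `buildTree` function, and its `xmlStyle` flag are hypothetical names introduced only for this illustration, and the input is assumed to be pre-split into tag tokens.

```dart
// Toy model only: this is NOT html5plus or html5lib code.
// It contrasts two ways of treating a self-closing tag such as <div/>.

/// A minimal element node (hypothetical type, for illustration).
class Elem {
  final String name;
  final List<Elem> children = [];
  Elem(this.name);

  @override
  String toString() => '<$name>${children.join()}</$name>';
}

/// Builds a tree from tokens such as '<div>', '<div/>' and '</div>' and
/// returns it serialized as a string.
/// When [xmlStyle] is true, '<x/>' becomes a leaf. Otherwise the trailing
/// slash is ignored and '<x/>' opens an element, which is how HTML5 treats
/// non-void elements.
String buildTree(List<String> tokens, {required bool xmlStyle}) {
  final root = Elem('root');
  final stack = <Elem>[root];
  for (final token in tokens) {
    if (token.startsWith('</')) {
      // End tag: close the current element, but never pop the root.
      if (stack.length > 1) stack.removeLast();
      continue;
    }
    final selfClosing = token.endsWith('/>');
    final name = token.substring(1, token.length - (selfClosing ? 2 : 1));
    final elem = Elem(name);
    stack.last.children.add(elem);
    // A leaf is not pushed, so the next tag becomes its sibling.
    if (!(selfClosing && xmlStyle)) stack.add(elem);
  }
  return root.children.join();
}

void main() {
  const tokens = ['<div/>', '<span/>'];
  print(buildTree(tokens, xmlStyle: true)); // prints <div></div><span></span>
  print(buildTree(tokens, xmlStyle: false)); // prints <div><span></span></div>
}
```

The only difference between the two modes is whether a self-closing element is pushed onto the stack of open elements, which is what decides whether the next tag becomes its sibling or its child.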


Installation

Add this to your pubspec.yaml (or create it):

  dependencies:
    html5plus: any


###Parsing HTML is easy!

import 'package:html5plus/parser.dart' show parse;
import 'package:html5plus/dom.dart';

main() {
  var document = parse(
      '<body>Hello world! <a href="">HTML5 rocks!');
  // Print the parsed tree serialized back to HTML.
  print(document.outerHtml);
}

###Parsing XML

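import 'package:html5plus/parser.dart' show HtmlParser;
import 'package:html5plus/dom.dart';

main() {
  // Assumption: the call that feeds the markup to the parser is missing from
  // this example; parseFragment (html5lib-style) is used here as a stand-in.
  var document = new HtmlParser(lowercaseElementName: false,
    lowercaseAttrName: false, cdataOK: true)
    .parseFragment('''
      <?process this?>
      <foo>Hello world! <important>XML rocks!</important>
        <![CDATA[ here & there ]]>
      </foo>
      ''');

  // Print each top-level node of the parsed fragment.
  for (final node in document.nodes)
    print(node);
}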



A simple tree API that results from parsing HTML. It is intended to be compatible with dart:html, but right now it resembles the classic JS DOM.


This library contains extra APIs that aren't in the DOM, but are useful when interacting with the parse tree.


This library has a parser for HTML5 documents that lets you parse HTML easily from a script or server-side application.


This library adds dart:io support to the HTML5 parser. Call initDartIOSupport before calling the parse methods and they will accept a RandomAccessFile as input, in addition to the other input types.