xmlhtml-0.1.7: XML parser and renderer with HTML 5 quirks mode

Safe Haskell: Safe-Inferred

Text.XmlHtml.HTML.Parse

Documentation

docFragment :: Encoding -> Parser Document

HTML version of the document fragment parsing rule. It differs only in that it parses the HTML version of content and returns an HtmlDocument.

docTypeDecl :: Parser DocType

The internal subset is parsed but then ignored, since we don't have data types to store it.

data ElemResult

When parsing an element, three things can happen (besides failure):

  1. The end tag matches the start tag. This is a Matched.
  2. The end tag does not match, but the element has an end tag that can be omitted when there is no more content in its parent. This is an ImplicitLast. In this case, we need to remember the tag name of the end tag that we did find, so as to match it later.
  3. A start tag is found such that it implicitly ends the current element. This is an ImplicitNext. In this case, we parse and remember the entire element that comes next, so that it can be inserted after the element being parsed.
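
The three outcomes above can be modelled as a small sum type. The following is a minimal sketch, not the library's definition: the constructor fields, the Terminator type, the classify function, and the two lookup tables are all hypothetical and chosen for illustration. The tables cover only a small subset of the HTML 5 rules for omitted end tags.

```haskell
import Data.Char (toLower)

-- Hypothetical: what the parser found when the content of an element ended.
data Terminator
    = EndTag String                      -- an end tag with this name
    | StartTag String [(String, String)] -- a start tag: name and attributes
    deriving (Eq, Show)

-- Sketch of the three outcomes; the field layout is an assumption.
data ElemResult
    = Matched                               -- end tag matched the start tag
    | ImplicitLast String                   -- name of the end tag actually found
    | ImplicitNext String [(String, String)] -- start tag that ended the element
    deriving (Eq, Show)

-- Tag names are compared case-insensitively, as in HTML.
lower :: String -> String
lower = map toLower

-- Illustrative subset: end tag may be omitted when the parent has no more content.
endOmittableLast :: [String]
endOmittableLast = ["li", "p", "td", "tr", "option"]

-- Illustrative subset: (current element, following start tag) pairs where
-- the following start tag implicitly ends the current element.
endOmittableNext :: [(String, String)]
endOmittableNext =
    [ ("li", "li"), ("p", "p"), ("p", "div")
    , ("td", "td"), ("tr", "tr"), ("option", "option") ]

-- Decide the outcome for an open element; Nothing stands for parse failure
-- in this toy model.
classify :: String -> Terminator -> Maybe ElemResult
classify open (EndTag close)
    | lower open == lower close          = Just Matched
    | lower open `elem` endOmittableLast = Just (ImplicitLast close)
    | otherwise                          = Nothing
classify open (StartTag next attrs)
    | (lower open, lower next) `elem` endOmittableNext = Just (ImplicitNext next attrs)
    | otherwise                                        = Nothing
```

In this model, a caller that receives ImplicitLast still has to match the remembered end tag name against an enclosing element, and a caller that receives ImplicitNext has to continue with the remembered start tag as the next sibling.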

isControlChar :: Char -> Bool

From section 8.2.2.3 of the HTML 5 spec, omitting the very high control characters because they are unlikely to occur and I got tired of typing.
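
A predicate of this kind might look like the sketch below. It is not the library's source. The name isControlCharSketch is hypothetical, and the ranges are an assumption based on the characters the HTML 5 draft flags as parse errors in the input stream, leaving out the noncharacters above the Basic Multilingual Plane. Which code points the real isControlChar covers is not stated on this page.

```haskell
-- Hypothetical sketch of a control-character test; the exact set of
-- code points is an assumption, not taken from the library.
isControlCharSketch :: Char -> Bool
isControlCharSketch c =
       inRange '\x0001' '\x0008'
    || c == '\x000B'
    || inRange '\x000E' '\x001F'
    || inRange '\x007F' '\x009F'
    || inRange '\xFDD0' '\xFDEF'
    || c == '\xFFFE'
    || c == '\xFFFF'
  where
    -- Inclusive range check on the character under test.
    inRange lo hi = c >= lo && c <= hi
```

Note that tab, line feed, form feed, and carriage return fall outside these ranges, so ordinary whitespace is not treated as a control character here.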