Gets an IHtmlDocument by parsing the content of an IDocument.
- Namespace: Statiq.Common
- Containing Type: IDocumentHtmlExtensions
Syntax
public static Task<IHtmlDocument> ParseHtmlAsync(this IDocument document, bool clone = true)
Remarks
The default HtmlParser has HtmlParserOptions.IsNotConsumingCharacterReferences set to true so that character references are not decoded when parsing. This matters when passing encoded characters like @ through to engines like Razor, which require them to stay encoded when they are meant literally.
This has the unfortunate side effect of triggering double-encoding on serialization; see https://github.com/AngleSharp/AngleSharp/issues/396#issuecomment-246106539.
To avoid that, use StatiqMarkupFormatter or one of the extensions from IMarkupFormattableExtensions or IElementExtensions whenever an IHtmlDocument obtained from this method needs to be serialized.
Parameters
Name | Type | Description |
---|---|---|
document | IDocument | The document to parse. |
clone | bool | Set to true if the result might be modified, or false if it will only be read. When true, the resulting HTML document (whether found in the cache or freshly parsed) is cloned before it is returned. If the HTML document is cloned, use IExecutionContext.GetContentProvider(IHtmlDocument) to get an updated content provider for the mutated HTML document and to update the internal HTML document cache with the new content provider and HTML content. |
Return Value
Type | Description |
---|---|
Task<IHtmlDocument> | The parsed HTML document. |
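Example
The clone guidance in the Parameters table can be sketched roughly as follows. This is a minimal, untested illustration rather than code taken from Statiq itself. The HtmlEditing class and AddLazyLoadingAsync method are hypothetical names, the null check is purely defensive, and the sketch assumes that IDocument exposes a Clone overload that accepts the content provider returned by IExecutionContext.GetContentProvider(IHtmlDocument).

```csharp
using System.Threading.Tasks;
using AngleSharp.Dom;
using AngleSharp.Html.Dom;
using Statiq.Common;

// Hypothetical helper, not part of Statiq.
public static class HtmlEditing
{
    public static async Task<IDocument> AddLazyLoadingAsync(
        IDocument input,
        IExecutionContext context)
    {
        // clone: true because the parsed HTML is modified below.
        IHtmlDocument html = await input.ParseHtmlAsync(clone: true);
        if (html is null)
        {
            // Defensive: return the input unchanged if nothing was parsed.
            return input;
        }

        // Mutate the cloned DOM using standard AngleSharp calls.
        foreach (IElement img in html.QuerySelectorAll("img"))
        {
            img.SetAttribute("loading", "lazy");
        }

        // Get a content provider for the mutated HTML document, as the
        // clone parameter description recommends, and attach it to a new
        // document (assumes a Clone overload taking a content provider).
        return input.Clone(context.GetContentProvider(html));
    }
}
```

For read-only work, such as collecting link targets, passing clone: false avoids the copy, and no new content provider is needed.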