Package org.jsoup.helper
Class W3CDom
java.lang.Object
    org.jsoup.helper.W3CDom
public class W3CDom extends java.lang.Object
Helper class to transform a jsoup Document to an org.w3c.dom.Document, for integration with toolsets that use the W3C DOM.
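As a rough illustration of the transformation described above, the following sketch parses HTML with jsoup, converts it with the static convert method, and serializes the result with asString and the OutputHtml() defaults. The class name W3CDomQuickStart and the sample markup are made up for this example; treat it as a starting point rather than a canonical usage pattern.

```java
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;
import org.jsoup.nodes.Document;

// Hypothetical demo class; only the W3CDom and Jsoup calls come from the library.
class W3CDomQuickStart {
    /** Parses HTML with jsoup, converts it to a W3C DOM, and serializes it back to a String. */
    static String roundTrip(String html) {
        Document jsoupDoc = Jsoup.parse(html);
        // The W3C type is fully qualified to avoid a clash with jsoup's Document.
        org.w3c.dom.Document w3cDoc = W3CDom.convert(jsoupDoc);
        return W3CDom.asString(w3cDoc, W3CDom.OutputHtml());
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<p>Hello <b>W3C</b> DOM</p>"));
    }
}
```

From here the W3C Document can be handed to any tool that expects org.w3c.dom types, such as an XSLT transformer or a schema validator.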
-
-
Nested Class Summary
Modifier and Type | Class | Description
protected static class | W3CDom.W3CBuilder | Implements the conversion by walking the input.
-
Field Summary
Modifier and Type | Field | Description
private static java.lang.String | ContextNodeProperty |
private static java.lang.String | ContextProperty |
protected javax.xml.parsers.DocumentBuilderFactory | factory |
private boolean | namespaceAware |
static java.lang.String | SourceProperty | For W3C Documents created by this class, this property is set on each node to link back to the original jsoup node.
static java.lang.String | XPathFactoryProperty | To get support for XPath versions > 1, set this property to the classname of an alternate XPathFactory implementation.
-
Constructor Summary
W3CDom()
-
Method Summary
Modifier and Type | Method | Description
java.lang.String | asString(org.w3c.dom.Document doc) | Serialize a W3C document to a String.
static java.lang.String | asString(org.w3c.dom.Document doc, java.util.Map<java.lang.String,java.lang.String> properties) | Serialize a W3C document to a String.
org.w3c.dom.Node | contextNode(org.w3c.dom.Document wDoc) | For a Document created by fromJsoup(org.jsoup.nodes.Element), retrieves the W3C context node.
static org.w3c.dom.Document | convert(Document in) | Converts a jsoup DOM to a W3C DOM.
void | convert(Document in, org.w3c.dom.Document out) | Converts a jsoup document into the provided W3C Document.
void | convert(Element in, org.w3c.dom.Document out) | Converts a jsoup element into the provided W3C Document.
org.w3c.dom.Document | fromJsoup(Document in) | Convert a jsoup Document to a W3C Document.
org.w3c.dom.Document | fromJsoup(Element in) | Convert a jsoup DOM to a W3C Document.
private static java.util.HashMap<java.lang.String,java.lang.String> | methodMap(java.lang.String method) |
boolean | namespaceAware() | Returns if this W3C DOM is namespace aware.
W3CDom | namespaceAware(boolean namespaceAware) | Update the namespace aware setting.
static java.util.HashMap<java.lang.String,java.lang.String> | OutputHtml() | Canned default for HTML output.
static java.util.HashMap<java.lang.String,java.lang.String> | OutputXml() | Canned default for XML output.
(package private) static java.util.Properties | propertiesFromMap(java.util.Map<java.lang.String,java.lang.String> map) |
org.w3c.dom.NodeList | selectXpath(java.lang.String xpath, org.w3c.dom.Document doc) | Evaluate an XPath query against the supplied document, and return the results.
org.w3c.dom.NodeList | selectXpath(java.lang.String xpath, org.w3c.dom.Node contextNode) | Evaluate an XPath query against the supplied context node, and return the results.
<T extends Node> java.util.List<T> | sourceNodes(org.w3c.dom.NodeList nodeList, java.lang.Class<T> nodeType) | Retrieves the original jsoup DOM nodes from a nodelist created by this convertor.
-
-
-
Field Detail
-
SourceProperty
public static final java.lang.String SourceProperty
For W3C Documents created by this class, this property is set on each node to link back to the original jsoup node.
- See Also:
- Constant Field Values
-
ContextProperty
private static final java.lang.String ContextProperty
- See Also:
- Constant Field Values
-
ContextNodeProperty
private static final java.lang.String ContextNodeProperty
- See Also:
- Constant Field Values
-
XPathFactoryProperty
public static final java.lang.String XPathFactoryProperty
To get support for XPath versions > 1, set this property to the classname of an alternate XPathFactory implementation (for example, net.sf.saxon.xpath.XPathFactoryImpl).
- See Also:
- Constant Field Values
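A minimal sketch of setting this property follows. It assumes the property is read as a JVM system property (the text above does not say how it is set) and that the alternate implementation, for instance Saxon-HE, is on the classpath. The helper enableAlternateFactory is a made-up name for this example; it checks that the class can be loaded before pointing jsoup at it.

```java
import org.jsoup.helper.W3CDom;

// Hypothetical helper class, not part of jsoup.
class XPathFactoryConfig {
    /**
     * Points W3CDom at an alternate XPathFactory if that class is loadable.
     * Assumption: the property is consulted via System.getProperty.
     *
     * @return true if the property was set, false if the class was not found
     */
    static boolean enableAlternateFactory(String factoryClassName) {
        try {
            Class.forName(factoryClassName);
        } catch (ClassNotFoundException e) {
            return false; // leave the default XPath 1.0 factory in place
        }
        System.setProperty(W3CDom.XPathFactoryProperty, factoryClassName);
        return true;
    }

    public static void main(String[] args) {
        boolean enabled = enableAlternateFactory("net.sf.saxon.xpath.XPathFactoryImpl");
        System.out.println("Alternate XPathFactory enabled: " + enabled);
    }
}
```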
-
factory
protected javax.xml.parsers.DocumentBuilderFactory factory
-
namespaceAware
private boolean namespaceAware
-
-
Method Detail
-
namespaceAware
public boolean namespaceAware()
Returns whether this W3C DOM is namespace aware. By default, this will be true, but it is disabled for simplicity when using XPath selectors in Element.selectXpath(String).
- Returns:
- the current namespace aware setting.
-
namespaceAware
public W3CDom namespaceAware(boolean namespaceAware)
Update the namespace aware setting. This impacts the factory that is used to create W3C nodes from jsoup nodes.
For HTML documents, controls if the document will be in the default http://www.w3.org/1999/xhtml namespace if otherwise unset.
- Parameters:
namespaceAware - the updated setting
- Returns:
- this W3CDom, for chaining.
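The practical effect of this setting shows up in XPath queries. The sketch below, under the assumption that the default XPath 1.0 factory is in use, counts paragraph elements in both modes: with namespace awareness off, a plain //p step is enough, while in the default namespace-aware mode the query matches on local-name() instead. The class and method names are invented for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;
import org.jsoup.nodes.Document;
import org.w3c.dom.NodeList;

// Hypothetical demo class illustrating the namespaceAware(boolean) setting.
class NamespaceAwareDemo {
    /** Converts a small fixed document with the given setting and counts XPath matches. */
    static int countMatches(boolean namespaceAware, String xpath) {
        Document jsoupDoc = Jsoup.parse("<p>One</p><p>Two</p>");
        W3CDom w3c = new W3CDom().namespaceAware(namespaceAware); // chained setter
        org.w3c.dom.Document w3cDoc = w3c.fromJsoup(jsoupDoc);
        NodeList matches = w3c.selectXpath(xpath, w3cDoc);
        return matches.getLength();
    }

    public static void main(String[] args) {
        // Namespace awareness off: element names can be used directly.
        System.out.println(countMatches(false, "//p"));
        // Namespace awareness on (the default): match on the local name.
        System.out.println(countMatches(true, "//*[local-name()='p']"));
    }
}
```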
-
convert
public static org.w3c.dom.Document convert(Document in)
Converts a jsoup DOM to a W3C DOM.
- Parameters:
in - jsoup Document
- Returns:
- W3C Document
-
asString
public static java.lang.String asString(org.w3c.dom.Document doc, java.util.Map<java.lang.String,java.lang.String> properties)
Serialize a W3C document to a String. Provide Properties to define output settings, including whether the output is HTML or XML. If you don't provide the properties (null), the output will be auto-detected based on the content of the document.
- Parameters:
doc - Document
properties - (optional/nullable) the output properties to use. See Transformer.setOutputProperties(Properties) and OutputKeys
- Returns:
- Document as string
- See Also:
OutputHtml(), OutputXml(), OutputKeys.ENCODING, OutputKeys.OMIT_XML_DECLARATION, OutputKeys.STANDALONE, OutputKeys.DOCTYPE_PUBLIC, OutputKeys.CDATA_SECTION_ELEMENTS, OutputKeys.INDENT, OutputKeys.MEDIA_TYPE
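One way to use the properties map, sketched here with an invented class name, is to start from the OutputXml() defaults and add standard OutputKeys entries before serializing. The two keys chosen (OMIT_XML_DECLARATION and INDENT) are only examples; how indentation is rendered depends on the underlying Transformer implementation.

```java
import java.util.Map;
import javax.xml.transform.OutputKeys;
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;

// Hypothetical demo class for asString(Document, Map).
class SerializeWithProperties {
    /** Converts HTML to a W3C DOM and serializes it as XML without the XML declaration. */
    static String toXmlFragment(String html) {
        org.w3c.dom.Document w3cDoc = W3CDom.convert(Jsoup.parse(html));
        // Start from the canned XML defaults, then layer on extra output keys.
        Map<String, String> properties = W3CDom.OutputXml();
        properties.put(OutputKeys.OMIT_XML_DECLARATION, "yes");
        properties.put(OutputKeys.INDENT, "yes");
        return W3CDom.asString(w3cDoc, properties);
    }

    public static void main(String[] args) {
        System.out.println(toXmlFragment("<p>Hello</p>"));
    }
}
```

Passing null instead of a map leaves the choice between HTML and XML output to the auto-detection described above.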
-
propertiesFromMap
static java.util.Properties propertiesFromMap(java.util.Map<java.lang.String,java.lang.String> map)
-
OutputHtml
public static java.util.HashMap<java.lang.String,java.lang.String> OutputHtml()
Canned default for HTML output.
-
OutputXml
public static java.util.HashMap<java.lang.String,java.lang.String> OutputXml()
Canned default for XML output.
-
methodMap
private static java.util.HashMap<java.lang.String,java.lang.String> methodMap(java.lang.String method)
-
fromJsoup
public org.w3c.dom.Document fromJsoup(Document in)
Convert a jsoup Document to a W3C Document. The created nodes will link back to the original jsoup nodes in the user property SourceProperty (but after conversion, changes on one side will not flow to the other).
- Parameters:
in - jsoup doc
- Returns:
- a W3C DOM Document representing the jsoup Document or Element contents.
-
fromJsoup
public org.w3c.dom.Document fromJsoup(Element in)
Convert a jsoup DOM to a W3C Document. The created nodes will link back to the original jsoup nodes in the user property SourceProperty (but after conversion, changes on one side will not flow to the other). The input Element is used as a context node, but the whole surrounding jsoup Document is converted. (If you just want a subtree converted, use convert(org.jsoup.nodes.Element, Document).)
- Parameters:
in - jsoup element or doc
- Returns:
- a W3C DOM Document representing the jsoup Document or Element contents.
- See Also:
sourceNodes(NodeList, Class), contextNode(Document)
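To illustrate the context-node behaviour described above, this sketch converts from a single element and then looks up the matching W3C node with contextNode. The class name, method name, and sample markup are assumptions made for the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.w3c.dom.Node;

// Hypothetical demo class for fromJsoup(Element) and contextNode(Document).
class ContextNodeDemo {
    /** Converts the document surrounding the given element and returns the element's W3C counterpart. */
    static Node w3cContextFor(Element context) {
        W3CDom w3c = new W3CDom();
        org.w3c.dom.Document w3cDoc = w3c.fromJsoup(context);
        return w3c.contextNode(w3cDoc);
    }

    public static void main(String[] args) {
        Document jsoupDoc = Jsoup.parse(
            "<div id=a><p>One</p></div><div id=b><p>Two</p></div>");
        Element second = jsoupDoc.getElementById("b");
        Node contextNode = w3cContextFor(second);
        // The context node corresponds to the second div ...
        System.out.println(contextNode.getNodeName());
        // ... while its owner document still holds the whole converted page.
        System.out.println(
            contextNode.getOwnerDocument().getElementsByTagName("div").getLength());
    }
}
```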
-
convert
public void convert(Document in, org.w3c.dom.Document out)
Converts a jsoup document into the provided W3C Document. If required, you can set options on the output document before converting.
- Parameters:
in - jsoup doc
out - w3c doc
- See Also:
fromJsoup(org.jsoup.nodes.Element)
-
convert
public void convert(Element in, org.w3c.dom.Document out)
Converts a jsoup element into the provided W3C Document. If required, you can set options on the output document before converting.
- Parameters:
in - jsoup element
out - w3c doc
- See Also:
fromJsoup(org.jsoup.nodes.Element)
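A sketch of converting only a subtree into a caller-supplied document might look like the following. The output document is built with the standard JAXP DocumentBuilderFactory, and setXmlStandalone is used purely as an example of an option set before converting. The class and method names are invented for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;
import org.jsoup.nodes.Element;

// Hypothetical demo class for convert(Element, org.w3c.dom.Document).
class SubtreeConvertDemo {
    /** Converts just the given element (and its descendants) into a fresh W3C Document. */
    static org.w3c.dom.Document convertSubtree(Element subtree) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            org.w3c.dom.Document out = factory.newDocumentBuilder().newDocument();
            out.setXmlStandalone(true); // example of an option set before converting
            new W3CDom().convert(subtree, out);
            return out;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create a W3C document builder", e);
        }
    }

    public static void main(String[] args) {
        Element second = Jsoup.parse(
            "<div id=a><p>One</p></div><div id=b><p>Two</p></div>").getElementById("b");
        org.w3c.dom.Document out = convertSubtree(second);
        System.out.println(out.getDocumentElement().getNodeName());
    }
}
```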
-
selectXpath
public org.w3c.dom.NodeList selectXpath(java.lang.String xpath, org.w3c.dom.Document doc)
Evaluate an XPath query against the supplied document, and return the results.
- Parameters:
xpath - an XPath query
doc - the document to evaluate against
- Returns:
- the matching nodes
-
selectXpath
public org.w3c.dom.NodeList selectXpath(java.lang.String xpath, org.w3c.dom.Node contextNode)
Evaluate an XPath query against the supplied context node, and return the results.
- Parameters:
xpath - an XPath query
contextNode - the context node to evaluate against
- Returns:
- the matching nodes
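The difference between the two selectXpath overloads can be sketched as follows: the first query runs against the whole document, the second is a relative path evaluated from a node returned by an earlier query. Namespace awareness is switched off here so that plain element names work in XPath 1.0; the class name and sample markup are assumptions for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class for the two selectXpath overloads.
class XPathSelectDemo {
    /**
     * Returns two counts: all li elements in the document, and the li children
     * of the first ul. Assumes the input contains at least one ul element.
     */
    static int[] countItems(String html) {
        W3CDom w3c = new W3CDom().namespaceAware(false);
        org.w3c.dom.Document w3cDoc = w3c.fromJsoup(Jsoup.parse(html));

        // Absolute query against the document.
        NodeList allItems = w3c.selectXpath("//li", w3cDoc);

        // Relative query against a context node.
        Node firstList = w3c.selectXpath("//ul", w3cDoc).item(0);
        NodeList firstListItems = w3c.selectXpath("li", firstList);

        return new int[] { allItems.getLength(), firstListItems.getLength() };
    }

    public static void main(String[] args) {
        int[] counts = countItems("<ul><li>A</li><li>B</li></ul><ul><li>C</li></ul>");
        System.out.println(counts[0] + " items in total, " + counts[1] + " in the first list");
    }
}
```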
-
sourceNodes
public <T extends Node> java.util.List<T> sourceNodes(org.w3c.dom.NodeList nodeList, java.lang.Class<T> nodeType)
Retrieves the original jsoup DOM nodes from a nodelist created by this convertor.
- Type Parameters:
T - node type
- Parameters:
nodeList - the W3C nodes to get the original jsoup nodes from
nodeType - the jsoup node type to retrieve (e.g. Element, DataNode, etc)
- Returns:
- a list of the original nodes
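A possible way to combine selectXpath with sourceNodes is sketched below: run an XPath query on the converted document, then map the W3C results back to the jsoup Elements they were created from, so the usual jsoup accessors are available again. The class name, method name, and sample markup are invented for the example.

```java
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class for sourceNodes(NodeList, Class).
class SourceNodesDemo {
    /** Runs an XPath query and returns the matching nodes as their original jsoup Elements. */
    static List<Element> selectJsoupElements(Document jsoupDoc, String xpath) {
        W3CDom w3c = new W3CDom().namespaceAware(false);
        org.w3c.dom.Document w3cDoc = w3c.fromJsoup(jsoupDoc);
        NodeList matches = w3c.selectXpath(xpath, w3cDoc);
        // Only nodes whose source is an Element are returned; other node types are skipped.
        return w3c.sourceNodes(matches, Element.class);
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse("<a href='/one'>One</a><a href='/two'>Two</a>");
        for (Element link : selectJsoupElements(doc, "//a")) {
            System.out.println(link.attr("href") + " " + link.text());
        }
    }
}
```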
-
contextNode
public org.w3c.dom.Node contextNode(org.w3c.dom.Document wDoc)
For a Document created by fromJsoup(org.jsoup.nodes.Element), retrieves the W3C context node.
- Parameters:
wDoc - Document created by this class
- Returns:
- the corresponding W3C Node to the jsoup Element that was used as the creating context.
-
asString
public java.lang.String asString(org.w3c.dom.Document doc)
Serialize a W3C document to a String. The output format will be XML or HTML depending on the content of the doc.
- Parameters:
doc - Document
- Returns:
- Document as string
- See Also:
asString(Document, Map)
-
-