HTML Tidy 5.8.0
The HTACG Tidy HTML Project
 
Document Tree

Detailed Description

Tidy represents a parsed (and optionally repaired) document as a tree, much like a W3C DOM. The functions in this group traverse that tree. The following snippet gives a basic idea of how they can be used.

#include <assert.h>
#include <stdio.h>
#include <tidy.h>

/* Print the type or tag name of every descendant of tnod, indented by depth. */
void dumpNode( TidyNode tnod, int indent ) {
    TidyNode child;
    for ( child = tidyGetChild(tnod); child; child = tidyGetNext(child) ) {
        ctmbstr name;
        switch ( tidyNodeGetType(child) ) {
        case TidyNode_Root:    name = "Root";                   break;
        case TidyNode_DocType: name = "DOCTYPE";                break;
        case TidyNode_Comment: name = "Comment";                break;
        case TidyNode_ProcIns: name = "Processing Instruction"; break;
        case TidyNode_Text:    name = "Text";                   break;
        case TidyNode_CDATA:   name = "CDATA";                  break;
        case TidyNode_Section: name = "XML Section";            break;
        case TidyNode_Asp:     name = "ASP";                    break;
        case TidyNode_Jste:    name = "JSTE";                   break;
        case TidyNode_Php:     name = "PHP";                    break;
        case TidyNode_XmlDecl: name = "XML Declaration";        break;
        default:
            /* Start, end, and empty tags carry an element name. */
            name = tidyNodeGetName( child );
            break;
        }
        assert( name != NULL );
        /* "%*.*s" pads a single space out to `indent` columns. */
        printf( "%*.*sNode: %s\n", indent, indent, " ", name );
        dumpNode( child, indent + 4 );
    }
}

/* Dump the whole document, starting at the root node. */
void dumpDoc( TidyDoc tdoc ) {
    dumpNode( tidyGetRoot(tdoc), 0 );
}

/* Dump only the contents of the BODY element. */
void dumpBody( TidyDoc tdoc ) {
    dumpNode( tidyGetBody(tdoc), 0 );
}

Node types (TidyNodeType values, defined in tidyenum.h):

TidyNode_Root      Root (tidyenum.h:825)
TidyNode_DocType   DOCTYPE (tidyenum.h:826)
TidyNode_Comment   Comment (tidyenum.h:827)
TidyNode_ProcIns   Processing Instruction (tidyenum.h:828)
TidyNode_Text      Text (tidyenum.h:829)
TidyNode_Start     Start Tag (tidyenum.h:830)
TidyNode_End       End Tag (tidyenum.h:831)
TidyNode_StartEnd  Start/End (empty) Tag (tidyenum.h:832)
TidyNode_CDATA     Unparsed Text (tidyenum.h:833)
TidyNode_Section   XML Section (tidyenum.h:834)
TidyNode_Asp       ASP Source (tidyenum.h:835)
TidyNode_Jste      JSTE Source (tidyenum.h:836)
TidyNode_Php       PHP Source (tidyenum.h:837)
TidyNode_XmlDecl   XML Declaration (tidyenum.h:838)

ctmbstr is defined as const tmbchar * (tidyplatform.h:609).

Nodes for Document Sections

TidyNode TIDY_CALL tidyGetRoot (TidyDoc tdoc)
 Get the root node.
 
TidyNode TIDY_CALL tidyGetHtml (TidyDoc tdoc)
 Get the HTML node.
 
TidyNode TIDY_CALL tidyGetHead (TidyDoc tdoc)
 Get the HEAD node.
 
TidyNode TIDY_CALL tidyGetBody (TidyDoc tdoc)
 Get the BODY node.
 

Relative Nodes

TidyNode TIDY_CALL tidyGetParent (TidyNode tnod)
 Get the parent of the indicated node.
 
TidyNode TIDY_CALL tidyGetChild (TidyNode tnod)
 Get the child of the indicated node.
 
TidyNode TIDY_CALL tidyGetNext (TidyNode tnod)
 Get the next sibling node.
 
TidyNode TIDY_CALL tidyGetPrev (TidyNode tnod)
 Get the previous sibling node.
 

Miscellaneous Node Functions

TidyNode TIDY_CALL tidyDiscardElement (TidyDoc tdoc, TidyNode tnod)
 Remove the indicated node.
 

Node Attribute Functions

TidyAttr TIDY_CALL tidyAttrFirst (TidyNode tnod)
 Get the first attribute.
 
TidyAttr TIDY_CALL tidyAttrNext (TidyAttr tattr)
 Get the next attribute.
 
ctmbstr TIDY_CALL tidyAttrName (TidyAttr tattr)
 Get the name of a TidyAttr instance.
 
ctmbstr TIDY_CALL tidyAttrValue (TidyAttr tattr)
 Get the value of a TidyAttr instance.
 
void TIDY_CALL tidyAttrDiscard (TidyDoc itdoc, TidyNode tnod, TidyAttr tattr)
 Discard an attribute.
 
TidyAttrId TIDY_CALL tidyAttrGetId (TidyAttr tattr)
 Get the attribute ID given a tidy attribute.
 
Bool TIDY_CALL tidyAttrIsEvent (TidyAttr tattr)
 Indicates whether or not a given attribute is an event attribute.
 
TidyAttr TIDY_CALL tidyAttrGetById (TidyNode tnod, TidyAttrId attId)
 Get an instance of TidyAttr by specifying an attribute ID.
 

Additional Node Interrogation

TidyNodeType TIDY_CALL tidyNodeGetType (TidyNode tnod)
 Get the type of node.
 
ctmbstr TIDY_CALL tidyNodeGetName (TidyNode tnod)
 Get the name of the node.
 
Bool TIDY_CALL tidyNodeIsText (TidyNode tnod)
 Indicates whether or not a node is a text node.
 
Bool TIDY_CALL tidyNodeIsProp (TidyDoc tdoc, TidyNode tnod)
 Indicates whether or not the node is a proprietary type.
 
Bool TIDY_CALL tidyNodeIsHeader (TidyNode tnod)
 Indicates whether or not a node represents an HTML header element, such as h1, h2, etc.
 
Bool TIDY_CALL tidyNodeHasText (TidyDoc tdoc, TidyNode tnod)
 Indicates whether or not the node has text.
 
Bool TIDY_CALL tidyNodeGetText (TidyDoc tdoc, TidyNode tnod, TidyBuffer *buf)
 Gets the text of a node and places it into the given TidyBuffer.
 
Bool TIDY_CALL tidyNodeGetValue (TidyDoc tdoc, TidyNode tnod, TidyBuffer *buf)
 Get the value of the node.
 
TidyTagId TIDY_CALL tidyNodeGetId (TidyNode tnod)
 Get the tag ID of the node.
 
uint TIDY_CALL tidyNodeLine (TidyNode tnod)
 Get the line number where the node occurs.
 
uint TIDY_CALL tidyNodeColumn (TidyNode tnod)
 Get the column location of the node.
 

Function Documentation

◆ tidyAttrDiscard()

void TIDY_CALL tidyAttrDiscard ( TidyDoc itdoc,
TidyNode tnod,
TidyAttr tattr )

Discard an attribute.

Parameters
itdoc  The tidy document from which to discard the attribute.
tnod   The node from which to discard the attribute.
tattr  The attribute to discard.

◆ tidyAttrFirst()

TidyAttr TIDY_CALL tidyAttrFirst ( TidyNode tnod)

Get the first attribute.

Parameters
tnod  The node for which to get attributes.
Returns
Returns an instance of TidyAttr.

◆ tidyAttrGetById()

TidyAttr TIDY_CALL tidyAttrGetById ( TidyNode tnod,
TidyAttrId attId )

Get an instance of TidyAttr by specifying an attribute ID.

Returns
Returns a TidyAttr instance.
Parameters
tnod   The node to query.
attId  The attribute ID to find.

◆ tidyAttrGetId()

TidyAttrId TIDY_CALL tidyAttrGetId ( TidyAttr tattr)

Get the attribute ID given a tidy attribute.

Parameters
tattr  The attribute to query.
Returns
Returns the TidyAttrId of the given attribute.

◆ tidyAttrIsEvent()

Bool TIDY_CALL tidyAttrIsEvent ( TidyAttr tattr)

Indicates whether or not a given attribute is an event attribute.

Parameters
tattr  The attribute to query.
Returns
Returns a bool indicating whether or not the attribute is an event.

◆ tidyAttrName()

ctmbstr TIDY_CALL tidyAttrName ( TidyAttr tattr)

Get the name of a TidyAttr instance.

Parameters
tattr  The tidy attribute to query.
Returns
Returns a string indicating the name of the attribute.

◆ tidyAttrNext()

TidyAttr TIDY_CALL tidyAttrNext ( TidyAttr tattr)

Get the next attribute.

Parameters
tattr  The current attribute, so the next one can be returned.
Returns
Returns an instance of TidyAttr.

◆ tidyAttrValue()

ctmbstr TIDY_CALL tidyAttrValue ( TidyAttr tattr)

Get the value of a TidyAttr instance.

Parameters
tattr  The tidy attribute to query.
Returns
Returns a string indicating the value of the attribute.

◆ tidyDiscardElement()

TidyNode TIDY_CALL tidyDiscardElement ( TidyDoc tdoc,
TidyNode tnod )

Remove the indicated node.

Returns
Returns the next tidy node.
Parameters
tdoc  The tidy document from which to remove the node.
tnod  The node to remove.

◆ tidyGetBody()

TidyNode TIDY_CALL tidyGetBody ( TidyDoc tdoc)

Get the BODY node.

Parameters
tdoc  The document to query.
Returns
Returns a tidy node.

◆ tidyGetChild()

TidyNode TIDY_CALL tidyGetChild ( TidyNode tnod)

Get the child of the indicated node.

Parameters
tnod  The node to query.
Returns
Returns a tidy node.

◆ tidyGetHead()

TidyNode TIDY_CALL tidyGetHead ( TidyDoc tdoc)

Get the HEAD node.

Parameters
tdoc  The document to query.
Returns
Returns a tidy node.

◆ tidyGetHtml()

TidyNode TIDY_CALL tidyGetHtml ( TidyDoc tdoc)

Get the HTML node.

Parameters
tdoc  The document to query.
Returns
Returns a tidy node.

◆ tidyGetNext()

TidyNode TIDY_CALL tidyGetNext ( TidyNode tnod)

Get the next sibling node.

Parameters
tnod  The node to query.
Returns
Returns a tidy node.

◆ tidyGetParent()

TidyNode TIDY_CALL tidyGetParent ( TidyNode tnod)

Get the parent of the indicated node.

Parameters
tnod  The node to query.
Returns
Returns a tidy node.

◆ tidyGetPrev()

TidyNode TIDY_CALL tidyGetPrev ( TidyNode tnod)

Get the previous sibling node.

Parameters
tnod  The node to query.
Returns
Returns a tidy node.

◆ tidyGetRoot()

TidyNode TIDY_CALL tidyGetRoot ( TidyDoc tdoc)

Get the root node.

Parameters
tdoc  The document to query.
Returns
Returns a tidy node.

◆ tidyNodeColumn()

uint TIDY_CALL tidyNodeColumn ( TidyNode tnod)

Get the column location of the node.

Parameters
tnod  The node to query.
Returns
Returns the column location of the node.

◆ tidyNodeGetId()

TidyTagId TIDY_CALL tidyNodeGetId ( TidyNode tnod)

Get the tag ID of the node.

Parameters
tnod  The node to query.
Returns
Returns the tag ID of the node as TidyTagId.

◆ tidyNodeGetName()

ctmbstr TIDY_CALL tidyNodeGetName ( TidyNode tnod)

Get the name of the node.

Parameters
tnod  The node to query.
Returns
Returns a string indicating the name of the node.

◆ tidyNodeGetText()

Bool TIDY_CALL tidyNodeGetText ( TidyDoc tdoc,
TidyNode tnod,
TidyBuffer * buf )

Gets the text of a node and places it into the given TidyBuffer.

The text will be terminated with a TidyNewline. If you want the raw UTF-8 stream, see tidyNodeGetValue().

Returns
Returns a bool indicating success or not.
Parameters
tdoc       The document to query.
tnod       The node to query.
[out] buf  A TidyBuffer used to receive the node's text.

◆ tidyNodeGetType()

TidyNodeType TIDY_CALL tidyNodeGetType ( TidyNode tnod)

Get the type of node.

Parameters
tnod  The node to query.
Returns
Returns the type of node as TidyNodeType.

◆ tidyNodeGetValue()

Bool TIDY_CALL tidyNodeGetValue ( TidyDoc tdoc,
TidyNode tnod,
TidyBuffer * buf )

Get the value of the node.

This copies the unescaped value of this node into the given TidyBuffer as UTF-8.

Returns
Returns a bool indicating success or not.
Parameters
tdoc       The document to query.
tnod       The node to query.
[out] buf  A TidyBuffer used to receive the node's value.

◆ tidyNodeHasText()

Bool TIDY_CALL tidyNodeHasText ( TidyDoc tdoc,
TidyNode tnod )

Indicates whether or not the node has text.

Returns
Returns a bool indicating whether or not the node has text.
Parameters
tdoc  The document to query.
tnod  The node to query.

◆ tidyNodeIsHeader()

Bool TIDY_CALL tidyNodeIsHeader ( TidyNode tnod)

Indicates whether or not a node represents an HTML header element, such as h1, h2, etc.

Parameters
tnod  The node to query.
Returns
Returns a bool indicating whether or not the node is an HTML header.

◆ tidyNodeIsProp()

Bool TIDY_CALL tidyNodeIsProp ( TidyDoc tdoc,
TidyNode tnod )

Indicates whether or not the node is a proprietary type.

Returns
Returns a bool indicating whether or not the node is a proprietary type.
Parameters
tdoc  The document to query.
tnod  The node to query.

◆ tidyNodeIsText()

Bool TIDY_CALL tidyNodeIsText ( TidyNode tnod)

Indicates whether or not a node is a text node.

Parameters
tnod  The node to query.
Returns
Returns a bool indicating whether or not the node is a text node.

◆ tidyNodeLine()

uint TIDY_CALL tidyNodeLine ( TidyNode tnod)

Get the line number where the node occurs.

Parameters
tnod  The node to query.
Returns
Returns the line number.