
libxml2 cheat sheet

Contents of the page http://hamburgsteak.sandwich.net/writ/libxml2.txt

Developing XML-enabled C programs with libxml2
A beginner's guide
By David Turover
This document is in the public domain.

For brevity's sake, the code in this document contains no error checking.
In real life, you will want to check for NULL pointers and function returns.


Introduction

libxml2 is a library of functions for handling XML data.


A simple example:

#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

int main(){
        xmlDocPtr doc;
        xmlNodePtr nodeLevel1;
        xmlNodePtr nodeLevel2;

        doc = xmlParseFile("xmlfile.xml");
        for(    nodeLevel1 = doc->children;
                nodeLevel1 != NULL;
                nodeLevel1 = nodeLevel1->next)
        {
                printf("%s\n",nodeLevel1->name);
                for(    nodeLevel2 = nodeLevel1->children;
                        nodeLevel2 != NULL;
                        nodeLevel2 = nodeLevel2->next)
                {
                        printf("\t%s\n",nodeLevel2->name);
                }
        }
        xmlSaveFile("xmlfile_copy.xml", doc);
        xmlFreeDoc(doc);
        return 0;
}


The above code should print the names of the nodes in the top two levels
of an XML file, and save a copy of the file. Compile it with the flags
reported by xml2-config --cflags --libs (which include -lxml2).


Explanation of Introduction

The xmlDocPtr is a pointer to an xmlDoc structure.
It represents an XML data source.

You load an XML file with the xmlParseFile() function, which takes
as a parameter the name of an XML file and returns a pointer to a
new xmlDoc structure (or NULL on failure). When done, you release
this memory with the xmlFreeDoc() function. You can export an xmlDocPtr's
data as an XML file with the xmlSaveFile() function.

The xmlNodePtr points to a single element or node of an XML document.
Each xmlNode has a .children member which is an xmlNodePtr to the first
of this node's children. Each xmlNode has a .name member which is a
string containing the name of the element it represents, or the word "text"
for a text node.

The xmlNodePtr is the basic structure used to traverse an XML document
with libxml2. It contains several xmlNodePtrs which can be used to move
around the document. If there is no other node in a particular direction,
the pointer is NULL.

xmlNodePtr->children    The first child of the node
xmlNodePtr->last        The node's last child
xmlNodePtr->parent      The current node's parent node

xmlNodePtr->next        The next sibling node
xmlNodePtr->prev        The previous sibling node

xmlNodePtr->doc         The xmlDocPtr for the document containing this node


<node>
        <node> ->parent
                <node>
                        <node> ->prev
                        </node>
                        <node>
                        </node>
                </node>
                <node> You Are Here
                        <node> ->children
                        </node>
                        <node>
                        </node>
                        <node> ->last
                        </node>
                </node>
                <node> ->next
                        <node>
                        </node>
                </node>
        </node>
</node>

Checking for text nodes

You can easily check to see what type of xmlNode you have by looking
at the xmlNodePtr->type member, which holds one of the following
xmlElementType values:

        XML_ELEMENT_NODE
        XML_ATTRIBUTE_NODE
        XML_TEXT_NODE
        XML_CDATA_SECTION_NODE
        XML_ENTITY_REF_NODE
        XML_ENTITY_NODE
        XML_PI_NODE
        XML_COMMENT_NODE
        XML_DOCUMENT_NODE
        XML_DOCUMENT_TYPE_NODE
        XML_DOCUMENT_FRAG_NODE
        XML_NOTATION_NODE
        XML_HTML_DOCUMENT_NODE
        XML_DTD_NODE
        XML_ELEMENT_DECL
        XML_ATTRIBUTE_DECL
        XML_ENTITY_DECL
        XML_NAMESPACE_DECL
        XML_XINCLUDE_START
        XML_XINCLUDE_END
        XML_DOCB_DOCUMENT_NODE

The only ones you need to care about right now are XML_TEXT_NODE
and XML_ELEMENT_NODE.


Handling a Node

An XML node generally looks like this:

<this_is_a_node attribute1="abcdefg" attribute2="12345">
        <this_is_a_child_node>Hello World</this_is_a_child_node>
</this_is_a_node>

The things you can manipulate are the node itself, the node's attributes,
and the node's contents.

Attributes

Working with attributes of a node is fairly straightforward: You use the
xmlGetProp() function to get an attribute's value and the xmlSetProp() 
function to change an attribute's value. If you want to know if an
attribute exists, you use the xmlHasProp() function. If you want to
completely remove an attribute, use xmlUnsetProp().

xmlSetProp(xmlNodePtr node, const xmlChar *name, const xmlChar *value);
xmlGetProp(xmlNodePtr node, const xmlChar *name);
xmlHasProp(xmlNodePtr node, const xmlChar *name);
xmlUnsetProp(xmlNodePtr node, const xmlChar *name);

xmlGetProp returns a string that must be freed with the xmlFree() function
when you are done with it, or else your program will have a memory leak.


Content

Working with content is less intuitive. The content of a node is not simply
what a node contains, but is the text of the node and its children with the
element tags stripped out. Thus the content of <this_is_a_node> from the
above example would be "Hello World" (plus any surrounding whitespace), with
the child element <this_is_a_child_node> nowhere to be seen. If you try adding
element tags to a node's content, libxml2 will store them as text and escape
their < and > characters as &lt; and &gt; when the document is written out.

To work with content, then, you use the xmlNodeSetContent() and
xmlNodeGetContent() functions to set or retrieve a node's content,
or the xmlNodeAddContent() function to append to a node's content.

xmlNodeSetContent(xmlNodePtr node, const xmlChar *content);
xmlNodeAddContent(xmlNodePtr node, const xmlChar *content);
xmlNodeGetContent(xmlNodePtr node);

As with xmlGetProp(), you must use xmlFree() on the result
of xmlNodeGetContent() or else you will have a memory leak.


To print everything an element contains, tags included, not simply its
content, use xmlElemDump():

xmlElemDump(FILE * output, xmlDocPtr doc, xmlNodePtr node);




Strings: xmlChar* versus char*

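xmlChar* is the string type used by libxml2. An xmlChar is an unsigned char,
and libxml2 strings are encoded as UTF-8.
You can easily cast between char* and xmlChar*; the BAD_CAST macro is a
shorthand for the cast to xmlChar*.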



Creating a New Node


To create a node from scratch and add it to a document:

xmlNodePtr nodeParent = xmlDocGetRootElement(doc);
xmlNodePtr node = xmlNewNode(NULL, BAD_CAST "name");
xmlAddChild(nodeParent, node);

The xmlNewNode() function allocates memory for a new node. When you are done,
you must free the node with xmlFreeNode() unless the node has been added to
another structure (as it has here) which will be freed. The NULL
in xmlNewNode() is where an xmlNsPtr namespace pointer would be if the node
was going to be assigned to a particular namespace; we are not using namespaces
right now, so it is left as NULL.

xmlDocGetRootElement() returns the document's root element. It is safer than
doc->children here, because the first child of a document may be a comment
or DTD node rather than the root element.

A new node is not part of any document until you add it with a function
such as xmlAddChild(), xmlAddSibling(), xmlAddNextSibling(), or
xmlAddPrevSibling(). These functions also set the node's doc pointer.

The xmlDocCopyNode() function is for a different case: copying an existing
node into a document. It returns a new copy of the node (and of its children,
if the last argument is 1) that is associated with the target document. The
original node is left untouched, and the copy is not inserted into the tree
until you add it with one of the functions above.


Summary of xmlNode Members and Simple Interface Functions

type            Node type (usually XML_ELEMENT_NODE or XML_TEXT_NODE)
name            String containing element's name, or "text" if a text node
children        First child of node
last            Last child of node
parent          Parent node
next            Next sibling node
prev            Previous sibling node
doc             The document containing this node


xmlSetProp(xmlNodePtr node, const xmlChar *name, const xmlChar *value);
xmlGetProp(xmlNodePtr node, const xmlChar *name);
xmlHasProp(xmlNodePtr node, const xmlChar *name);
xmlUnsetProp(xmlNodePtr node, const xmlChar *name);

xmlNodeSetContent(xmlNodePtr cur, const xmlChar *content);
xmlNodeAddContent(xmlNodePtr cur, const xmlChar *content);
xmlNodeGetContent(xmlNodePtr cur);

xmlElemDump(FILE * output, xmlDocPtr doc, xmlNodePtr node);

For more information, read the API docs at:
http://xmlsoft.org/html/libxml-tree.html