XML(2)                                                     XML(2)

     NAME
          xmlattr, xmlcalloc, xmlelem, xmlfind, xmlfree, xmllook,
          xmlmalloc, xmlnew, xmlparse, xmlprint, xmlstrdup, xmlvalue -
          XML parser

     SYNOPSIS
          #include <u.h>
          #include <libc.h>
          #include <xml.h>

          enum {
               Fcrushwhite = 1,
               Fstripnamespace = 2,
          };

          struct Xml {
               Elem *root;         /* root of tree */
               char *doctype;      /* DOCTYPE structured comment, or nil */
               ...
          };

          struct Elem {
               Elem *next;         /* next element at this hierarchy level */
               Elem *child;        /* first child of this node */
               Elem *parent;       /* parent of this node */
               Attr *attrs;        /* linked list of attributes */
               char *name;         /* element name */
               char *pcdata;       /* pcdata following this element */
               int line;           /* Line number (for errors) */
          };

          struct Attr {
               Attr *next;         /* next attribute */
               Elem *parent;       /* parent element */
               char *name;         /* attribute's name */
               char *value;        /* attribute's value */
          };

          Attr* xmlattr(Xml *xp, Attr **root, Elem *parent,
                    char *name, char *value)
          Elem* xmlelem(Xml *xp, Elem **root, Elem *parent, char *name)
          Elem* xmlfind(Xml *xp, Elem *ep, char *path)
          Elem* xmllook(Elem *ep, char *path, char *attr, char *value)
          Xml*  xmlnew(int blksize)
          Xml*  xmlparse(int fd, int blksize, int flags)
          char* xmlvalue(Elem *ep, char *name)
          void* xmlmalloc(Xml *xp, usize size)
          void* xmlcalloc(Xml *xp, usize nelem, usize elemsz)
          void* xmlstrdup(Xml *xp, char *s)
          void  xmlfree(Xml *xp)
          void  xmlprint(Xml *xp, int fd)

     DESCRIPTION
          Libxml is a library for manipulating an XML document in
          memory (the DOM model). Each element may have a number of
          children, each of which may have a number of attributes;
          each attribute has a single value. Every element contains a
          pointer to its parent element; the root element has a nil
          parent pointer.  Pcdata (free-form text) found between
          elements is attached to the element which follows it.  The
          line number where each element was found is stored to allow
          unambiguous error messages during later analysis.
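
          The pointer layout described above can be walked with two
          small loops.  The following is a minimal sketch and not
          part of libxml: the structure declarations are copied from
          the SYNOPSIS so that the fragment stands alone (a real
          program would include <xml.h> and write nil for NULL), and
          depth, countelems and printtree are hypothetical helper
          names.

```c
#include <stdio.h>

/* Stand-in declarations copied from the SYNOPSIS. */
typedef struct Attr Attr;
typedef struct Elem Elem;

struct Attr {
	Attr *next;
	Elem *parent;
	char *name;
	char *value;
};

struct Elem {
	Elem *next;
	Elem *child;
	Elem *parent;
	Attr *attrs;
	char *name;
	char *pcdata;
	int line;
};

/* depth: count parent links up to the root, whose parent is nil */
int
depth(Elem *ep)
{
	int n;

	n = 0;
	for(; ep->parent != NULL; ep = ep->parent)
		n++;
	return n;
}

/* countelems: number of elements in the subtree rooted at ep */
int
countelems(Elem *ep)
{
	Elem *c;
	int n;

	n = 1;
	for(c = ep->child; c != NULL; c = c->next)
		n += countelems(c);
	return n;
}

/* printtree: print each element name, indented by nesting level */
void
printtree(Elem *ep, int level)
{
	Elem *c;

	printf("%*s%s (line %d)\n", level*2, "", ep->name, ep->line);
	for(c = ep->child; c != NULL; c = c->next)
		printtree(c, level+1);
}
```

          Children are reached through child and then along next;
          parent is only needed when climbing back towards the root.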

          Strings are stored in two data structures.  Common strings,
          such as element and attribute names, are kept in a binary
          tree.  Uncommon strings, such as attribute values and
          pcdata, are stored in a simple, unmanaged heap.  These
          steps substantially reduce the memory footprint of the
          parsed file and the time needed to free the XML data.
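
          To illustrate why an unmanaged heap is cheap to free, here
          is a minimal sketch of a block-based string heap.  It is an
          assumption about the general technique, not the library's
          source: Heap, Block, heapstrdup and heapfree are
          hypothetical names, and the blksize field only plays the
          role that the block size argument plays in the text.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Block Block;
typedef struct Heap Heap;

struct Block {
	Block *next;	/* previously filled block */
	size_t used;	/* bytes handed out from data */
	size_t size;	/* capacity of data */
	char data[];	/* string storage */
};

struct Heap {
	Block *blocks;	/* most recent block first */
	size_t blksize;	/* allocation granularity */
};

/* heapstrdup: copy s into the heap; returns NULL if out of memory */
char*
heapstrdup(Heap *h, const char *s)
{
	Block *b;
	size_t n, size;
	char *p;

	n = strlen(s) + 1;
	b = h->blocks;
	if(b == NULL || b->size - b->used < n){
		/* start a new block; an oversized string gets its own */
		size = n > h->blksize ? n : h->blksize;
		b = malloc(sizeof(Block) + size);
		if(b == NULL)
			return NULL;
		b->next = h->blocks;
		b->used = 0;
		b->size = size;
		h->blocks = b;
	}
	p = b->data + b->used;
	memcpy(p, s, n);
	b->used += n;
	return p;
}

/* heapfree: release every block; the cost is per block, not per string */
void
heapfree(Heap *h)
{
	Block *b, *next;

	for(b = h->blocks; b != NULL; b = next){
		next = b->next;
		free(b);
	}
	h->blocks = NULL;
}
```

          Individual strings are never freed on their own, which is
          what makes the heap unmanaged; a larger block size means
          fewer allocations at the cost of more slack in the last
          block.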

          Xmlparse reads the given file and builds an in-memory tree.
          Blksize controls the granularity of allocation of the
          string heap described above; 8192 is typically used.  The
          flags argument allows some control over the parser; it is
          the bitwise OR of the following values:

          Fcrushwhite
                    Each run of whitespace in pcdata is replaced by a
                    single space, and leading and trailing whitespace
                    is removed.

          Fstripnamespace
                    Remove leading namespace strings from all element
                    and attribute names.  This effectively ignores
                    namespaces, which can lead to parsing ambiguities,
                    though in practice it has not yet been a problem.
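
          The effect of the two flags on a single string can be
          sketched as follows.  This is an illustration of the
          behaviour described above, not the parser's own code:
          crushwhite and stripns are hypothetical helpers, and
          stripns assumes that the namespace string is everything up
          to and including the first colon of the name.

```c
#include <ctype.h>
#include <string.h>

/*
 * crushwhite: rewrite s in place so that each run of whitespace
 * becomes one space and leading and trailing whitespace is dropped.
 */
char*
crushwhite(char *s)
{
	char *r, *w;
	int pending;

	pending = 0;		/* whitespace seen since the last word */
	w = s;
	for(r = s; *r != '\0'; r++){
		if(isspace((unsigned char)*r)){
			pending = 1;
			continue;
		}
		if(pending && w != s)
			*w++ = ' ';	/* one space between words, none at the start */
		pending = 0;
		*w++ = *r;
	}
	*w = '\0';		/* trailing whitespace is never written */
	return s;
}

/*
 * stripns: return the part of name after the namespace prefix,
 * or name itself if it has no prefix.
 */
char*
stripns(char *name)
{
	char *p;

	p = strchr(name, ':');
	return p != NULL ? p + 1 : name;
}
```

          With these helpers, pcdata such as "  hello \n\t world  "
          becomes "hello world", and the name "xsl:template" becomes
          "template".  Two elements that differ only in their
          namespace become indistinguishable after stripping, which
          is the ambiguity mentioned above.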

          Xml trees may also be built up by calling xmlnew to create
          the XML tree, followed by xmlelem and xmlattr to create
          individual elements and attributes respectively.  Xmlelem
          takes the address of the root of an element list to which
          the new element should be appended, the address of the
          parent node the new element should reference, and the name
          of the node to create.  It returns the address of the
          created element.

          Xmlattr attaches an attribute to an existing element.  It
          takes a list pointer and parent pointer like xmlelem, but
          requires both an attribute name and value, and returns the
          address of the new attribute.
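
          The list-pointer and parent-pointer convention can be
          illustrated with a stand-alone sketch.  Addelem and addattr
          are hypothetical stand-ins that follow the description of
          xmlelem and xmlattr but omit the Xml argument and allocate
          with calloc; the structure declarations are copied from the
          SYNOPSIS.  Passing the address of the parent's child or
          attrs field as the list pointer is an assumption drawn
          from the text, not a documented requirement.

```c
#include <stdlib.h>

/* Stand-in declarations copied from the SYNOPSIS. */
typedef struct Attr Attr;
typedef struct Elem Elem;

struct Attr {
	Attr *next;
	Elem *parent;
	char *name;
	char *value;
};

struct Elem {
	Elem *next;
	Elem *child;
	Elem *parent;
	Attr *attrs;
	char *name;
	char *pcdata;
	int line;
};

/* addelem: append a new element to the list at *root */
Elem*
addelem(Elem **root, Elem *parent, char *name)
{
	Elem *ep, **l;

	ep = calloc(1, sizeof *ep);
	if(ep == NULL)
		return NULL;
	ep->parent = parent;
	ep->name = name;	/* the sketch keeps the caller's pointer */
	for(l = root; *l != NULL; l = &(*l)->next)
		;		/* find the end of the list */
	*l = ep;
	return ep;
}

/* addattr: append a new attribute to the list at *root */
Attr*
addattr(Attr **root, Elem *parent, char *name, char *value)
{
	Attr *ap, **l;

	ap = calloc(1, sizeof *ap);
	if(ap == NULL)
		return NULL;
	ap->parent = parent;
	ap->name = name;
	ap->value = value;
	for(l = root; *l != NULL; l = &(*l)->next)
		;
	*l = ap;
	return ap;
}
```

          Under that assumption, a child of doc is created with
          addelem(&doc->child, doc, "a") and an attribute of a with
          addattr(&a->attrs, a, "id", "1").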

          Xmllook descends through the tree rooted at ep using the
          path specified in path.  It then returns if elem is nil, or
          continues to search for a matching element.  If attr and
          value are not nil, the search will continue for an element
          which contains this attribute and value pair.

          Xmlvalue searches the given element's attribute list for
          the attribute called name and returns its value, or nil if
          that attribute is not found.
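
          The lookup amounts to a linear scan of the attribute list.
          The sketch below shows the idea with stand-in declarations
          copied from the SYNOPSIS; attrvalue is a hypothetical name
          and is not claimed to match the library's implementation
          (for instance, it compares names with strcmp, which is an
          assumption).

```c
#include <string.h>

/* Stand-in declarations copied from the SYNOPSIS. */
typedef struct Attr Attr;
typedef struct Elem Elem;

struct Attr {
	Attr *next;
	Elem *parent;
	char *name;
	char *value;
};

struct Elem {
	Elem *next;
	Elem *child;
	Elem *parent;
	Attr *attrs;
	char *name;
	char *pcdata;
	int line;
};

/* attrvalue: value of the first attribute of ep called name, or NULL */
char*
attrvalue(Elem *ep, char *name)
{
	Attr *ap;

	for(ap = ep->attrs; ap != NULL; ap = ap->next)
		if(strcmp(ap->name, name) == 0)
			return ap->value;
	return NULL;
}
```

          Callers should test the result before use, since a missing
          attribute yields a nil pointer rather than an empty
          string.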

          Xmlprint writes the XML hierarchy held in xp as text to
          the given file descriptor.

          Xmlmalloc, xmlcalloc, and xmlstrdup allocate memory within
          the Xml tree.  Xmlfree frees all memory used by the given
          Xml tree.

     SOURCE
          /sys/src/libxml

     SEE ALSO
          xb(1).

     BUGS
          Namespaces should be handled properly.

          A SAX model parser will probably be needed sometime (e.g.
          for Ebooks).

          UTF-16 headers should be respected, but UTF-16 files seem
          rare.

     Page 3                       Plan 9             (printed 3/28/24)