The Java Developers Almanac 1.4



e529. Getting the Declared Entities in a DOM Document

The entities declared in the DTD of an XML document are available in the DOM document. Unfortunately, with J2SE 1.4's default parser, only the entity names are available, not their values. To obtain the values, you must parse the file without expanding entity references and then scan the DOM document for the unexpanded entity references. The child nodes of each unexpanded entity reference hold the entity's value.

Note: By default, a parser expands entity references while constructing the DOM tree; see e516 Preventing Expansion of Entity References While Parsing an XML File for how to prevent this. However, the default parser in J2SE 1.4 always expands entity references in attribute values, and there is no way to prevent that.
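
The helper parseXmlFileNoExpandER used below is defined in e516 and is not reproduced here. As a rough sketch of the underlying idea only, the following parses XML text with entity expansion turned off through DocumentBuilderFactory.setExpandEntityReferences(false). The class name NoExpandSketch, the method parseNoExpand, and the inline sample document are assumptions made for illustration; they are not part of the original example. Whether the resulting entity reference nodes carry child nodes can vary by parser implementation and version, so that is worth verifying on your own runtime.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NoExpandSketch {
    // Hypothetical helper (not the e516 method): parses XML text
    // without expanding entity references in element content.
    public static Document parseNoExpand(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Illustrative input: one internal entity referenced in content
        String xml = "<!DOCTYPE root [<!ENTITY entity1 \"an internal entity\">]>"
                   + "<root>&entity1;</root>";
        Document doc = parseNoExpand(xml);

        // The first child of <root> should be the unexpanded reference
        Node first = doc.getDocumentElement().getFirstChild();
        boolean isRef = first.getNodeType() == Node.ENTITY_REFERENCE_NODE;
        System.out.println(first.getNodeName() + " isEntityReference=" + isRef);
    }
}
```

Reading from a string keeps the sketch self-contained; the e516 helper takes a filename and a validation flag instead.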

    // Requires imports of java.util.HashMap, java.util.Map, and org.w3c.dom.*
    // Obtain a document; this method is implemented in
    // e516 Preventing Expansion of Entity References While Parsing an XML File
    Document doc = parseXmlFileNoExpandER("infilename.xml", true);
    
    // Scan the document for entity references and get their values.
    // The values are stored in the map using the entity name as the key.
    Map entityValues = new HashMap();
    getEntityValues(doc, entityValues);
    
    // Get list of declared entities
    NamedNodeMap entities = doc.getDoctype().getEntities();
    for (int i=0; i<entities.getLength(); i++) {
        Entity entity = (Entity)entities.item(i);
        String entityName = entity.getNodeName();
        String entityPublicId = entity.getPublicId();
        String entitySystemId = entity.getSystemId();
    
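        // Get the value of the entity: the entity reference node whose
        // child nodes make up the value. This is null if the entity is
        // never referenced in the document content.
        Node entityValue = (Node)entityValues.get(entityName);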
    }
    
    // This method walks the document looking for entity references.
    // When one is found, this method adds the entity reference node
    // to map, using the entity name as the key.
    public static void getEntityValues(Node node, Map map) {
        if (node instanceof EntityReference) {
            map.put(node.getNodeName(), node);
        }
    
        // Visit the children
        NodeList list = node.getChildNodes();
        for (int i=0; i<list.getLength(); i++) {
            getEntityValues(list.item(i), map);
        }
    }
This is the sample input for the example:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE root [
        <!ENTITY entity1 "an internal entity">
    
        <!-- This is a parameter entity; it will not appear in the DOM document -->
        <!ENTITY % entity2 "a | b">
    
        <!-- This is an external text entity -->
        <!ENTITY entity3 SYSTEM "External.xml">
    
        <!-- This is an external parameter entity; it includes DTD from
             another file. It will not appear in the DOM document. However,
             any non-parameter entities declared in the file will be included. -->
        <!ENTITY % entity4 SYSTEM "More.dtd">
        %entity4;
    
        <!-- entity5 is an unparsed entity; it can only appear as an attribute value -->
        <!ENTITY entity5 SYSTEM "pic.jpg" NDATA NOTA1>
        <!NOTATION NOTA1 SYSTEM "jpgviewer.exe">
        <!ELEMENT elem2 EMPTY>
        <!ATTLIST elem2 attr ENTITY #REQUIRED>
    ]>
    <root a="&entity1;">
        &entity1;
        &entity3;
        &ent1;
        &ent2;
        <elem2 attr="entity5"/>
    </root>
External.xml:
    <!-- a file with XML markup -->
    <i>external</i> text
More.dtd:
    <!-- a file with more DTD declarations -->
    <!ENTITY ent1 "xx">
    <!ENTITY ent2 "yy">
    <!ELEMENT elem1 (%entity2;)>
The following lists the entities that would appear in the DOM document, along with their values. Notice that the parameter entities entity2 and entity4 do not appear in the list.
    entity1=an internal entity
    entity3=<i>external</i> text
    ent1=xx
    ent2=yy
    entity5=null
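
To make the listing above easier to reproduce, here is a minimal, self-contained sketch that enumerates the declared entities of a small in-memory document. The class name DeclaredEntitiesSketch, its methods, and the trimmed-down SAMPLE document are assumptions made for illustration. The sketch prints only names, system IDs, and notation names; it does not attempt to recover entity values, and the order in which the parser returns the entities is not guaranteed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Entity;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

public class DeclaredEntitiesSketch {
    // Illustrative input: an internal entity, a parameter entity,
    // and an unparsed entity with its notation
    public static final String SAMPLE =
          "<!DOCTYPE root [\n"
        + "<!ENTITY entity1 \"an internal entity\">\n"
        + "<!ENTITY % entity2 \"a | b\">\n"
        + "<!NOTATION NOTA1 SYSTEM \"jpgviewer.exe\">\n"
        + "<!ENTITY entity5 SYSTEM \"pic.jpg\" NDATA NOTA1>\n"
        + "]>\n"
        + "<root/>";

    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    // Collects the names of the entities exposed by the doctype
    public static List<String> declaredEntityNames(Document doc) {
        List<String> names = new ArrayList<String>();
        NamedNodeMap entities = doc.getDoctype().getEntities();
        for (int i = 0; i < entities.getLength(); i++) {
            names.add(entities.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        NamedNodeMap entities = doc.getDoctype().getEntities();
        for (int i = 0; i < entities.getLength(); i++) {
            Entity entity = (Entity) entities.item(i);
            // getNotationName() is non-null only for unparsed entities
            System.out.println(entity.getNodeName()
                + " systemId=" + entity.getSystemId()
                + " notation=" + entity.getNotationName());
        }
    }
}
```

The parameter entity entity2 should be absent from the output, and the notation name is what distinguishes the unparsed entity entity5 from the others.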
If the input file were parsed with entity expansion, the resulting XML would be:
    <?xml version="1.0" encoding="UTF-8"?>
    <root a="an internal entity">
        an internal entity
        <!-- a file with XML markup -->
    <i>external</i> text
    
        xx
        yy
        <elem2 attr="entity5"/>
    </root>
Note: The J2SE 1.4 DOM writing routines don't appear to write entity references properly. In particular, only text nodes that are descendants of the entity reference are written; all other node types are simply not printed.



© 2002 Addison-Wesley.