org.apache.xerces.dom3.ls
Interface DOMWriter

All Known Subinterfaces:
DOMASWriter

public interface DOMWriter

DOMWriter provides an API for serializing (writing) a DOM document out as an XML document. The XML data is written to an output stream, the type of which depends on the specific language bindings in use. During serialization of XML data, namespace fixup is done when possible.

DOMWriter accepts any node type for serialization. For nodes of type Document or Entity, well-formed XML will be created if possible. The serialized output for these node types is either a Document or an External Entity, respectively, and is acceptable input for an XML parser. For all other types of nodes the serialized form is not specified, but should be something useful to a human for debugging or diagnostic purposes. Note: rigorously designing an external (source) form for stand-alone node types that don't already have one defined seems a bit much to take on here.

Within a Document or Entity being serialized, Nodes are processed as follows:
Documents are written including an XML declaration and a DTD subset, if one exists in the DOM. Writing a document node serializes the entire document.
Entity nodes, when written directly by writeNode defined in the DOMWriter interface, output the entity expansion, but no namespace fixup is done. The resulting output will be valid as an external entity.
Entity reference nodes are serialized as an entity reference of the form "&entityName;" in the output. Child nodes (the expansion) of the entity reference are ignored.
CDATA sections containing content characters that cannot be represented in the specified output encoding are handled according to the "split-cdata-sections" feature. If the feature is true, CDATA sections are split, and the unrepresentable characters are serialized as numeric character references in ordinary content. The exact position and number of splits is not specified. If the feature is false, unrepresentable characters in a CDATA section are reported as errors. The error is not recoverable: there is no mechanism for supplying alternative characters and continuing with the serialization.
All other node types (Element, Text, etc.) are serialized to their corresponding XML source form.
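The CDATA handling described above can be sketched as follows. This is a minimal illustration, not the Xerces implementation: the class CDataSplitSketch and its writeCData method are hypothetical names, the split positions are one possible choice (the text leaves them unspecified), and the ']]>' termination marker is not handled.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the DOMWriter API.
final class CDataSplitSketch {

    /** Serializes CDATA content, splitting around characters the charset cannot encode. */
    static String writeCData(String data, Charset charset, boolean splitCDataSections) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        StringBuilder run = new StringBuilder(); // representable characters not yet written
        for (int i = 0; i < data.length(); ) {
            int codePoint = data.codePointAt(i);
            String ch = new String(Character.toChars(codePoint));
            i += ch.length();
            if (encoder.canEncode(ch)) {
                run.append(ch);
            } else if (splitCDataSections) {
                // Close the current section and emit a numeric character reference
                // as ordinary content.
                flush(run, out);
                out.append("&#x").append(Integer.toHexString(codePoint)).append(';');
            } else {
                // Feature is false: the error is not recoverable.
                throw new IllegalStateException(
                        "Unrepresentable character U+" + Integer.toHexString(codePoint)
                        + " in CDATA section");
            }
        }
        flush(run, out);
        return out.toString();
    }

    // Writes the pending run as one CDATA section; empty runs produce no output.
    private static void flush(StringBuilder run, StringBuilder out) {
        if (run.length() > 0) {
            out.append("<![CDATA[").append(run).append("]]>");
            run.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeCData("a\u00F1b", StandardCharsets.US_ASCII, true));
    }
}
```

With US-ASCII output and splitting enabled, the content "añb" becomes two CDATA sections with a character reference between them.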

Within the character data of a document (outside of markup), any characters that cannot be represented directly are replaced with character references. Occurrences of '<' and '&' are replaced by the predefined entities &lt; and &amp;. The other predefined entities (&gt;, &apos;, etc.) are not used; these characters can be included directly. Any character that cannot be represented directly in the output character encoding is serialized as a numeric character reference.
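A minimal sketch of the character data rules above, assuming the output encoding is available as a java.nio.charset.Charset. The class and method names are hypothetical, and the hexadecimal form of the character reference is one valid choice among several.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the DOMWriter API.
final class CharacterDataEscaper {

    /** Escapes text content: '<' and '&' by entity, unencodable characters by reference. */
    static String escape(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String ch = new String(Character.toChars(codePoint));
            i += ch.length();
            if (codePoint == '<') {
                out.append("&lt;");
            } else if (codePoint == '&') {
                out.append("&amp;");
            } else if (encoder.canEncode(ch)) {
                out.append(ch); // '>', quotes and apostrophes pass through unchanged
            } else {
                out.append("&#x").append(Integer.toHexString(codePoint)).append(';');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a<b & Ca\u00F1ada>", StandardCharsets.US_ASCII));
    }
}
```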

Attributes not containing quotes are serialized in quotes. Attributes containing quotes but no apostrophes are serialized in apostrophes (single quotes). Attributes containing both forms of quotes are serialized in quotes, with quotes within the value represented by the predefined entity &quot;. Any character that can not be represented directly in the output character encoding is serialized as a numeric character reference.
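The choice of delimiter for attribute values can be sketched as below. The names are hypothetical, and the sketch covers only the quoting rule from the paragraph above; escaping of other characters and numeric character references are left out.

```java
// Hypothetical helper for illustration; not part of the DOMWriter API.
final class AttributeQuoting {

    /** Wraps an attribute value in the delimiter selected by the quoting rules. */
    static String quote(String value) {
        boolean hasQuote = value.indexOf('"') >= 0;
        boolean hasApostrophe = value.indexOf('\'') >= 0;
        if (!hasQuote) {
            return "\"" + value + "\"";          // no quotes: serialize in quotes
        }
        if (!hasApostrophe) {
            return "'" + value + "'";            // quotes only: serialize in apostrophes
        }
        // Both forms present: use quotes and replace embedded quotes with &quot;.
        return "\"" + value.replace("\"", "&quot;") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quote("plain"));
        System.out.println(quote("say \"hi\""));
        System.out.println(quote("it's \"x\""));
    }
}
```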

Within markup, but outside of attributes, any occurrence of a character that cannot be represented in the output character encoding is reported as an error. An example would be serializing the element <LaCañada/> with encoding="us-ascii".
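Such a check might look like the following sketch, which tests whether a name can be written in a given encoding. The class and method are hypothetical; how an implementation actually detects and reports the error is not stated here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the DOMWriter API.
final class MarkupNameCheck {

    /** Returns true if every character of the name is representable in the charset. */
    static boolean canWriteName(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        String name = "LaCa\u00F1ada";
        System.out.println(canWriteName(name, StandardCharsets.US_ASCII)); // error case
        System.out.println(canWriteName(name, StandardCharsets.UTF_8));
    }
}
```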

When requested by setting the normalize-characters feature on DOMWriter, all data to be serialized, both markup and character data, is W3C Text normalized. The W3C Text normalization process affects only the data as it is being written; it does not alter the DOM's view of the document after serialization has completed.

Namespaces are fixed up during serialization: the serialization process verifies that namespace declarations, namespace prefixes and the namespace URIs associated with Elements and Attributes are consistent. If inconsistencies are found, the serialized form of the document will be altered to remove them. The algorithm used for the namespace fixup while serializing a document is a combination of the algorithms used for lookupNamespaceURI and lookupNamespacePrefix. Editorial note: the previous paragraph is to be defined more closely here.

Any changes made affect only the namespace prefixes and declarations appearing in the serialized data. The DOM's view of the document is not altered by the serialization operation, and does not reflect any changes made to namespace declarations or prefixes in the serialized output.

While serializing a document, the serializer will write out non-specified values (such as attributes whose specified flag is false) if the output-default-values feature is set to true. If output-default-values is set to false and the use-abstract-schema feature is set to true, the abstract schema will be used to determine whether a value is specified. If use-abstract-schema is not set, the specified flag on attribute nodes is used to determine whether attribute values should be written out.
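The decision described above reduces to a small piece of logic, sketched here with hypothetical names. The schemaSaysSpecified argument stands in for whatever answer the abstract schema would give; how that answer is computed is outside this sketch.

```java
// Hypothetical helper for illustration; not part of the DOMWriter API.
final class DefaultValuePolicy {

    /**
     * Decides whether an attribute value is written out.
     *
     * @param outputDefaultValues state of the output-default-values feature
     * @param useAbstractSchema   state of the use-abstract-schema feature
     * @param specifiedFlag       the specified flag on the attribute node
     * @param schemaSaysSpecified the abstract schema's verdict (assumed to be precomputed)
     */
    static boolean shouldWrite(boolean outputDefaultValues, boolean useAbstractSchema,
                               boolean specifiedFlag, boolean schemaSaysSpecified) {
        if (outputDefaultValues) {
            return true; // non-specified values are written as well
        }
        return useAbstractSchema ? schemaSaysSpecified : specifiedFlag;
    }

    public static void main(String[] args) {
        System.out.println(shouldWrite(false, false, false, true));
    }
}
```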

Editorial note: reference the Core specification (1.1.9, XML namespaces, 5th paragraph) entity reference description regarding the warning about unbound entity references. Entity references are always serialized as &foo;; this should also be mentioned in the load part of this specification.

When serializing a document the DOMWriter checks to see if the document element in the document is a DOM Level 1 element or a DOM Level 2 (or higher) element (this check is done by looking at the localName of the root element). If the root element is a DOM Level 1 element then the DOMWriter will issue an error if a DOM Level 2 (or higher) element is found while serializing. Likewise if the document element is a DOM Level 2 (or higher) element and the DOMWriter sees a DOM Level 1 element an error is issued. Mixing DOM Level 1 elements with DOM Level 2 (or higher) is not supported.
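A sketch of the level check, using the DOM implementation bundled with the JDK rather than the Xerces dom3 package. It relies on the DOM rule that nodes created with a Level 1 method such as createElement report a null localName. DomLevelCheck and its methods are hypothetical names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper for illustration; not part of the DOMWriter API.
final class DomLevelCheck {

    /** An element created by a DOM Level 1 method has no localName. */
    static boolean isLevel1(Element element) {
        return element.getLocalName() == null;
    }

    /** Returns true if any element differs in level from the document element. */
    static boolean hasMixedLevels(Document doc) {
        Element root = doc.getDocumentElement();
        return root != null && containsOtherLevel(root, isLevel1(root));
    }

    private static boolean containsOtherLevel(Node node, boolean rootIsLevel1) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && isLevel1((Element) node) != rootIsLevel1) {
            return true;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (containsOtherLevel(child, rootIsLevel1)) {
                return true;
            }
        }
        return false;
    }

    /** Creates an empty document, wrapping the checked configuration exception. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element root = doc.createElementNS("urn:example", "root"); // Level 2 element
        doc.appendChild(root);
        root.appendChild(doc.createElement("legacy"));             // Level 1 element
        System.out.println(hasMixedLevels(doc));
    }
}
```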

DOMWriters have a number of named features that can be queried or set. The name of DOMWriter features must be valid XML names. Implementation specific features (extensions) should choose an implementation dependent prefix to avoid name collisions.

Here is a list of features that must be recognized by all implementations.

"normalize-characters"
true
[optional] (default) Perform the W3C Text Normalization of the characters in the document as they are written out. Only the characters being written are (potentially) altered. The DOM document itself is unchanged.
false
[required] do not perform character normalization.
"split-cdata-sections"
true
[required] (default) Split CDATA sections containing the CDATA section termination marker ']]>' or characters that can not be represented in the output encoding, and output the characters using numeric character references. If a CDATA section is split a warning is issued.
false
[required] Signal an error if a CDATASection contains an unrepresentable character.
"validation"
true
[optional] Use the abstract schema to validate the document as it is being serialized. If validation errors are found the error handler is notified about the error. Setting this state will also set the feature use-abstract-schema to true.
false
[required] (default) Don't validate the document as it is being serialized.
"expand-entity-references"
true
[optional] Expand EntityReference nodes when serializing.
false
[required] (default) Serialize all EntityReference nodes as XML entity references.
"whitespace-in-element-content"
true
[required] (default) Output all white spaces in the document.
false
[optional] Only output white space that is not within element content. The implementation is expected to use the isWhitespaceInElementContent flag on Text nodes to determine if a text node should be written out or not.
"discard-default-content"
true
[required] (default) Use whatever information is available to the implementation (e.g. XML schema, DTD, the specified flag on Attr nodes, and so on) to decide which attributes and content should be serialized. Note that the specified flag on Attr nodes is not always reliable in itself. It is only reliable when it is set to false, since the only case where it can be set to false is if the attribute was created by a Level 1 implementation.
false
[required] Output all attributes and all content.
"format-canonical"
true
[optional] This formatting writes the document according to the rules of canonical XML. Setting this feature to true will set the feature "format-pretty-print" to false.
false
[required] (default) Don't canonicalize the output.
"format-pretty-print"
true
[optional] Format the output by adding whitespace to produce a pretty-printed, indented, human-readable form. The exact form of the transformations is not specified by this specification. Setting this feature to true will set the feature "format-canonical" to false.
false
[required] (default) Don't pretty-print the result.
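The defaults and the couplings between features listed above can be summarized in a small sketch. The class is hypothetical and only models feature state; it assumes that use-abstract-schema reads as false until something sets it, which the list above does not state.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the feature table; not part of the DOMWriter API.
final class WriterFeatureDefaults {
    private final Map<String, Boolean> features = new LinkedHashMap<>();

    WriterFeatureDefaults() {
        // Defaults as given in the feature list.
        features.put("normalize-characters", true);
        features.put("split-cdata-sections", true);
        features.put("validation", false);
        features.put("expand-entity-references", false);
        features.put("whitespace-in-element-content", true);
        features.put("discard-default-content", true);
        features.put("format-canonical", false);
        features.put("format-pretty-print", false);
    }

    void set(String name, boolean state) {
        features.put(name, state);
        if (!state) {
            return; // only setting a feature to true triggers a coupled change
        }
        switch (name) {
            case "format-canonical":
                features.put("format-pretty-print", false);
                break;
            case "format-pretty-print":
                features.put("format-canonical", false);
                break;
            case "validation":
                features.put("use-abstract-schema", true);
                break;
            default:
                break;
        }
    }

    /** Unknown or unset names read as false (an assumption of this sketch). */
    boolean get(String name) {
        return Boolean.TRUE.equals(features.get(name));
    }

    public static void main(String[] args) {
        WriterFeatureDefaults f = new WriterFeatureDefaults();
        f.set("format-pretty-print", true);
        f.set("format-canonical", true);
        System.out.println(f.get("format-pretty-print"));
    }
}
```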

See also the Document Object Model (DOM) Level 3 Abstract Schemas and Load and Save Specification.


Method Summary
 boolean canSetFeature(java.lang.String name, boolean state)
          Query whether setting a feature to a specific value is supported.
 java.lang.String getEncoding()
          The character encoding in which the output will be written.
 DOMErrorHandler getErrorHandler()
          The error handler that will receive error notifications during serialization.
 boolean getFeature(java.lang.String name)
          Look up the value of a feature.
 java.lang.String getLastEncoding()
          The actual character encoding that was last used by this formatter.
 java.lang.String getNewLine()
          The end-of-line sequence of characters to be used in the XML being written out.
 void setEncoding(java.lang.String encoding)
          The character encoding in which the output will be written.
 void setErrorHandler(DOMErrorHandler errorHandler)
          The error handler that will receive error notifications during serialization.
 void setFeature(java.lang.String name, boolean state)
          Set the state of a feature.
 void setNewLine(java.lang.String newLine)
          The end-of-line sequence of characters to be used in the XML being written out.
 boolean writeNode(java.io.OutputStream destination, org.w3c.dom.Node wnode)
          Write out the specified node as described above in the description of DOMWriter.
 java.lang.String writeToString(org.w3c.dom.Node wnode)
          Serialize the specified node as described above in the description of DOMWriter.
 

Method Detail

setFeature

public void setFeature(java.lang.String name,
                       boolean state)
                throws org.w3c.dom.DOMException
Set the state of a feature.
The feature name has the same form as a DOM hasFeature string.
It is possible for a DOMWriter to recognize a feature name but to be unable to set its value.
Parameters:
name - The feature name.
state - The requested state of the feature (true or false).
Throws:
org.w3c.dom.DOMException - NOT_SUPPORTED_ERR: Raised when the DOMWriter recognizes the feature name but cannot set the requested value.
NOT_FOUND_ERR: Raised when the DOMWriter does not recognize the feature name.

canSetFeature

public boolean canSetFeature(java.lang.String name,
                             boolean state)
Query whether setting a feature to a specific value is supported.
The feature name has the same form as a DOM hasFeature string.
Parameters:
name - The feature name, which is a DOM has-feature style string.
state - The requested state of the feature (true or false).
Returns:
true if the feature could be successfully set to the specified value, or false if the feature is not recognized or the requested value is not supported. The value of the feature itself is not changed.

getFeature

public boolean getFeature(java.lang.String name)
                   throws org.w3c.dom.DOMException
Look up the value of a feature.
The feature name has the same form as a DOM hasFeature string.
Parameters:
name - The feature name, which is a string with DOM has-feature syntax.
Returns:
The current state of the feature (true or false).
Throws:
org.w3c.dom.DOMException - NOT_FOUND_ERR: Raised when the DOMWriter does not recognize the feature name.
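The contract shared by setFeature, canSetFeature and getFeature can be sketched with an in-memory registry. FeatureRegistrySketch and its register method are hypothetical. Only org.w3c.dom.DOMException and its error codes are real API, taken here from the JDK.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import org.w3c.dom.DOMException;

// Hypothetical registry illustrating the feature contract; not the Xerces implementation.
final class FeatureRegistrySketch {
    private final Map<String, Boolean> values = new HashMap<>();
    private final Set<String> fixed = new HashSet<>(); // recognized, but value cannot change

    void register(String name, boolean initial, boolean settable) {
        values.put(name, initial);
        if (!settable) {
            fixed.add(name);
        }
    }

    /** False for unrecognized names and unsupported values; never changes state. */
    boolean canSetFeature(String name, boolean state) {
        if (!values.containsKey(name)) {
            return false;
        }
        return !fixed.contains(name) || values.get(name) == state;
    }

    void setFeature(String name, boolean state) {
        if (!values.containsKey(name)) {
            throw new DOMException(DOMException.NOT_FOUND_ERR, "Unknown feature: " + name);
        }
        if (!canSetFeature(name, state)) {
            throw new DOMException(DOMException.NOT_SUPPORTED_ERR,
                    "Cannot set " + name + " to " + state);
        }
        values.put(name, state);
    }

    boolean getFeature(String name) {
        Boolean value = values.get(name);
        if (value == null) {
            throw new DOMException(DOMException.NOT_FOUND_ERR, "Unknown feature: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        FeatureRegistrySketch registry = new FeatureRegistrySketch();
        registry.register("format-pretty-print", false, true);
        registry.register("validation", false, false); // optional value true not supported
        System.out.println(registry.canSetFeature("validation", true));
    }
}
```

Calling canSetFeature before setFeature lets a caller avoid the NOT_SUPPORTED_ERR path for optional feature values.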

getEncoding

public java.lang.String getEncoding()
The character encoding in which the output will be written.
The encoding to use when writing is determined as follows:
If the encoding attribute has been set, that value will be used.
If the encoding attribute is null or empty, but the item to be written includes an encoding declaration, that value will be used.
If neither of the above provides an encoding name, a default encoding of "UTF-8" will be used.
The default value is null.
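The fallback order for choosing the encoding can be sketched as a single function. The names are hypothetical; declaredEncoding stands for the encoding declaration of the item being written, however the implementation obtains it.

```java
// Hypothetical helper for illustration; not part of the DOMWriter API.
final class EncodingResolver {

    /** Picks the encoding attribute, then the item's declaration, then "UTF-8". */
    static String resolve(String encodingAttribute, String declaredEncoding) {
        if (encodingAttribute != null && !encodingAttribute.isEmpty()) {
            return encodingAttribute;
        }
        if (declaredEncoding != null && !declaredEncoding.isEmpty()) {
            return declaredEncoding;
        }
        return "UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "ISO-8859-1"));
        System.out.println(resolve("", null));
    }
}
```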

setEncoding

public void setEncoding(java.lang.String encoding)
The character encoding in which the output will be written.
The encoding to use when writing is determined as follows:
If the encoding attribute has been set, that value will be used.
If the encoding attribute is null or empty, but the item to be written includes an encoding declaration, that value will be used.
If neither of the above provides an encoding name, a default encoding of "UTF-8" will be used.
The default value is null.

getLastEncoding

public java.lang.String getLastEncoding()
The actual character encoding that was last used by this formatter. This convenience method allows the encoding that was used when serializing a document to be directly obtained.

getNewLine

public java.lang.String getNewLine()
The end-of-line sequence of characters to be used in the XML being written out. The only permitted values are these:
null
Use a default end-of-line sequence. DOM implementations should choose the default to match the usual convention for text files in the environment being used. Implementations must choose a default sequence that matches one of those allowed by section 2.11, "End-of-Line Handling", of the XML 1.0 specification.
CR
The carriage-return character (#xD).
CR-LF
The carriage-return and line-feed characters (#xD #xA).
LF
The line-feed character (#xA).

The default value for this attribute is null.
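A sketch of how the permitted values might be validated and resolved. The names are hypothetical, and using System.lineSeparator() for the null case is an assumption about what "the usual convention for text files in the environment" means on the Java platform.

```java
// Hypothetical helper for illustration; not part of the DOMWriter API.
final class NewLineResolver {

    /** Only null, CR, CR-LF and LF are permitted. */
    static boolean isPermitted(String newLine) {
        return newLine == null
                || newLine.equals("\r")
                || newLine.equals("\r\n")
                || newLine.equals("\n");
    }

    /** Returns the sequence to write; null selects the platform default (assumption). */
    static String resolve(String newLine) {
        if (!isPermitted(newLine)) {
            throw new IllegalArgumentException("Unsupported end-of-line sequence");
        }
        return newLine == null ? System.lineSeparator() : newLine;
    }

    public static void main(String[] args) {
        System.out.println(isPermitted("\n\r"));
        System.out.println(resolve("\r\n").length());
    }
}
```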

setNewLine

public void setNewLine(java.lang.String newLine)
The end-of-line sequence of characters to be used in the XML being written out. The only permitted values are these:
null
Use a default end-of-line sequence. DOM implementations should choose the default to match the usual convention for text files in the environment being used. Implementations must choose a default sequence that matches one of those allowed by section 2.11, "End-of-Line Handling", of the XML 1.0 specification.
CR
The carriage-return character (#xD).
CR-LF
The carriage-return and line-feed characters (#xD #xA).
LF
The line-feed character (#xA).

The default value for this attribute is null.

getErrorHandler

public DOMErrorHandler getErrorHandler()
The error handler that will receive error notifications during serialization. The node where the error occurred is passed to this error handler. Modifying nodes from within an error callback should be avoided, since this results in undefined, implementation-dependent behavior.

setErrorHandler

public void setErrorHandler(DOMErrorHandler errorHandler)
The error handler that will receive error notifications during serialization. The node where the error occurred is passed to this error handler. Modifying nodes from within an error callback should be avoided, since this results in undefined, implementation-dependent behavior.

writeNode

public boolean writeNode(java.io.OutputStream destination,
                         org.w3c.dom.Node wnode)
                  throws java.lang.Exception
Write out the specified node as described above in the description of DOMWriter. Writing a Document or Entity node produces a serialized form that is well formed XML. Writing other node types produces a fragment of text in a form that is not fully defined by this document, but that should be useful to a human for debugging or diagnostic purposes.
Parameters:
destination - The destination for the data to be written.
wnode - The Document or Entity node to be written. For other node types, something sensible should be written, but the exact serialized form is not specified.
Returns:
true if the node was successfully serialized, or false if a failure occurred and the failure wasn't canceled by the error handler.
Throws:
DOMSystemException - This exception will be raised in response to any sort of IO or system error that occurs while writing to the destination. It may wrap an underlying system exception.

writeToString

public java.lang.String writeToString(org.w3c.dom.Node wnode)
                               throws org.w3c.dom.DOMException
Serialize the specified node as described above in the description of DOMWriter. The result of serializing the node is returned as a string. Writing a Document or Entity node produces a serialized form that is well formed XML. Writing other node types produces a fragment of text in a form that is not fully defined by this document, but that should be useful to a human for debugging or diagnostic purposes.
Parameters:
wnode - The node to be written.
Returns:
The serialized data, or null if a failure occurred and the failure wasn't canceled by the error handler.
Throws:
org.w3c.dom.DOMException - DOMSTRING_SIZE_ERR: The resulting string is too long to fit in a DOMString.
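For readers without the Xerces dom3 package, the sketch below uses org.w3c.dom.ls.LSSerializer from the JDK as a stand-in. That interface is what DOMWriter became in the final DOM Level 3 Load and Save Recommendation, so its writeToString method is closely related. Its configuration goes through a DOMConfiguration rather than setFeature, though, and its behavior may differ in detail from the draft described here. The class and helper names are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Illustration using the JDK's LSSerializer, not the draft DOMWriter interface.
final class WriteToStringSketch {

    /** Serializes a whole document to a string with default serializer settings. */
    static String serialize(Document doc) {
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        return serializer.writeToString(doc);
    }

    /** Builds a one-element sample document, wrapping the checked exception. */
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElementNS(null, "root");
            root.appendChild(doc.createTextNode("hi"));
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(sampleDocument()));
    }
}
```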


Copyright © 1999-2002 Apache XML Project. All Rights Reserved.