DOMWriter provides an API for serializing (writing) a DOM document out in an XML document. The XML data is written to an output stream, the type of which depends on the specific language bindings in use. During serialization of XML data, namespace fixup is done when possible.
DOMWriter accepts any node type for serialization. For nodes of type Document or Entity, well-formed XML will be created if possible. The serialized output for these node types is either a Document or an External Entity, respectively, and is acceptable input for an XML parser. For all other types of nodes the serialized form is not specified, but should be something useful to a human for debugging or diagnostic purposes. Note: rigorously designing an external (source) form for stand-alone node types that don't already have one defined seems a bit much to take on here.
Within a Document or Entity being serialized, Nodes are processed as follows:
Documents are written including an XML declaration and a DTD subset, if one exists in the DOM. Writing a document node serializes the entire document.
Entity nodes, when written directly by writeNode defined in the DOMWriter interface, output the entity expansion, but no namespace fixup is done. The resulting output will be valid as an external entity.
EntityReference nodes are serialized as an entity reference of the form "&entityName;" in the output. Child nodes (the expansion) of the entity reference are ignored.
CDATA sections containing content characters that cannot be represented in the specified output encoding are handled according to the "split-cdata-sections" feature. If the feature is true, CDATA sections are split, and the unrepresentable characters are serialized as numeric character references in ordinary content. The exact position and number of splits is not specified. If the feature is false, unrepresentable characters in a CDATA section are reported as errors. The error is not recoverable: there is no mechanism for supplying alternative characters and continuing with the serialization.
All other node types (Element, Text, etc.) are serialized to their corresponding XML source form.
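The "split-cdata-sections" behaviour can be sketched in a few lines. This is only an illustration under stated assumptions: the class and method names (CdataSplitSketch, writeCdata) are hypothetical, the split positions shown are one possible choice (the text leaves them unspecified), and content containing "]]>" is not handled.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the DOMWriter interface.
final class CdataSplitSketch {

    // Writes CDATA content for the given output encoding. When split is true,
    // each unrepresentable character closes the current section and is written
    // as a numeric character reference in ordinary content. When split is
    // false, the first unrepresentable character is reported as an error.
    static String writeCdata(String content, Charset encoding, boolean split) {
        if (content.isEmpty()) {
            return "<![CDATA[]]>";
        }
        CharsetEncoder encoder = encoding.newEncoder();
        StringBuilder out = new StringBuilder();
        boolean open = false;
        for (int i = 0; i < content.length(); ) {
            int cp = content.codePointAt(i);
            int len = Character.charCount(cp);
            String ch = content.substring(i, i + len);
            if (encoder.canEncode(ch)) {
                if (!open) {
                    out.append("<![CDATA[");
                    open = true;
                }
                out.append(ch);
            } else {
                if (!split) {
                    throw new IllegalStateException(
                        "Unrepresentable character U+" + Integer.toHexString(cp).toUpperCase());
                }
                if (open) {
                    out.append("]]>");
                    open = false;
                }
                out.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
            }
            i += len;
        }
        if (open) {
            out.append("]]>");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+263A cannot be encoded in US-ASCII, so the section is split around it.
        System.out.println(writeCdata("a\u263Ab", StandardCharsets.US_ASCII, true));
    }
}
```

With split set to false the same input raises an exception, which mirrors the non-recoverable error described above.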
Within the character data of a document (outside of markup), any characters that cannot be represented directly are replaced with character references. Occurrences of '<' and '&' are replaced by the predefined entities &lt; and &amp;. The other predefined entities (&gt;, &apos;, etc.) are not used; these characters can be included directly. Any character that cannot be represented directly in the output character encoding is serialized as a numeric character reference.
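The character data rule can be illustrated with a short sketch. The names TextEscapeSketch and escapeText are hypothetical, and the decimal form of the numeric character reference is an assumption (the text does not say whether decimal or hexadecimal is used).

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the DOMWriter interface.
final class TextEscapeSketch {

    // Escapes character data: '<' and '&' become predefined entities, other
    // characters pass through unless the output encoding cannot represent
    // them, in which case a numeric character reference is written.
    static String escapeText(String text, Charset encoding) {
        CharsetEncoder encoder = encoding.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (encoder.canEncode(text.substring(i, i + len))) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // '>' is left as is; U+00E9 is not representable in US-ASCII.
        System.out.println(escapeText("a<b & c>d \u00e9", StandardCharsets.US_ASCII));
    }
}
```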
Attributes not containing quotes are serialized in quotes. Attributes containing quotes but no apostrophes are serialized in apostrophes (single quotes). Attributes containing both forms of quotes are serialized in quotes, with quotes within the value represented by the predefined entity &quot;. Any character that cannot be represented directly in the output character encoding is serialized as a numeric character reference.
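A minimal sketch of the delimiter choice for attribute values follows. The names AttrQuoteSketch and quoteAttrValue are hypothetical, and the sketch covers only the quoting rule: it does not escape '<' or '&' and does not handle unrepresentable characters.

```java
// Hypothetical helper, not part of the DOMWriter interface.
final class AttrQuoteSketch {

    // Chooses the delimiter for an attribute value following the three cases
    // in the text: no quotes, quotes only, or both quotes and apostrophes.
    static String quoteAttrValue(String value) {
        boolean hasQuote = value.indexOf('"') >= 0;
        boolean hasApostrophe = value.indexOf('\'') >= 0;
        if (!hasQuote) {
            return "\"" + value + "\"";
        }
        if (!hasApostrophe) {
            return "'" + value + "'";
        }
        return "\"" + value.replace("\"", "&quot;") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quoteAttrValue("plain"));
        System.out.println(quoteAttrValue("say \"hi\""));
        System.out.println(quoteAttrValue("it's \"x\""));
    }
}
```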
Within markup, but outside of attributes, any occurrence of a character that cannot be represented in the output character encoding is reported as an error. An example would be serializing the element <LaCañada/> with the encoding="us-ascii".
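The check for markup such as element names can be approximated with the standard CharsetEncoder. The class and method names below are hypothetical; how a real DOMWriter reports the error is not shown here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the DOMWriter interface.
final class MarkupEncodingCheck {

    // Returns true if every character of the name can be written directly in
    // the given encoding. Names cannot fall back to character references, so
    // a false result corresponds to the error case described in the text.
    static boolean nameIsRepresentable(String name, Charset encoding) {
        return encoding.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        String name = "LaCa\u00f1ada";
        System.out.println(nameIsRepresentable(name, StandardCharsets.US_ASCII));
        System.out.println(nameIsRepresentable(name, StandardCharsets.UTF_8));
    }
}
```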
When requested by setting the normalize-characters feature on DOMWriter, all data to be serialized, both markup and character data, is W3C Text normalized. The W3C Text normalization process affects only the data as it is being written; it does not alter the DOM's view of the document after serialization has completed.
Namespaces are fixed up during serialization: the serialization process will verify that namespace declarations, namespace prefixes and the namespace URIs associated with Elements and Attributes are consistent. If inconsistencies are found, the serialized form of the document will be altered to remove them. The algorithm used for doing the namespace fixup while serializing a document is a combination of the algorithms used for lookupNamespaceURI and lookupNamespacePrefix. Editorial note: the previous paragraph is to be defined more closely here.
Any changes made affect only the namespace prefixes and declarations appearing in the serialized data. The DOM's view of the document is not altered by the serialization operation, and does not reflect any changes made to namespace declarations or prefixes in the serialized output.
While serializing a document, the serializer will write out non-specified values (such as attributes whose specified flag is false) if the output-default-values feature is set to true. If the output-default-values flag is set to false and the use-abstract-schema feature is set to true, the abstract schema will be used to determine whether a value is specified or not. If use-abstract-schema is not set, the specified flag on attribute nodes is used to determine whether attribute values should be written out.
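The decision described in the paragraph above can be written as a small predicate. This is a sketch of one reading of that paragraph; the class, method and parameter names are hypothetical, and schemaSaysSpecified stands in for whatever answer an abstract schema lookup would give.

```java
// Hypothetical helper, not part of the DOMWriter interface.
final class DefaultValueSketch {

    // Decides whether an attribute value is written out.
    //   outputDefaultValues: state of the output-default-values feature
    //   useAbstractSchema:   state of the use-abstract-schema feature
    //   specifiedFlag:       the specified flag on the attribute node
    //   schemaSaysSpecified: result of consulting the abstract schema (assumed)
    static boolean shouldWrite(boolean outputDefaultValues, boolean useAbstractSchema,
                               boolean specifiedFlag, boolean schemaSaysSpecified) {
        if (outputDefaultValues) {
            return true; // non-specified values are written as well
        }
        boolean specified = useAbstractSchema ? schemaSaysSpecified : specifiedFlag;
        return specified;
    }

    public static void main(String[] args) {
        // Defaulted attribute, defaults suppressed, no schema in use: not written.
        System.out.println(shouldWrite(false, false, false, false));
    }
}
```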
Ref to Core spec (1.1.9, XML namespaces, 5th paragraph) entity ref description about warning about unbound entity refs. Entity refs are always serialized as &foo;, also mention this in the load part of this spec.
When serializing a document the DOMWriter checks to see if the document element in the document is a DOM Level 1 element or a DOM Level 2 (or higher) element (this check is done by looking at the localName of the root element). If the root element is a DOM Level 1 element then the DOMWriter will issue an error if a DOM Level 2 (or higher) element is found while serializing. Likewise if the document element is a DOM Level 2 (or higher) element and the DOMWriter sees a DOM Level 1 element an error is issued. Mixing DOM Level 1 elements with DOM Level 2 (or higher) is not supported.
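The localName test mentioned above can be tried with the DOM implementation bundled in the JDK, where elements created with createElement report a null localName and elements created with createElementNS do not. The class and method names are hypothetical, and the sketch returns a boolean rather than issuing an error through a handler.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not part of the DOMWriter interface.
final class DomLevelCheck {

    // A DOM Level 1 element has no local name.
    static boolean isLevel1(Element element) {
        return element.getLocalName() == null;
    }

    // True if every element below root has the same DOM level as root.
    static boolean isConsistent(Element root) {
        return matches(root, isLevel1(root));
    }

    private static boolean matches(Element element, boolean level1) {
        if (isLevel1(element) != level1) {
            return false;
        }
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && !matches((Element) c, level1)) {
                return false;
            }
        }
        return true;
    }

    // Builds a small tree; when mix is true, a Level 1 child is added to a
    // Level 2 root, which is the unsupported combination.
    static boolean demo(boolean mix) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElementNS("urn:example", "root");
        root.appendChild(doc.createElementNS("urn:example", "child"));
        if (mix) {
            root.appendChild(doc.createElement("legacy"));
        }
        return isConsistent(root);
    }

    public static void main(String[] args) {
        System.out.println(demo(false));
        System.out.println(demo(true));
    }
}
```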
DOMWriters have a number of named features that can be queried or set. The names of DOMWriter features must be valid XML names. Implementation-specific features (extensions) should choose an implementation-dependent prefix to avoid name collisions.
Here is a list of features that must be recognized by all implementations.
"normalize-characters"
    true
    false
"split-cdata-sections"
    true
    false: [...] CDATASection contains an unrepresentable character.
"validation"
    true: [...] use-abstract-schema to true.
    false
"expand-entity-references"
    true: [...] EntityReference nodes when serializing.
    false: [...] EntityReference nodes as XML entity references.
"whitespace-in-element-content"
    true
    false: [...] isWhitespaceInElementContent flag on Text nodes to determine if a text node should be written out or not.
"discard-default-content"
    true: [...] specified flag on Attr nodes, and so on) to decide what attributes and content should be serialized or not. Note that the specified flag on Attr nodes in itself is not always reliable; it is only reliable when it is set to false, since the only case where it can be set to false is if the attribute was created by a Level 1 implementation.
    false
"format-canonical"
    true
    false
"format-pretty-print"
    true
    false
See also the Document Object Model (DOM) Level 3 Abstract Schemas and Load and Save Specification.
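The relationship between setFeature, canSetFeature and getFeature, as documented in the Method Detail, can be sketched with a small feature table. Everything here is illustrative: FeatureTableSketch is a hypothetical class, and the initial states and the choice of "format-pretty-print" as a recognized-but-fixed feature are assumptions, not statements about any implementation. org.w3c.dom.DOMException and its error codes are the standard JDK types.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import org.w3c.dom.DOMException;

// Hypothetical feature table, not part of the DOMWriter interface.
final class FeatureTableSketch {
    private final Map<String, Boolean> current = new HashMap<>();
    private final Set<String> fixed = new HashSet<>(); // recognized, value cannot change

    FeatureTableSketch() {
        current.put("split-cdata-sections", true);  // assumed initial state
        current.put("format-pretty-print", false);  // assumed initial state
        fixed.add("format-pretty-print");           // assumed to be unsupported when true
    }

    // Never throws and never changes state.
    boolean canSetFeature(String name, boolean state) {
        Boolean now = current.get(name);
        if (now == null) {
            return false; // not recognized
        }
        return !fixed.contains(name) || now == state;
    }

    void setFeature(String name, boolean state) {
        if (!current.containsKey(name)) {
            throw new DOMException(DOMException.NOT_FOUND_ERR, "Unknown feature: " + name);
        }
        if (!canSetFeature(name, state)) {
            throw new DOMException(DOMException.NOT_SUPPORTED_ERR,
                "Cannot set " + name + " to " + state);
        }
        current.put(name, state);
    }

    boolean getFeature(String name) {
        Boolean now = current.get(name);
        if (now == null) {
            throw new DOMException(DOMException.NOT_FOUND_ERR, "Unknown feature: " + name);
        }
        return now;
    }

    public static void main(String[] args) {
        FeatureTableSketch features = new FeatureTableSketch();
        if (features.canSetFeature("split-cdata-sections", false)) {
            features.setFeature("split-cdata-sections", false);
        }
        System.out.println(features.getFeature("split-cdata-sections"));
    }
}
```

Calling canSetFeature before setFeature, as in main, avoids relying on exceptions for control flow.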
Method Summary
boolean | canSetFeature(java.lang.String name, boolean state) | Query whether setting a feature to a specific value is supported.
java.lang.String | getEncoding() | The character encoding in which the output will be written.
DOMErrorHandler | getErrorHandler() | The error handler that will receive error notifications during serialization.
boolean | getFeature(java.lang.String name) | Look up the value of a feature.
java.lang.String | getLastEncoding() | The actual character encoding that was last used by this formatter.
java.lang.String | getNewLine() | The end-of-line sequence of characters to be used in the XML being written out.
void | setEncoding(java.lang.String encoding) | The character encoding in which the output will be written.
void | setErrorHandler(DOMErrorHandler errorHandler) | The error handler that will receive error notifications during serialization.
void | setFeature(java.lang.String name, boolean state) | Set the state of a feature.
void | setNewLine(java.lang.String newLine) | The end-of-line sequence of characters to be used in the XML being written out.
boolean | writeNode(java.io.OutputStream destination, org.w3c.dom.Node wnode) | Write out the specified node as described above in the description of DOMWriter.
java.lang.String | writeToString(org.w3c.dom.Node wnode) | Serialize the specified node as described above in the description of DOMWriter.
Method Detail
public void setFeature(java.lang.String name, boolean state) throws org.w3c.dom.DOMException
Set the state of a feature. It is possible for a DOMWriter to recognize a feature name but to be unable to set its value.
Parameters:
name - The feature name.
state - The requested state of the feature (true or false).
Throws:
org.w3c.dom.DOMException - Raise a NOT_SUPPORTED_ERR exception when the DOMWriter recognizes the feature name but cannot set the requested value. Raise a NOT_FOUND_ERR exception when the DOMWriter does not recognize the feature name.
public boolean canSetFeature(java.lang.String name, boolean state)
Query whether setting a feature to a specific value is supported.
Parameters:
name - The feature name, which is a DOM has-feature style string.
state - The requested state of the feature (true or false).
Returns:
true if the feature could be successfully set to the specified value, or false if the feature is not recognized or the requested value is not supported. The value of the feature itself is not changed.
public boolean getFeature(java.lang.String name) throws org.w3c.dom.DOMException
Look up the value of a feature.
Parameters:
name - The feature name, which is a string with DOM has-feature syntax.
Returns:
The current state of the feature (true or false).
Throws:
org.w3c.dom.DOMException - Raise a NOT_FOUND_ERR exception when the DOMWriter does not recognize the feature name.
public java.lang.String getEncoding()
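The character encoding in which the output will be written. If an encoding has been set, that value will be used. If the encoding is null or empty, but the item to be written includes an encoding declaration, that value will be used. If neither of the above provides an encoding name, a default encoding of "UTF-8" will be used. The default value is null.
public void setEncoding(java.lang.String encoding)
The character encoding in which the output will be written. If an encoding has been set, that value will be used. If the encoding is null or empty, but the item to be written includes an encoding declaration, that value will be used. If neither of the above provides an encoding name, a default encoding of "UTF-8" will be used. The default value is null.
public java.lang.String getLastEncoding()
The actual character encoding that was last used by this formatter.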
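The encoding selection described for getEncoding and setEncoding amounts to a three-step fallback. The sketch below assumes the caller already knows the configured encoding and the encoding declared by the item being written; the class, method and parameter names are hypothetical.

```java
// Hypothetical helper, not part of the DOMWriter interface.
final class EncodingResolutionSketch {

    // Picks the output encoding: the configured value if present, otherwise
    // the declared encoding of the item being written, otherwise "UTF-8".
    static String resolveEncoding(String configured, String declared) {
        if (configured != null && !configured.isEmpty()) {
            return configured;
        }
        if (declared != null && !declared.isEmpty()) {
            return declared;
        }
        return "UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(resolveEncoding(null, "ISO-8859-1"));
        System.out.println(resolveEncoding("", null));
    }
}
```

The value returned by such a function is what getLastEncoding would be expected to report after a write.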
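public java.lang.String getNewLine()
The end-of-line sequence of characters to be used in the XML being written out. The value may be null. The default value is null.
public void setNewLine(java.lang.String newLine)
The end-of-line sequence of characters to be used in the XML being written out. The value may be null. The default value is null.
public DOMErrorHandler getErrorHandler()
The error handler that will receive error notifications during serialization.
public void setErrorHandler(DOMErrorHandler errorHandler)
The error handler that will receive error notifications during serialization.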
public boolean writeNode(java.io.OutputStream destination, org.w3c.dom.Node wnode) throws java.lang.Exception
Write out the specified node as described above in the description of DOMWriter. Writing a Document or Entity node produces a serialized form that is well-formed XML. Writing other node types produces a fragment of text in a form that is not fully defined by this document, but that should be useful to a human for debugging or diagnostic purposes.
Parameters:
destination - The destination for the data to be written.
wnode - The Document or Entity node to be written. For other node types, something sensible should be written, but the exact serialized form is not specified.
Returns:
true if node was successfully serialized and false in case a failure occurred and the failure wasn't canceled by the error handler.
Throws:
DOMSystemException - This exception will be raised in response to any sort of IO or system error that occurs while writing to the destination. It may wrap an underlying system exception.
public java.lang.String writeToString(org.w3c.dom.Node wnode) throws org.w3c.dom.DOMException
Serialize the specified node as described above in the description of DOMWriter. The result of serializing the node is returned as a string. Writing a Document or Entity node produces a serialized form that is well-formed XML. Writing other node types produces a fragment of text in a form that is not fully defined by this document, but that should be useful to a human for debugging or diagnostic purposes.
Parameters:
wnode - The node to be written.
Returns:
The serialized node as a string, or null in case a failure occurred and the failure wasn't canceled by the error handler.
Throws:
org.w3c.dom.DOMException - DOMSTRING_SIZE_ERR: The resulting string is too long to fit in a DOMString.
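DOMWriter itself is a draft interface and is not shipped with the JDK, so the sketch below substitutes org.w3c.dom.ls.LSSerializer, the interface that replaced DOMWriter in the final DOM Level 3 Load and Save Recommendation. It is meant only to show what a writeToString call looks like in practice; the class name and sample document are hypothetical, and details of the output (such as the XML declaration) may differ from what a DOMWriter would produce.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical example class; uses LSSerializer in place of DOMWriter.
final class WriteToStringSketch {

    // Builds a one-element document and serializes it to a string.
    static String serializeSample() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        // Namespace-aware (DOM Level 2) creation methods are used throughout,
        // so Level 1 and Level 2 nodes are not mixed.
        Element root = doc.createElementNS(null, "note");
        root.setAttributeNS(null, "lang", "en");
        root.appendChild(doc.createTextNode("a < b & c"));
        doc.appendChild(root);

        DOMImplementationLS ls =
            (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) {
        System.out.println(serializeSample());
    }
}
```

The '<' and '&' in the text node come back as predefined entities in the returned string, in line with the character data rules in the class description.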