Answers
 
What happened to xerces.jar?
 

This parser is very often used together with other XML technologies, such as XSLT processors, that also rely on standard APIs like DOM and SAX. To take advantage of this, xerces.jar was split into two JAR files:

  • xmlParserAPIs.jar contains the DOM Level 2, SAX 2.0 and JAXP 1.1 APIs;
  • xercesImpl.jar contains the implementation of these APIs as well as the XNI API.

For backwards compatibility, we have retained the ability to generate xerces.jar. For instructions, see the installation documentation.


How do I turn on DTD validation?
 

You can turn validation on and off via methods available on the SAX2 XMLReader interface. While only the SAXParser implements the XMLReader interface, the methods required for turning on validation are available to both parser classes, DOM and SAX.
The code snippet below shows how to turn validation on -- assume that parser is an instance of either org.apache.xerces.parsers.SAXParser or org.apache.xerces.parsers.DOMParser.

parser.setFeature("http://xml.org/sax/features/validation", true);
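For a fuller picture, the following is a minimal sketch of the same feature in use. It obtains the XMLReader through JAXP instead of instantiating org.apache.xerces.parsers.SAXParser directly; that substitution is an assumption made so the example runs on a stock JDK. The class name ValidationDemo and the helper validate are hypothetical names introduced for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ValidationDemo {
    // Standard SAX2 feature identifier quoted in the answer above.
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    /** Parses the given XML with DTD validation on and returns the validation error messages. */
    public static List<String> validate(String xml) throws Exception {
        // Assumption: the XMLReader comes from JAXP rather than from a Xerces parser class.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setFeature(VALIDATION, true);

        final List<String> errors = new ArrayList<>();
        // Validity violations are reported through error(); without a handler they pass silently.
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });

        reader.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]>";
        System.out.println(validate(dtd + "<a><b/></a>").size()); // valid document
        System.out.println(validate(dtd + "<a/>").size());        // required child b is missing
    }
}
```

Validity errors are recoverable in SAX, so the sketch collects them in an error handler rather than relying on an exception being thrown.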


Why does getElementById() not always work for documents validated against XML Schemas?
 

According to the XML Schema specification, an instance document might have more than one validation root, and ID/IDREF values must be unique only within the context of a particular validation root. A document may therefore contain multiple identical IDs, in which case the result of getElementById() is unspecified. If, on the other hand, the document root is a validation root of the document, getElementById() should work as expected.


How do I get access to the PSVI?
 

Xerces provides a sample component, PSVIWriter, that intercepts document handler events and collects PSVI information. For more information, read the samples documentation on how to use xni.parser.PSVIParser and xni.parser.PSVIConfiguration.

Note: Xerces only produces light-weight PSVI.

What international encodings are supported by Xerces-J?
 
  • UTF-8
  • UTF-16 Big Endian, UTF-16 Little Endian
  • IBM-1208
  • ISO Latin-1 (ISO-8859-1)
  • ISO Latin-2 (ISO-8859-2) [Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian (in Latin transcription), Serbocroatian, Slovak, Slovenian, Upper and Lower Sorbian]
  • ISO Latin-3 (ISO-8859-3) [Maltese, Esperanto]
  • ISO Latin-4 (ISO-8859-4)
  • ISO Latin Cyrillic (ISO-8859-5)
  • ISO Latin Arabic (ISO-8859-6)
  • ISO Latin Greek (ISO-8859-7)
  • ISO Latin Hebrew (ISO-8859-8)
  • ISO Latin-5 (ISO-8859-9) [Turkish]
  • Extended Unix Code, packed for Japanese (euc-jp, eucjis)
  • Japanese Shift JIS (shift-jis)
  • Chinese (big5)
  • Chinese for PRC (mixed 1/2 byte) (gb2312)
  • Japanese ISO-2022-JP (iso-2022-jp)
  • Extended Unix Code, packed for Korean (euc-kr)
  • Russian Unix, Cyrillic (koi8-r)
  • Windows Thai (cp874)
  • Latin 1 Windows (cp1252) (and all other cp125? encodings recognized by IANA)
  • cp858
  • EBCDIC encodings:
    • EBCDIC US (ebcdic-cp-us)
    • EBCDIC Canada (ebcdic-cp-ca)
    • EBCDIC Netherlands (ebcdic-cp-nl)
    • EBCDIC Denmark (ebcdic-cp-dk)
    • EBCDIC Norway (ebcdic-cp-no)
    • EBCDIC Finland (ebcdic-cp-fi)
    • EBCDIC Sweden (ebcdic-cp-se)
    • EBCDIC Italy (ebcdic-cp-it)
    • EBCDIC Spain, Latin America (ebcdic-cp-es)
    • EBCDIC Great Britain (ebcdic-cp-gb)
    • EBCDIC France (ebcdic-cp-fr)
    • EBCDIC Hebrew (ebcdic-cp-he)
    • EBCDIC Switzerland (ebcdic-cp-ch)
    • EBCDIC Roece (ebcdic-cp-roece)
    • EBCDIC Yugoslavia (ebcdic-cp-yu)
    • EBCDIC Iceland (ebcdic-cp-is)
    • EBCDIC Urdu (ebcdic-cp-ar2)
    • Latin 0 EBCDIC
    • EBCDIC Arabic (ebcdic-cp-ar1)
Note: UCS-4 is not yet supported, but it is hoped that support will be available soon.
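
As an illustration of how one of the listed encodings is selected, the sketch below encodes a small document as ISO-8859-1 bytes, names that encoding in the XML declaration, and lets the parser decode it. It assumes a JAXP DocumentBuilder as the entry point instead of a Xerces parser class, so that it runs on a stock JDK; the class name EncodingDemo and the helper roundTrip are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EncodingDemo {
    /** Serializes text into an ISO-8859-1 encoded document and returns what the parser reads back. */
    public static String roundTrip(String text) throws Exception {
        // The encoding declaration tells the parser how to decode the byte stream.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><word>" + text + "</word>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        // Parse from raw bytes so that the parser, not Java, performs the decoding.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // \u00e9 is a single byte (0xE9) in ISO-8859-1.
        System.out.println(roundTrip("caf\u00e9").equals("caf\u00e9"));
    }
}
```

Passing a byte stream rather than a character stream matters here: with a Reader, the characters are already decoded and the encoding declaration is not used to interpret the input.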