JAXP (Java API for XML Processing)

1. What is JAXP ?

Java API for XML Processing (JAXP) is for processing XML data using applications written in the Java programming language.

2. What does JAXP consist of ?

JAXP leverages the parser standards Simple API for XML (SAX) and Document Object Model (DOM), so we can choose either to parse our data as a stream of events or to build an object representation of it.

3. Does JAXP support XSLT ?

JAXP also supports the Extensible Stylesheet Language Transformations (XSLT) standard, giving us control over the presentation of the data and enabling us to convert the data to other XML documents or to other formats, such as HTML.
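As a minimal sketch of the XML-to-HTML conversion described above, the following applies a small stylesheet through javax.xml.transform. The class name XsltToHtmlSketch, the stylesheet, and the <items>/<item> vocabulary are illustrative assumptions, not part of JAXP.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltToHtmlSketch {
    // Hypothetical stylesheet: renders every <item> as an HTML list entry.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/items'><ul><xsl:apply-templates select='item'/></ul></xsl:template>"
      + "<xsl:template match='item'><li><xsl:value-of select='.'/></li></xsl:template>"
      + "</xsl:stylesheet>";

    static String toHtml(String xml) throws TransformerException {
        // The factory compiles the stylesheet into a Transformer.
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(toHtml("<items><item>tea</item><item>milk</item></items>"));
    }
}
```

The same Transformer could write to a file or an output stream by passing a different StreamResult.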

4. How does JAXP resolve namespace conflicts in DTD ?

JAXP provides namespace support, allowing us to work with DTDs that might otherwise have naming conflicts.
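To illustrate how namespaces keep identically named elements apart, here is a small sketch using a namespace-aware DOM parser. The namespace URIs, the prefixes, and the class name NamespaceSketch are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class NamespaceSketch {
    // Two vocabularies both define <title>; the (hypothetical) URIs tell them apart.
    static final String XML =
        "<doc xmlns:bk='urn:example:book' xmlns:job='urn:example:job'>"
      + "<bk:title>JAXP Basics</bk:title><job:title>Editor</job:title></doc>";

    static String titleIn(String namespaceUri) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // namespace support is off by default
        Document doc = factory.newDocumentBuilder()
            .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        // Look the element up by namespace URI plus local name, not by prefix.
        return doc.getElementsByTagNameNS(namespaceUri, "title").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titleIn("urn:example:book")); // prints "JAXP Basics"
        System.out.println(titleIn("urn:example:job"));  // prints "Editor"
    }
}
```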

5. Does JAXP implement StAX ?

Yes, as of version 1.4, JAXP implements the Streaming API for XML (StAX) standard. The StAX APIs defined in javax.xml.stream provide a streaming, event-driven, pull-parsing API for reading and writing XML documents. StAX offers a simpler programming model than SAX and more efficient memory management than DOM.
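The pull model can be sketched as a loop in which our code asks the reader for the next event, instead of the parser calling back into a handler. The class name StaxPullSketch and the sample document are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StaxPullSketch {
    static List<String> elementNames(String xml) throws XMLStreamException {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            // We pull events one at a time; nothing is delivered until we call next().
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } finally {
            reader.close(); // XMLStreamReader is not AutoCloseable
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(elementNames("<order><item/><item/></order>")); // prints [order, item, item]
    }
}
```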

6. Why was JAXP designed ?

JAXP allows us to use any XML-compliant parser from within our application. It does this with what is called a pluggability layer, which lets us plug in an implementation of the SAX or DOM API. The pluggability layer also allows us to plug in an XSL processor, letting us control how our XML data is displayed.

7. What are the names of the packages used in the JAXP API ?

javax.xml.parsers
The JAXP APIs, which provide a common interface for different vendors’ SAX and DOM parsers.

org.w3c.dom
Defines the Document interface (a DOM) as well as interfaces for all the components of a DOM.

org.xml.sax
Defines the basic SAX APIs.

javax.xml.transform
Defines the XSLT APIs that let us transform XML into other forms.

javax.xml.stream
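Defines the StAX APIs for reading and writing XML documents as a stream. (The StAX-specific transformation APIs live in javax.xml.transform.stax.)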

8. What is SAX ?

Simple API for XML (SAX) is the event-driven, serial-access mechanism that does element-by-element processing.

9. What is DOM ?

Document Object Model (DOM) represents an XML document as a tree of objects. We can use the DOM API to manipulate the hierarchy of objects it encapsulates.

10. Where is the DOM API useful ?

The DOM API is ideal for interactive applications because the entire object model is present in memory, where it can be accessed and manipulated by the user.
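Because the whole tree is in memory, we can read and change any part of it after parsing. The following is a minimal sketch; the class name DomEditSketch and the <list>/<item> vocabulary are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class DomEditSketch {
    static Document addItem(String xml, String text) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        // New nodes are created by the owning document, then attached to the tree.
        Element item = doc.createElement("item");
        item.setTextContent(text);
        doc.getDocumentElement().appendChild(item);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = addItem("<list><item>a</item></list>", "b");
        System.out.println(doc.getElementsByTagName("item").getLength()); // prints 2
    }
}
```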

11. Why is DOM CPU- and memory-intensive ?

Constructing the DOM requires reading the entire XML structure and holding the object tree in memory, so it is much more CPU- and memory-intensive than event-based parsing with SAX.

12. Does SAX require the entire document to be kept in memory ?

No. SAX reports the document as a sequence of events and does not build a tree, so the SAX API tends to be preferred for server-side applications and data filters that do not require an in-memory representation of the data.

13. What is XSLT API ?

XSLT APIs defined in javax.xml.transform let us write XML data to a file or convert it into other forms. We can even use it in conjunction with the SAX APIs to convert legacy data to XML.

14. What is SAXParserFactory ?

A SAXParserFactory object creates an instance of the parser determined by the system property, javax.xml.parsers.SAXParserFactory.

15. What is SAXParser ?

SAXParser is an abstract class that defines several kinds of parse() methods. In general, we pass an XML data source and a DefaultHandler object to the parser, which processes the XML and invokes the appropriate methods in the handler object.
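A minimal sketch of that flow: obtain a SAXParser from SAXParserFactory, then hand parse() a data source and a DefaultHandler that overrides only the callback we care about. The class name SaxCountSketch is an assumption for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SaxCountSketch {
    static int countElements(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        int[] count = {0}; // array so the anonymous handler can update it
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                count[0]++; // invoked once per opening tag
            }
        };
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><c><d/></c></a>")); // prints 4
    }
}
```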

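16. What is XMLReader ?

The SAXParser wraps an org.xml.sax.XMLReader (referred to as SAXReader in some JAXP tutorials), which we can obtain with getXMLReader(). It is the XMLReader that carries on the conversation with the SAX event handlers we define.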
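In the standard API the wrapped reader is exposed as org.xml.sax.XMLReader, returned by SAXParser.getXMLReader(). The sketch below registers a handler on it directly; the class name XmlReaderSketch and the sample input are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class XmlReaderSketch {
    static List<String> closedElements(String xml) throws Exception {
        // Ask the SAXParser for the reader it wraps.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        List<String> closed = new ArrayList<>();
        // Handlers are registered individually on the reader.
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void endElement(String uri, String localName, String qName) {
                closed.add(qName);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return closed;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(closedElements("<a><b/><c/></a>")); // prints [b, c, a]
    }
}
```

Working with the reader directly lets us set the content, error, DTD, and entity handlers separately instead of supplying one DefaultHandler.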

17. What is DefaultHandler ?

The DefaultHandler class implements the ContentHandler, ErrorHandler, DTDHandler, and EntityResolver interfaces with empty (do-nothing) methods, so we can override only the ones we are interested in.

18. What is ContentHandler ?

The ContentHandler interface defines the callbacks the parser invokes as it reads the document: startDocument and endDocument at the beginning and end of the document, and startElement and endElement when an opening or closing tag is recognized. The interface also defines characters() and processingInstruction(), which are invoked when the parser encounters the text in an XML element or an inline processing instruction, respectively.
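The order of these callbacks can be seen by logging each one. This is a sketch only; the class name ContentEventsSketch, the log format, and the sample document are assumptions. Note that a parser is allowed to deliver one run of text in several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class ContentEventsSketch {
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { log.add("startDocument"); }
            @Override public void endDocument() { log.add("endDocument"); }
            @Override public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                // Only the slice [start, start + length) belongs to this event.
                log.add("text:" + new String(ch, start, length));
            }
            @Override public void processingInstruction(String target, String data) {
                log.add("pi:" + target);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        events("<note><?render fast?>hi</note>").forEach(System.out::println);
    }
}
```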

19. What is ErrorHandler ?

Methods error(), fatalError(), and warning() are invoked in response to various parsing errors. The default error handler throws an exception for fatal errors and ignores other errors (including validation errors). This is one reason we need to know something about the SAX parser, even if we are using the DOM. Sometimes, the application may be able to recover from a validation error. Other times, it may need to generate an exception. To ensure the correct handling, we will need to supply our own error handler to the parser.
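One way to supply our own error handler is sketched below: it collects warnings and validation errors so the application can decide what to do with them, and rethrows fatal errors. The class name CollectingErrorHandlerSketch, the inline DTD, and the message format are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class CollectingErrorHandlerSketch {
    // Hypothetical document: the DTD allows only <to> inside <note>, but <from> is used.
    static final String INVALID_XML =
        "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>"
      + "<note><from>me</from></note>";

    static List<String> validationProblems(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // DTD validation is off by default
        DocumentBuilder builder = factory.newDocumentBuilder();
        List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            @Override public void error(SAXParseException e) {
                problems.add("error: " + e.getMessage()); // recoverable, e.g. a validation error
            }
            @Override public void fatalError(SAXParseException e) throws SAXException {
                throw e; // not well-formed: stop parsing
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        validationProblems(INVALID_XML).forEach(System.out::println);
    }
}
```

The same handler object can be passed to an XMLReader through setErrorHandler(), which is why the SAX error model matters even when we work with the DOM.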

20. What is DTDHandler ?

Defines methods we will use when processing a DTD to recognize and act on declarations for an unparsed entity.

21. What is EntityResolver ?

The resolveEntity method is invoked when the parser must identify data identified by a URI. In most cases, a URI is simply a URL, which specifies the location of a document, but in some cases the document may be identified by a URN – a public identifier, or name, that is unique in the web space. The public identifier may be specified in addition to the URL. The EntityResolver can then use the public identifier instead of the URL to find the document – for example, to access a local copy of the document if one exists.
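The local-copy case can be sketched with a resolver that recognizes a public identifier and answers with a DTD held in memory, so the URL is never fetched. The public identifier, the example.invalid URL, the DTD content, and the class name LocalDtdResolverSketch are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class LocalDtdResolverSketch {
    static final String PUBLIC_ID = "-//Example//DTD Greeting 1.0//EN"; // hypothetical identifier
    // Stand-in for a local copy of the DTD; it declares the entity the document uses.
    static final String LOCAL_DTD = "<!ELEMENT greeting (#PCDATA)><!ENTITY who \"world\">";
    static final String XML =
        "<!DOCTYPE greeting PUBLIC \"" + PUBLIC_ID + "\" \"http://example.invalid/greeting.dtd\">"
      + "<greeting>Hello, &who;!</greeting>";

    static String greeting() throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Returning null tells the parser to fall back to its default resolution.
        builder.setEntityResolver((publicId, systemId) ->
            PUBLIC_ID.equals(publicId) ? new InputSource(new StringReader(LOCAL_DTD)) : null);
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(greeting());
    }
}
```

This sketch assumes the parser's default settings, under which the external DTD subset is loaded; a parser configured to disallow DOCTYPE declarations would reject the document before the resolver is consulted.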

22. List down the names of SAX packages ?

org.xml.sax

Defines the SAX interfaces. The name org.xml is the package prefix that was settled on by the group that defined the SAX API.

org.xml.sax.ext

Defines SAX extensions that are used for more sophisticated SAX processing, for example, to process a document type definition (DTD) or to see the detailed syntax for a file.

org.xml.sax.helpers

Contains helper classes that make it easier to use SAX – for example, by defining a default handler that has null methods for all the interfaces, so that we only need to override the ones we actually want to implement.

javax.xml.parsers

Defines the SAXParserFactory class, which returns the SAXParser. Also defines exception classes for reporting errors.

23. List down the names of DOM packages ?

org.w3c.dom

Defines the DOM programming interfaces for XML (and, optionally, HTML) documents, as specified by the W3C.

javax.xml.parsers

Defines the DocumentBuilderFactory class and the DocumentBuilder class, which returns an object that implements the W3C Document interface. The factory that is used to create the builder is determined by the javax.xml.parsers.DocumentBuilderFactory system property, which can be set from the command line or overridden when invoking the newInstance method. This package also defines the ParserConfigurationException class for reporting errors.
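The factory-to-builder-to-document chain can be sketched as follows. Here the builder creates an empty Document that we populate ourselves; the class name NewDocumentSketch and the element and attribute names are assumptions for illustration.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NewDocumentSketch {
    static Document build() throws ParserConfigurationException {
        // newInstance() selects the implementation; the
        // javax.xml.parsers.DocumentBuilderFactory system property can override it.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.newDocument(); // empty tree, no parsing involved
        Element root = doc.createElement("inventory");
        root.setAttribute("updated", "today");
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws ParserConfigurationException {
        // The concrete class name shows which implementation was plugged in.
        System.out.println(DocumentBuilderFactory.newInstance().getClass().getName());
        System.out.println(build().getDocumentElement().getTagName()); // prints inventory
    }
}
```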

24. List down the names of XSLT packages ?

javax.xml.transform

Defines the TransformerFactory and Transformer classes, which we use to get an object capable of doing transformations. After creating a transformer object, we invoke its transform() method, providing it with an input (source) and output (result).
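When no stylesheet is given, the Transformer performs an identity transformation, which is a common way to write a DOM back out as text. A minimal sketch, with the class name IdentityTransformSketch and the sample input chosen for illustration:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class IdentityTransformSketch {
    static String roundTrip(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        // newTransformer() without a stylesheet copies the source to the result.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out)); // source in, result out
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("<a><b>text</b></a>")); // prints <a><b>text</b></a>
    }
}
```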

javax.xml.transform.dom

Classes to create input (source) and output (result) objects from a DOM.

javax.xml.transform.sax

Classes to create input (source) objects from a SAX parser and output (result) objects from a SAX event handler.

javax.xml.transform.stream

Classes to create input (source) objects and output (result) objects from an I/O stream.

25. List down the names of StAX packages ?

javax.xml.stream

Defines the XMLStreamReader interface, which is used to iterate over the elements of an XML document. The XMLStreamWriter interface specifies how the XML should be written.
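On the writing side, XMLStreamWriter emits the document piece by piece as we call it. A minimal sketch follows; the class name StaxWriteSketch and the <items>/<item> vocabulary are assumptions for illustration.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class StaxWriteSketch {
    static String write(List<String> items) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("items");
        for (String item : items) {
            writer.writeStartElement("item");
            writer.writeCharacters(item); // escapes &, < and > for us
            writer.writeEndElement();
        }
        writer.writeEndElement();
        writer.flush();
        writer.close(); // closes the writer, not the underlying StringWriter
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // prints <items><item>tea</item><item>a &amp; b</item></items>
        System.out.println(write(Arrays.asList("tea", "a & b")));
    }
}
```

The XML declaration is left out here because writeStartDocument() is never called.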

javax.xml.transform.stax

Provides StAX-specific transformation APIs.