Note that the midterm is open book:
books, notes, handouts, etc. are all allowed.
Documents to study
Harold’s text
XML 1.0 Spec
XML Bible free Chap. 20 on XML Schemas
Documents referenced in homework.
Text Coverage
Chap. 1: Basic XML--everything except a few things described below.
pg. 20: The rules here are about the use of Unicode for the
characters in a document. They correspond to the Char ::=
rule in Sec. 2.2 of the XML 1.0 spec. Additionally, the CharData rule of Sec. 2.4 says that < and & must be
escaped, etc. This is noted for attribute values on pg. 21 of the text
but applies also to element content.
pg. 24-27: You can skip Processing Instructions and Entities. The only kind of
entities we cover are the built-in entities &lt;, &gt; and &amp;, which
work without a DTD. (Also &apos; and &quot; are in this group.) You should also be aware of
character references, which look similar to entity references but are not actually entities, like
&#x2022; to represent Unicode 0x2022.
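To see the built-in entities and a character reference at work, here is a minimal sketch using the JAXP parser in the JDK. The class name EntityDemo and the sample document are made up for illustration; no DTD is involved.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not from the text.
class EntityDemo {
    // Parse a small XML string (no DTD) and return the text content of its root element.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // &lt; &gt; &amp; are built-in entities; &#x2022; is a character reference.
        String xml = "<note>a &lt; b &amp;&amp; c &gt; d &#x2022; done</note>";
        // The parser hands back the real characters: a < b && c > d (bullet) done
        System.out.println(rootText(xml));
    }
}
```

The parser resolves all of these before the application sees the text, which is why they never show up as separate nodes later in the DOM.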
pg. 34: Skip NMTOKEN*, ENTITY*, IDREFS, NOTATION
Pg. 38: Ex. 1.9. We have not covered xsd:simpleContent or xsd:extension
yet, or the details of handling attributes in XML Schema. The xsd:element for “Product” is a
good example of structure we have studied.
pg. 41: Skip Schematron.
pg. 44-53: Just know what XSL and its two parts, XSLT and XSL-FO, do. Note
that the select="Customer", etc., attributes of XSL use XPath
expressions to locate data. Note that we can use the JDK to do XSL processing. We noted that XSL
can do XInclude processing.
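As a rough illustration of XSL processing with the JDK alone, the sketch below runs a tiny stylesheet through javax.xml.transform. The stylesheet, the Order/Customer document shape and the class name XsltDemo are assumptions made up for this example, not taken from the text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not from the text.
class XsltDemo {
    // A made-up stylesheet: for an Order root, output the text of its Customer child.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/Order'>"
      + "<xsl:value-of select='Customer'/>"   // select holds an XPath expression
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the stylesheet to the given XML string and return the result as a string.
    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Order><Customer>Chez Fred</Customer><Product>Birdsong Clock</Product></Order>";
        System.out.println(transform(xml)); // prints Chez Fred
    }
}
```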
Chap. 2
We covered everything up to the RSS section on pg. 73; you can skip
RSS. We will later look at Atom, a more current syndication format.
Then we covered everything from pg. 77 to pg. 82, after which you can skip XML-RPC and everything else up to SOAP on pg. 96.
Then we covered SOAP to pg. 99, and then skipped Faults and Encoding Styles and
the rest of the chapter. We will never cover SOAP Encoding, as it is now
considered obsolete. But the SOAP Envelope is crucial to classic web services.
Note the difference between SOAP-ENV: and SOAP-ENC: and skip anything with
SOAP-ENC:.
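To make the role of the SOAP-ENV: prefix concrete, here is a small sketch that wraps a payload in a SOAP 1.1 Envelope and Body and then finds the payload again with a namespace-aware DOM parse. The class name SoapEnvelopeDemo, the getQuote payload and its http://example.com/ns namespace are hypothetical; only the envelope namespace URI is the standard SOAP 1.1 one. No SOAP-ENC: names are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not from the text.
class SoapEnvelopeDemo {
    // Namespace URI bound to the SOAP-ENV: prefix in SOAP 1.1.
    static final String SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/";

    // Wrap an application payload in Envelope and Body elements.
    static String envelope(String payloadXml) {
        return "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + SOAP_ENV + "\">"
             + "<SOAP-ENV:Body>" + payloadXml + "</SOAP-ENV:Body>"
             + "</SOAP-ENV:Envelope>";
    }

    // Return the local name of the first element inside the SOAP Body.
    static String firstBodyChildName(String soapXml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // needed so namespace URIs and local names are reported
            Document doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(soapXml)));
            Element body = (Element) doc.getElementsByTagNameNS(SOAP_ENV, "Body").item(0);
            for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    return n.getLocalName();
                }
            }
            return null; // empty Body
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up payload in a made-up namespace.
        String payload = "<getQuote xmlns='http://example.com/ns'><symbol>RHAT</symbol></getQuote>";
        System.out.println(firstBodyChildName(envelope(payload))); // prints getQuote
    }
}
```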
Chap. 3
We have covered everything up to pg. 142, a Simple SOAP
client, but not that section itself. Also, the important character set for XML
is UTF-8, the encoding of Unicode in 8-bit bytes, using multiple bytes for
non-ASCII characters. ISO-8859-1, aka Latin-1, is the default for HTML,
and is not recommended (although tolerated) for XML.
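The byte counts behind this can be checked directly. This is a small sketch; the class name Utf8Demo is made up, and it uses java.nio.charset.StandardCharsets, which needs Java 7 or later (with Java 6, getBytes("UTF-8") does the same job).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the text.
class Utf8Demo {
    // Number of bytes needed to encode s in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed to encode s in ISO-8859-1 (Latin-1).
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));        // 1: ASCII stays one byte
        System.out.println(utf8Length("\u00e9"));   // 2: e-acute is outside ASCII
        System.out.println(utf8Length("\u2022"));   // 3: the bullet, U+2022
        System.out.println(latin1Length("\u00e9")); // 1: Latin-1 is always one byte per character
    }
}
```

Latin-1 has no code for U+2022 at all, which is one reason UTF-8 is the safer choice for XML.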
Then we covered the section on Servlets, pp. 145-148.
Note that we always use a web.xml with our servlets,
so don’t worry about servlets running without a
web.xml (discussed on p. 148).
Chap. 5
We covered all the material on SAX, DOM and JAXP here.
Note that Example 5.5 does not work with just Java 6, because of the
Apache packages needed. However, Example 5.6 does work with Java 6, and
is almost the same in terms of DOM use, so it is the one we're covering.
Chap. 6 SAX
We covered everything except Receiving Processing
Instructions, Receiving Skipped Entities, and Receiving Locators.
Note SAX APIs starting on pg. 869.
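As a reminder of the basic SAX callback pattern, here is a minimal sketch of a ContentHandler built on DefaultHandler that counts elements and collects character data. The class names SaxDemo and Counter are made up, and this is not one of the book's examples.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not from the text.
class SaxDemo {
    // Handler that counts start tags and accumulates all character data.
    static class Counter extends DefaultHandler {
        int elements = 0;
        final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements++;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one run of text, so always append.
            text.append(ch, start, length);
        }
    }

    // Parse the XML string and summarize what the handler saw.
    static String summarize(String xml) {
        try {
            Counter handler = new Counter();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return "elements=" + handler.elements + " text=" + handler.text;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(summarize("<a><b>hi</b></a>")); // prints elements=2 text=hi
    }
}
```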
Chap. 7 SAX, continued.
We covered everything except EntityResolver, the lexical
handler (pg. 339), and the declaration handler (pg. 343). The only Xerces custom features we covered were for validation. We didn't cover the DTDHandler
interface, but its ability to provide default attribute values and attribute
types is of note.
Chap. 9 DOM
We are concentrating on DOM2, although DOM3 (in its
scaled-down present form) is available in Java 6. Note that Example 9.1 depends
on an Oracle package. We can get a DOMImplementation
from a Document object; see pg. 897.
You can skip Application-Specific DOMs.
pg. 441 on: Ignore Notation nodes, Entity nodes, and EntityReference nodes
(built-in entities are resolved for us, so they do not appear in the DOM).
Various parsers: we are only covering Xerces, since
it is in Java 6, via JAXP. So you can skip from "Parsing Documents with a
DOM Parser" to "JAXP DocumentBuilder and DocumentBuilderFactory", which is the way we'll do it.
You can skip "DOM3 Load and Save"; it’s obsolete. See the supplied
$cs639/FibonacciEx.java to see how to save the DOM tree using current calls.
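One common way to save a DOM tree with the JDK is an identity transform through javax.xml.transform. The sketch below shows that approach; it is an assumption that this resembles what FibonacciEx.java does, and the class name SaveDomDemo and the element names are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not the course's FibonacciEx.java.
class SaveDomDemo {
    // Build a tiny DOM tree in memory (element names are made up).
    static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("Fibonacci_Numbers");
            doc.appendChild(root);
            Element fib = doc.createElement("fibonacci");
            fib.appendChild(doc.createTextNode("1"));
            root.appendChild(fib);
            return doc;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Serialize the DOM tree to a string with an identity transform.
    static String serialize(Document doc) {
        try {
            // newTransformer() with no stylesheet copies input to output unchanged.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(buildSample()));
    }
}
```

To write to a file instead, a StreamResult can be constructed from a java.io.File rather than a StringWriter.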
Note DOM APIs in Appendix.
Read to pg. 478, up to but not including Modifying the Tree. Then read pp.
489-490.
Chap. 10 DOM, continued
We looked at Example 10.5 to see how to use a default namespace and, with a modification of the code, how to use namespace prefixes too. Use the program as written for the default namespace case, or drop the DocumentType creation and pass null as the third argument of createDocument. See $cs639/dom/SimpleSVG.java. To add prefixes, replace “svg” in the createDocument call with “s:svg” and “desc” in the createElementNS call with “s:desc” ($cs639/dom/SimpleSGG1.java). The bug described on pg. 504 has been fixed for namespaces, so the output will have <svg xmlns="http://www.w3.org/2000/svg"> in the default namespace case (fix the first line of pg. 504), and <s:svg xmlns:s="http://www.w3.org/2000/svg"> in the prefixed variation.
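The two variations can be sketched in a few lines. This is not the course's SimpleSVG.java, only a minimal stand-in under the same idea: the class name SvgNamespaceDemo and the build method are made up, and null is passed as the third argument of createDocument so no DocumentType is needed.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not the course's SimpleSVG.java.
class SvgNamespaceDemo {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    // Build an svg document whose root and desc child use the given qualified names,
    // e.g. ("svg", "desc") for the default namespace or ("s:svg", "s:desc") for a prefix.
    static Document build(String rootQName, String descQName) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            DOMImplementation impl = f.newDocumentBuilder().getDOMImplementation();
            // Third argument is the DocumentType; null means no DOCTYPE.
            Document doc = impl.createDocument(SVG_NS, rootQName, null);
            Element desc = doc.createElementNS(SVG_NS, descQName);
            desc.appendChild(doc.createTextNode("An example"));
            doc.getDocumentElement().appendChild(desc);
            return doc;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Element plain = build("svg", "desc").getDocumentElement();
        Element prefixed = build("s:svg", "s:desc").getDocumentElement();
        // Same namespace and local name either way; only the prefix differs.
        System.out.println(plain.getNodeName() + " " + plain.getNamespaceURI());
        System.out.println(prefixed.getNodeName() + " " + prefixed.getNamespaceURI());
    }
}
```

In both cases the elements are in the SVG namespace; the prefix only affects how the names are written when the tree is saved.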
New: Note the appendix on the DOM API, pp.
891-908. Fix Table A.1: Attr’s parent is null, not
Element (see pp. 894-895); use getOwnerElement() to get the enclosing Element. Add public String getTextContent() to Node’s
API on pg. 905. It returns the concatenated text content of this node and its
descendants (not including comments). It is especially useful for Element nodes, but
it also provides the contents of Text nodes and Comment nodes.
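Both corrections are easy to confirm with a short test. The sketch below is only an illustration; the class name AttrAndTextDemo and the sample document are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, not from the text.
class AttrAndTextDemo {
    // Parse an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<item id='7'>one<!-- hidden --><b>two</b></item>");
        Element item = doc.getDocumentElement();
        Attr id = item.getAttributeNode("id");

        System.out.println(id.getParentNode());                 // null: an Attr has no parent
        System.out.println(id.getOwnerElement().getNodeName()); // item
        // Text of the element and its descendants, with the comment left out.
        System.out.println(item.getTextContent());              // onetwo
    }
}
```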
Chap 16 XPath
We covered the intro part, to pg. 759, i.e., to
"Location Paths", but no further.
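For a feel of what an XPath query does, here is a minimal sketch using the javax.xml.xpath package in the JDK. The class name XPathDemo and the Order/Customer document are made up, and only a simple path is used.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not from the text.
class XPathDemo {
    // Evaluate an XPath expression against an XML string and return the result as a string.
    static String query(String expression, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Order><Customer>Chez Fred</Customer><Product>Birdsong Clock</Product></Order>";
        // Select the Customer child of the Order root and take its text.
        System.out.println(query("/Order/Customer", xml)); // prints Chez Fred
    }
}
```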