CS639 Class 9
Handout: XML Schemas and namespaces
Example 1.7 on page 30 has two default namespaces that operate on different regions of the XML tree
Namespaces have scope: in this example, the Orders NS has scope over everything (all elements) at or below (descendants in the tree) the Orders element. The Address NS has scope at or below the ShipTo element.
When contending namespaces have no prefix (i.e. are default namespaces), the more local namespace is used. If they can be discriminated by prefix, then either can be used if in scope.
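A quick way to see the scoping rule is to let a namespace-aware parser report the expanded name of each element. The sketch below is only a stand-in for Example 1.7: the Orders URI and the Order/Item elements are made up for illustration, while the Address URI is the one used later in these notes. Python's xml.etree.ElementTree reports names as {namespace-URI}localname.

```python
import xml.etree.ElementTree as ET

# Stand-in for Example 1.7. The Orders URI and the Order/Item elements
# are hypothetical; only the Address URI comes from the notes.
DOC = """<Orders xmlns="http://example.com/Orders/">
  <Order>
    <Item>widget</Item>
    <ShipTo xmlns="http://ns.cafeconleche.org/Address/">
      <City>Woonsocket</City>
    </ShipTo>
  </Order>
</Orders>"""

root = ET.fromstring(DOC)

# iter() walks the tree in document order: Orders, Order, Item, ShipTo, City.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)
```

Item picks up the outer (Orders) default namespace, while ShipTo and City pick up the more local Address default namespace.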
Attributes. Also note that a default namespace only defaults the namespace for element names, not attribute names (among names). All the attributes in Example 1.7 have no prefix, so they are in no namespace at all (pg. 31). The namespace does not include the whole “vocabulary” of the XML (element and attribute names), but a subset of names important for reference purposes.
Note on “among names”: we’ll see that content typed as QName can also use unprefixed names, which then pick up the default namespace.
What about “really plain” XML, without any namespaces, like we have been using?
Then the element names are in no namespace at all, an awkward state if other namespaces are in use. We’ll set up namespaces for our tagnames when we need to work with other namespaces. Our schema documents so far have been using the xsd: prefixes, for names of the XML Schema. At the same time, such a schema describes the application-level elements whose names are outside any namespace.
Namespaces and DTDs
· These can be made to work together but not too many people try. The DTD needs to use qualified names (a:b), and the prefix on the DTD needs to match the prefix on the XML document, an awkward requirement.
· We’ll avoid this combo
· Note that SOAP (basic protocol of web services) disallows DTDs in the message, although you could extract the contents and then use a DTD on it.
Namespaces without XML Schemas
· This works fine. There is no “namespace document”—the namespace is just id’d by its URI. Pretty common in REST: try to find a schema in our REST book! Can use a namespace to say “this XML document format is for company abc”, and never have a DTD or XML Schema for it.
Namespaces and XML Schema
· This is the natural combo for serious work with multiple sources of XML, where you need namespaces to keep things straight, and want validation.
· With namespaces in use, an XML Schema document defines local element names for a certain namespace, the “target namespace”, specified by targetNamespace=”…” in the schema element at the start of the doc.
· Without namespaces in use, the names of a schema are considered outside any namespace, and the schema document has no “target namespace.”
See handout for first example of XML+XMLSchema based on Example 2.21, pg. 103
Note that the NS URI is used twice, once to spec the target NS, and again
to spec the default NS for this XML file. This is the most convenient way to do
it, avoiding repetitive prefixes for the app tagnames such as getQuote and symbol,
and app typenames such as StockSymbol.
Each name=”something” for elements and named types in the schema is
putting that name into the namespace.
Here getQuote, symbol, and StockSymbol are put in the NS as local names.
The schema says more than this, of course.
(Also, identical names can be put in the NS from different parts of the
schema, so a NS is not a pure set of names.)
When we use prefixes for both namespaces in the XML schema, it becomes clear that each name="xxx" gives xxx as a bare local name. This makes sense because name="xxx" introduces a new name into the target NS, and the target NS only, so there’s no way to “cheat” and put a name into another NS. This is enforced by the schema of schemas (i.e., to be useful, a schema needs to follow this master schema). The type of name’s value is NCName: see Table 2.1, pg. 62. On the other hand, the other uses of the local names (here only one, type="x:StockSymbol") come with prefixes, and are typed QName.
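To make the NCName/QName distinction concrete, here is a minimal sketch that reads a schema and treats name="..." values as bare local names placed in the target NS, and type="..." values as prefixed names resolved through the prefixes in scope. The schema text is a simplified stand-in, not the book's Example 2.21, and the helper expand() is hypothetical. Since ElementTree discards prefixes after parsing, the sketch collects them from the "start-ns" events of iterparse.

```python
import io
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Simplified stand-in for quote.xsd (assumption: not the book's exact text).
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:x="http://namespaces.cafeconleche.org/xmljava/ch2/"
    targetNamespace="http://namespaces.cafeconleche.org/xmljava/ch2/"
    elementFormDefault="qualified">
  <xsd:element name="symbol" type="x:StockSymbol"/>
  <xsd:simpleType name="StockSymbol">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
</xsd:schema>"""


def expand(qname, prefixes):
    """Hypothetical helper: turn 'prefix:local' into '{uri}local'."""
    prefix, _, local = qname.rpartition(":")
    return "{%s}%s" % (prefixes[prefix], local)


prefixes = {}      # prefix -> namespace URI, filled from start-ns events
target = None      # the schema's targetNamespace
declared = []      # names introduced by name="..." (NCName values)
referenced = []    # names referred to by type="..." (QName values)

source = io.BytesIO(SCHEMA.encode("utf-8"))
for event, payload in ET.iterparse(source, events=("start-ns", "start")):
    if event == "start-ns":
        prefix, uri = payload
        prefixes[prefix] = uri
        continue
    el = payload
    if el.tag == "{%s}schema" % XSD_NS:
        target = el.get("targetNamespace")
    if "name" in el.attrib:
        # A bare local name: it can only land in the target namespace.
        declared.append("{%s}%s" % (target, el.get("name")))
    if "type" in el.attrib:
        # A prefixed name: resolved via the prefix declarations in scope.
        referenced.append(expand(el.get("type"), prefixes))

print(declared)
print(referenced)
```

The reference x:StockSymbol resolves to the same expanded name that name="StockSymbol" put into the target NS.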
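The fine print: we are assuming that if a schema is in use, it falls in one of two categories:
· no targetNamespace at all (names outside any namespace), or
· targetNamespace=… with elementFormDefault="qualified" and attributeFormDefault="unqualified".
Otherwise we would have 3 more cases to study. If we ever bump into one of the other cases (targetNamespace=…, with elementFormDefault="unqualified" or attributeFormDefault="qualified", or both), we can discuss it.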
Use of Example 2.21 schema quote.xsd
This schema only has one top-level xsd:element, for
getQuote, so this schema can only validate XML docs with a getQuote element at
the root, such as quote.xml. The other top-level construct is a type
declaration, which does not match directly to XML, but helps with the getQuote
declaration. The type name StockSymbol
is a local name in the NS.
Example 2.21, pg. 103: Case of XML Schema with a target NS.
This example shows how we want to do such schemas, as discussed last
class.
Here getQuote, symbol, and StockSymbol are put in the NS. The schema says
more than this, of course.
Note new directory of examples, $cs639/validate-ns, for XML Schema with namespaces, including quote*.*.
Note that all the element tagnames of the request XML (see pg. 97, inside
the SOAP envelope, quoted below), getQuote and symbol, are described in the
schema, and additionally, the type-name StockSymbol is put in the namespace by
the schema. Thus the namespace has more names than are actually used in the XML
request document.
Request XML from pg. 97, with default NS:
<getQuote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">
  <symbol>RHAT</symbol>
</getQuote>
Response XML from pg. 97:
<Quote xmlns="http://namespaces.cafeconleche.org/xmljava/ch2/">  <!-- same namespace! -->
  <Price>4.12</Price>
</Quote>
The response XML uses element tagnames Quote and Price, and although the
XML has the same namespace http://.../ch2 associated
with it, the corresponding schema with that target NS (pg 103) does not describe
Quote and Price. This is OK. There is no
rule that the schema has to declare all the
names in the namespace if it is associated with it by targetNS. Additional names
are brought in by having the NS on XML docs. It just means that this schema can
only be used to validate the request XML, not the response XML.
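The point can be illustrated without a validator. The sketch below parses the request and response, splits each root's expanded name into (namespace, local name), and compares the root against the set of top-level element names in the schema. That set is written out by hand from the description above (quote.xsd declares only getQuote at top level); this is a toy check on the root name, not real schema validation.

```python
import xml.etree.ElementTree as ET

NS = "http://namespaces.cafeconleche.org/xmljava/ch2/"
REQUEST = '<getQuote xmlns="%s"><symbol>RHAT</symbol></getQuote>' % NS
RESPONSE = '<Quote xmlns="%s"><Price>4.12</Price></Quote>' % NS

# Assumption: copied by hand from the notes' description of quote.xsd,
# which has a single top-level xsd:element.
DECLARED_GLOBAL_ELEMENTS = {"getQuote"}


def split(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    uri, _, local = tag[1:].partition("}")
    return uri, local


req_ns, req_root = split(ET.fromstring(REQUEST).tag)
resp_ns, resp_root = split(ET.fromstring(RESPONSE).tag)

print(req_ns == resp_ns)                      # both roots share one namespace
print(req_root in DECLARED_GLOBAL_ELEMENTS)   # request root is declared
print(resp_root in DECLARED_GLOBAL_ELEMENTS)  # response root is not
```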
However, we could expand the schema so that it does cover both request
and response, by adding a top-level (child of <schema>) <xsd:element name="Quote">,
etc. to it. With two top-level <xsd:element>
elements in the schema, the same schema can validate XML with root tag <getQuote>
or root tag <Quote>. This is how I would do it. The schema then describes the whole XML
document interchange, the arrangement between the sender and receiver.
More fine print: if we put elementFormDefault="unqualified" (or nothing about
elementFormDefault, since this is the default) in the schema, then we would not
use the prefix on symbol. But we would still need to put it on getQuote, because it’s a “global element”, one defined at top
level in the schema (its element node is a direct child of the schema’s schema
node.) This need to know whether each element is global or not, while writing a
conforming XML document, is what makes this setting of elementFormDefault so
hard to use. Because it’s the default, however, you may see it in practice.
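To see what the two settings demand of the XML document, compare the expanded names in two ways of writing the request. This is a sketch only: the prefix q is an arbitrary choice, and nothing here runs a schema validator. It just shows which elements end up in the namespace.

```python
import xml.etree.ElementTree as ET

NS = "http://namespaces.cafeconleche.org/xmljava/ch2/"

# Form needed under elementFormDefault="unqualified": prefix on the global
# element only, and no default namespace in scope for the local element.
UNQUALIFIED_FORM = (
    '<q:getQuote xmlns:q="%s"><symbol>RHAT</symbol></q:getQuote>' % NS
)

# The default-namespace form used in these notes, which suits "qualified".
QUALIFIED_FORM = '<getQuote xmlns="%s"><symbol>RHAT</symbol></getQuote>' % NS


def expanded_names(xml_text):
    """Return the '{uri}local' names of all elements, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


unqualified_names = expanded_names(UNQUALIFIED_FORM)
qualified_names = expanded_names(QUALIFIED_FORM)
print(unqualified_names)
print(qualified_names)
```

In the first form symbol is in no namespace at all, which is what "unqualified" asks for. Note that a default namespace cannot be used in that case, since it would pull symbol into the namespace too.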
Some authors go to the extreme of suggesting that to avoid this debate, you
should make all elements global. See pa1bsoln/JavaSourceUsingRefs.xsd for a
schema that makes all elements global. It has the advantage that all subtrees
of the original document tree are valid, but that can also be a disadvantage if
you want to make sure all documents give the full tree.
Namespaces and attributes
There are two kinds of attributes: ones belonging to a certain element (the examples we’ve seen) and “global attributes” defined at top level in the schema, which we’ll look at next time.
Consider attributes belonging to an element now.
Also note that a default namespace only defaults the NS for elements. That is OK, since we do not need prefixes for element attributes.
Attribute Example
Example: book4.xml in $cs639/validate-ns:
<?xml version="1.0"?>
<b:book xmlns:b="http://schemas.cs.umb.edu/book">  <!-- our URI for book NS -->
  <b:title>Data on the Web</b:title>
  ...
  <b:image source="csearch.gif"/>  <!-- no b: on attribute name, because it's "unqualified" -->
  ...
</b:book>
The attribute names belonging to elements are in no namespace, but of course the schema knows about them and checks them. They can be thought of as “tagging along” with their element.
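This is easy to check with a namespace-aware parser. The sketch below uses cut-down versions of the book document (only the image element is kept, an assumption for brevity) and shows that the prefixed form and the default-namespace form give the same expanded element name, while the source attribute stays in no namespace either way.

```python
import xml.etree.ElementTree as ET

# Cut-down versions of the book document: only the image element is kept.
PREFIXED = (
    '<b:book xmlns:b="http://schemas.cs.umb.edu/book">'
    '<b:image source="csearch.gif"/></b:book>'
)
DEFAULTED = (
    '<book xmlns="http://schemas.cs.umb.edu/book">'
    '<image source="csearch.gif"/></book>'
)


def image_names(xml_text):
    """Return (expanded element name, attribute names) for the image child."""
    image = ET.fromstring(xml_text)[0]
    return image.tag, sorted(image.attrib)


prefixed_result = image_names(PREFIXED)
defaulted_result = image_names(DEFAULTED)
print(prefixed_result)
print(defaulted_result)
```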
From book.xsd in $cs639/validate:
<xsd:complexType name="ImageType">
  <xsd:attribute name="source" type="xsd:string"/>
</xsd:complexType>
That’s too simple. Suppose an image had a child element:
<b:image source="csearch.gif">
  <b:size> 30 </b:size>
</b:image>
Then the type needs a content model as well as the attribute:
<xsd:complexType name="ImageType">
  <xsd:sequence>
    <xsd:element name="size" type="xsd:string"/>
  </xsd:sequence>
  <xsd:attribute name="source" type="xsd:string"/>
</xsd:complexType>
There are attributes that don’t belong to certain elements. See example on pg. 31. Next time.
Other examples in $cs639/validate-ns:
book6.xml has this namespace with linkage to XML Schema via xsi:schemaLocation="http://schemas.cs.umb.edu/book book1.xsd".
book3.xml has default NS
book5.xml has default NS + linkage to XML Schema
Many cases to deal with: XML + DTD or not + schema or not + namespace or not.
No Namespaces: our previous coverage:
· XML + DTD
· XML + XSD (with no targetNS)
· XML + DTD + XSD: uncommon
With Namespaces: DTDs don’t play well with namespaces, so we only consider XSDs in combo with NSs
· XML + NS
· XML + NS + XSD (with targetNS = NS)
However, note that a NS doesn’t by itself have a NS document the way the DTD and XSD have docs. It’s just an identifier URI attached to XML to disambiguate names.
XML + NS case: only the XML is a document. The NS is just a construct, holding all the prefixed local names, plus element names if it’s being used as a default namespace.
In the XML+NS+XSD case, the XSD has the NS URI as its target namespace, so the XSD serves as a document for the NS, along with its job to express the XML’s structure. Thus the NS URI bridges between the XML doc and the XSD doc.
Well-known schemas are found by parsers at well-known places.
Parsers need help finding application schemas. Need to cover global attributes for this.
Example of a standard namespace in the book: XInclude, pg. 29 (confusing because early)
<Orders xmlns:xi="http://www.w3.org/2001/XInclude">  <!-- this is the XInclude "recommendation", i.e., standard -->
  <xi:include href="order_details.xml"/>
</Orders>
Here “include” is a name in this XInclude namespace, and if it has a schema, then href should appear as this element’s attribute (and not in any namespace). The include element says where to find an XML doc to include in this one, like #include in C.
Can we run this through sax.Counter? No, the SAX parser doesn’t handle XInclude. You need an XInclude tool to turn this into XML with the inclusion done. You can find an XInclude tool written as an XSL app on the Internet. Might be useful someday.
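As a rough idea of what such a tool does, Python's standard library includes a simple XInclude processor, xml.etree.ElementInclude. The sketch below is not the XSL tool mentioned above. It assumes a made-up body for order_details.xml and uses an in-memory loader function in place of real files.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

MAIN = """<Orders xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="order_details.xml"/>
</Orders>"""

# Hypothetical contents of order_details.xml, for illustration only.
DETAILS = "<Order><Item>widget</Item></Order>"


def loader(href, parse, encoding=None):
    """In-memory stand-in for reading the included file from disk."""
    if href == "order_details.xml" and parse == "xml":
        return ET.fromstring(DETAILS)
    return None  # ElementInclude reports unknown resources as an error


root = ET.fromstring(MAIN)
ElementInclude.include(root, loader=loader)  # replaces xi:include in place

children = [child.tag for child in root]
print(children)
print(ET.tostring(root, encoding="unicode"))
```

After processing, the xi:include element is gone and the Order subtree sits in its place, so the result is plain XML that any parser can handle.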
Global Attributes: These can be attached to elements of other namespaces. Look at XLink example, pg. 31. xlink:type and xlink:href are global attributes, belonging to the “http://.../xlink” namespace but added to an element of the “http://.../Address” namespace.
<ShipTo xmlns="http://ns.cafeconleche.org/Address/"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xlink:type="simple"
        xlink:href="mailto:chezfred@yahoo.com">  <!-- global attributes -->
  <GiftRecipient>Samuel Johnson</GiftRecipient>
  <Street>271 Old Homestead Way</Street>
  <City>Woonsocket</City>
  <State>RI</State> <Zip>02895</Zip>
</ShipTo>
If there is no schema in use for the Address NS, this is easily done. Again, you need a tool or framework that understands XLink to really put this to use.
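With no schema involved, a namespace-aware parser simply reports the prefixed attributes under their expanded names. A minimal sketch, using a shortened copy of the ShipTo element (only City is kept):

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Shortened copy of the ShipTo example: only one child element is kept.
DOC = """<ShipTo xmlns="http://ns.cafeconleche.org/Address/"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xlink:type="simple"
        xlink:href="mailto:chezfred@yahoo.com">
  <City>Woonsocket</City>
</ShipTo>"""

root = ET.fromstring(DOC)

# Global attributes are looked up by expanded name, '{uri}local'.
href = root.get("{%s}href" % XLINK)
plain_href = root.get("href")   # no unprefixed href attribute exists
attr_names = sorted(root.attrib)

print(root.tag)
print(attr_names)
print(href, plain_href)
```

The element is in the Address namespace while both attributes are in the XLink namespace, which is exactly the mix described above.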
If there is a schema for the Address NS, it must give permission for the “extra” attributes, or validation will fail. The simplest way is to allow any attribute; see address.xsd in $cs639/validate-ns for this:
<?xml version="1.0"?>
<xsd:schema
    targetNamespace="http://ns.cafeconleche.org/Address/"
    xmlns="http://schemas.cs.umb.edu/book"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified">
  <xsd:element name="ShipTo">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="GiftRecipient" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
        <xsd:element name="Street" type="xsd:string"/>
        <xsd:element name="City" type="xsd:string"/>
        <xsd:element name="State" type="xsd:string"/>
        <xsd:element name="Zip" type="xsd:string"/>
      </xsd:sequence>
      <xsd:anyAttribute/>  <!-- add to schema to allow global attributes (or any others) -->
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
As of class time, I had not made this work. With the help of the validation service at http://www.w3.org/2001/03/webdata/xsv, with its somewhat more useful error messages, I have succeeded.
You also must provide the parser with access to the schema that establishes the attribute as a global attribute, namely, the XLink schema. A Google search found it at http://www.loc.gov/standards/mets/xlink.xsd. We tell this important fact to the parser by linkage or other means (here linkage). However, the validation still failed, reporting that “type” was not allowed as an attribute for ShipTo.
Turns out “type” is in fact not a global attribute of the XLink schema but href is, so the final XML that works is:
address.xml in $cs639/validate-ns:
<?xml version="1.0"?>
<ShipTo
xmlns="http://ns.cafeconleche.org/Address/"
xmlns:xlink="http://www.w3.org/1999/xlink"
xsi:schemaLocation="http://www.w3.org/1999/xlink
http://www.loc.gov/standards/mets/xlink.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xlink:href="mailto:chezfred@yahoo.com"> <!-- removed xlink:type here -->
<GiftRecipient>Samuel Johnson</GiftRecipient>
<Street>271 Old Homestead Way</Street >
<City>Woonsocket</City> <State>RI</State>
<Zip>02895</Zip>
</ShipTo>
java sax.Counter -s -v -schema address.xsd address.xml
address.xml: 598 ms (6 elems, 2 attrs, 0 spaces, 79 chars)
To test it at http://www.w3.org/2001/03/webdata/xsv: enter the URLs for these files in the form:
http://www.cs.umb.edu/cs639/validate-ns/address.xml
http://www.cs.umb.edu/cs639/validate-ns/address.xsd
Linkage to Schema from XML: done with global attributes
Recall book2.xml, with its linkage to schema. This is done with the help of the XML Schema instance (XSI) namespace, another standard namespace, and its global attribute noNamespaceSchemaLocation:
<?xml version="1.0" encoding="ISO-8859-1"?>
<book xsi:noNamespaceSchemaLocation="book.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <title>Data on the Web</title> ...
This is an example of a global attribute (noNamespaceSchemaLocation) that doesn’t need to be in the schema, since it’s part of the infrastructure.
Here the xsi prefix for the XML instance NS is set up with xmlns:xsi="URI_of_XMLInstance". By XML instance we mean the XML document itself, rather than the schema. The XML document needs to point to its schema, which it does with the help of the XSI namespace.
Obviously we need a different construct for the XSD linkage for the case of NS+XSD:
· Without namespace: xsi:noNamespaceSchemaLocation with value “URL”
· With namespace: xsi:schemaLocation with value “URI URL”, two URL-syntax strings separated by whitespace, the first for the namespace URI and the second for the schema’s URL.
Example of second form: address.xml above.
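The two linkage forms can be read back out of a document as follows. This is a sketch of how a tool might collect the hints, not how sax.Counter actually does it. The function name schema_hints is made up, and the two documents are shortened versions of book2.xml and address.xml.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Shortened versions of book2.xml and address.xml from the notes.
BOOK = (
    '<book xsi:noNamespaceSchemaLocation="book.xsd" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    "<title>Data on the Web</title></book>"
)
ADDRESS = """<ShipTo xmlns="http://ns.cafeconleche.org/Address/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/1999/xlink
                        http://www.loc.gov/standards/mets/xlink.xsd"/>"""


def schema_hints(root):
    """Hypothetical helper: map namespace URI (None = no namespace) to schema URL."""
    hints = {}
    no_ns = root.get("{%s}noNamespaceSchemaLocation" % XSI)
    if no_ns is not None:
        hints[None] = no_ns
    # schemaLocation holds whitespace-separated "URI URL" pairs.
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    for ns_uri, url in zip(tokens[0::2], tokens[1::2]):
        hints[ns_uri] = url
    return hints


book_hints = schema_hints(ET.fromstring(BOOK))
address_hints = schema_hints(ET.fromstring(ADDRESS))
print(book_hints)
print(address_hints)
```

Both attributes are themselves global attributes in the XSI namespace, which is why they are looked up by expanded name.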