CS639 class 26

Final, scheduled for Thurs afternoon at 3: OK?

Pa3 notes: Everyone used DOM. Posted solution uses DOM. I could also post a solution using JAXB, to show that approach.

Various approaches were used to analyze the DOM for the schema, which has a namespace. One way, also shown in the solution, sets up a NamespaceContext for a namespace-aware DOM so that full XPath can be used, then evaluates the XPath expression

"//xs:simpleType[@name=\""+ key + "\"]/xs:restriction/xs:enumeration";

where key is “milk”, “size”, “drink”, or “location”.
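The namespace-aware approach above can be sketched as follows. This is a minimal illustration, not the posted solution: the class name EnumXPathDemo, the helper enumValues, and the tiny inline schema are all made up for the example, and only the XPath expression comes from the notes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EnumXPathDemo {
    // Hypothetical stand-in for the pa3 schema: two enumerated simple types.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:simpleType name='milk'><xs:restriction base='xs:string'>"
      + "<xs:enumeration value='whole'/><xs:enumeration value='skim'/>"
      + "</xs:restriction></xs:simpleType>"
      + "<xs:simpleType name='size'><xs:restriction base='xs:string'>"
      + "<xs:enumeration value='small'/><xs:enumeration value='large'/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:schema>";

    // Returns the enumeration values of the simple type named by key.
    static List<String> enumValues(String schemaXml, String key) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed for prefixed XPath steps to match
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(schemaXml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Map the prefix used in the XPath expression to the schema namespace.
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "xs".equals(prefix)
                        ? XMLConstants.W3C_XML_SCHEMA_NS_URI
                        : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }          // not used by XPath
                public Iterator<String> getPrefixes(String uri) { return null; } // not used by XPath
            });

            String expr = "//xs:simpleType[@name=\"" + key
                        + "\"]/xs:restriction/xs:enumeration";
            NodeList hits = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                values.add(((Element) hits.item(i)).getAttribute("value"));
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(enumValues(SCHEMA, "milk")); // prints [whole, skim]
    }
}
```

The prefix in the XPath expression only has to agree with the NamespaceContext, not with the prefix used in the schema file itself.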

The other two solutions avoided full XPath and used namespace-unaware DOMs. One used an element-name-free XPath, //@value, and checked the resulting hits by climbing up the DOM tree. The other used DOM Element’s getElementsByTagName to find the xs:complexType elements, picked the one whose name attribute is “item”, and read its children to get “milk”, etc. It then used getElementsByTagName to find the xs:simpleType elements with those names, and getElementsByTagName again to find their descendant xs:enumeration elements, which finally yield the needed strings.
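The last steps of the getElementsByTagName approach might look roughly like this. It is a sketch under assumptions: the class name EnumTagNameDemo and the inline schema are hypothetical, and it covers only the lookup of enumeration values for an already-known type name, not the earlier walk through the “item” complexType.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EnumTagNameDemo {
    // Hypothetical stand-in for the pa3 schema.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:simpleType name='drink'><xs:restriction base='xs:string'>"
      + "<xs:enumeration value='latte'/><xs:enumeration value='mocha'/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:schema>";

    static List<String> enumValues(String schemaXml, String typeName) {
        try {
            // The default factory is NOT namespace-aware, so "xs:simpleType"
            // is matched as a plain tag name, prefix and all.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                               .parse(new InputSource(new StringReader(schemaXml)));
            List<String> values = new ArrayList<>();
            NodeList types = doc.getElementsByTagName("xs:simpleType");
            for (int i = 0; i < types.getLength(); i++) {
                Element type = (Element) types.item(i);
                if (!typeName.equals(type.getAttribute("name"))) {
                    continue; // not the type we are looking for
                }
                // Descendant search below this simpleType only.
                NodeList enums = type.getElementsByTagName("xs:enumeration");
                for (int j = 0; j < enums.getLength(); j++) {
                    values.add(((Element) enums.item(j)).getAttribute("value"));
                }
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(enumValues(SCHEMA, "drink")); // prints [latte, mocha]
    }
}
```

This version depends on the schema file literally using the prefix xs, which is the price of skipping namespace awareness.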

SOAP Web Services

Read REST book, Chap. 11 The Web and WS-*

If interested in WSDL, see tutorial at w3schools.com

But start now with Harold, pp. 96-99, 116-118, plus Appendix B, pp. 969-972,

firmly skipping all SOAP-ENC material (obsolete)

Basic scheme: use HTTP POST for all service request/responses, to a single service endpoint, i.e. one URL for the whole service. Use XML messages enclosed in SOAP “envelopes”.

So no idea of URI for each resource here. Orders and payments, etc, are still described in XML as in REST, but are all hidden inside the SOAP server. You ask, in XML, for what you want and the server returns it.

The basic rules of designing WS actions are the same for SOAP and REST. Invent actions that depend only on the message and on persistent state held by the server: that is the basic “stateless” rule. In other words, don’t expect the server to remember what you were doing last, even if you have to authenticate yourself to get the service.

So it’s straightforward to convert a REST service to a SOAP service or vice versa. You can even reuse the schema for the “representations” of REST—now they are related to the public view of the persistent objects being manipulated by the service. We will be looking at Amazon S3 storage service. It uses one schema for both its SOAP and REST APIs.

In SOAP, there is a way to find out what requests the server can answer, via its WSDL. Unlike WADL, WSDL is widely supported and used. REST proponents would say that REST doesn’t really need WADL, since the service should give you what to do next at each step. But it would be great to know how to get started.

Pg. 97 Basic examples of SOAP messages

<?xml version="1.0"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">  <!-- SOAP-ENV is a prefix -->
  <SOAP-ENV:Body>
    <getQuote xmlns="…"> … </getQuote>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

and similar message for response

Here we see local names Envelope and Body of the SOAP envelope NS.

Notes on this simple SOAP message:

- envelope in one standard NS

- body in an app-related NS

- no address

- interoperable (like REST)

o J2EE (Tomcat, WebSphere)

o .NET solution

- internationalized: XML in UTF-8 (like REST)

Lack of address: keeping it free of transport mechanism: could be HTTP, SMTP, ..., as we saw for media types used in REST.

Schema for this SOAP Message XML

Schema for the body: See pg. 103 for the schema for the request message, trading.xsd. It has our usual setup: default NS = target NS, elementFormDefault="qualified".

<xsd:element name="symbol" type="StockSymbol" ...>

This is introducing the local name “symbol” for the target NS.

type = “StockSymbol” refers to the StockSymbol in the default NS.

We now look at how the SOAP envelope schema can handle the message part inside, which is designed by the app developers. It’s similar to the XHTML example of class 11, except it tries to validate (processContents = “lax”) rather than skipping over the enclosed XML (processContents = “skip”).

Understanding the SOAP Envelope Schema

Not covered in class:

If we look at the SOAP Envelope schema in Appendix B, pg. 969, we see a different starting setup than we have been using for schemas:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/"
    targetNamespace="http://schemas.xmlsoap.org/soap/envelope/"
    …

We see there is no elementFormDefault="qualified". It doesn’t matter, though, because all element definitions are global, i.e., their <element> definitions are direct children of <schema>.

Here “tns” means target NS: the second line binds the tns prefix to the same URI that the third line declares as targetNamespace.

So there are more prefixes in this schema than we usually see:

<xs:element name="Envelope" type="tns:Envelope"/>
                                  ^^^

This prefix makes it very clear that the type name is in the NS. Strangely, they have both the element name “Envelope” and the type name “Envelope”, both in the namespace: one local name “Envelope” does double duty as an element name and as a type name. This is legal because XML Schema keeps element names and type names in separate symbol spaces. Similarly with Body.

It uses <any ...> to allow the app to fill in any message format in XML. The <any> element is on pg. 971, inside the definition of the type named “Body”. The processContents = “lax” setting means: validate the enclosed element if a schema declaring it can be found, and otherwise let it pass. The XML document should have a NS declaration at this element that can be used to match up a schema.
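The effect of processContents = “lax” can be tried out with the JDK validator. The sketch below is an assumption-laden illustration: it uses a tiny stand-in for the envelope schema (only a Body element, in the made-up namespace urn:example:env) and a made-up app schema in urn:example:trading, and it hands both schemas to SchemaFactory.newSchema as an array instead of using import and include.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class LaxValidationDemo {
    static final String XS = "xmlns:xs='http://www.w3.org/2001/XMLSchema'";

    // Hypothetical mini "envelope" schema: Body may hold anything, validated laxly.
    static final String ENV_XSD =
        "<xs:schema " + XS + " targetNamespace='urn:example:env'>"
      + "<xs:element name='Body'><xs:complexType><xs:sequence>"
      + "<xs:any namespace='##any' processContents='lax'"
      + " minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Hypothetical app schema: getQuote must contain exactly one symbol.
    static final String APP_XSD =
        "<xs:schema " + XS + " targetNamespace='urn:example:trading'"
      + " elementFormDefault='qualified'>"
      + "<xs:element name='getQuote'><xs:complexType><xs:sequence>"
      + "<xs:element name='symbol' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Compiles both schemas into one Schema object.
    static Schema combinedSchema() {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return sf.newSchema(new Source[] {
                new StreamSource(new StringReader(ENV_XSD)),
                new StreamSource(new StringReader(APP_XSD)) });
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // Wraps the payload in a Body element and reports whether it validates.
    static boolean isValid(String payload) {
        String xml = "<e:Body xmlns:e='urn:example:env'>" + payload + "</e:Body>";
        try {
            combinedSchema().newValidator()
                            .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the default error handler throws on validation errors
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String ns = " xmlns='urn:example:trading'";
        System.out.println(isValid("<getQuote" + ns + "><symbol>RHAT</symbol></getQuote>"));
        System.out.println(isValid("<getQuote" + ns + "><price>1</price></getQuote>"));
        System.out.println(isValid("<other xmlns='urn:example:unknown'><x/></other>"));
    }
}
```

The three calls cover the cases that matter: a declared payload that conforms, a declared payload that does not, and a payload in a namespace no schema knows about, which lax processing lets through.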

Validating a whole SOAP message: need to import one schema into another

This is another example of using two schemas together, here the SOAP envelope schema and the app schema for the contents of Body. We need to import one schema into another.

It’s a little different from our orderDap.xsd + Link.xsd because the outer schema doesn’t itself use the elements, etc., defined in the imported schema. Consequently, it doesn’t need to define a prefix for the imported NS. Also, we aren’t in control of the outer schema (we shouldn’t edit it), so we need to extend it from our own schema document, and this is where <include> comes into use.

Harold shows us the way here, on pg. 117-118. Drop the import of “.../soap/encoding”, which we’re not using, but keep the xsd:import of trading.xsd, and the xsd:include of the SOAP envelope schema.

The <include> brings in the SOAP envelope document the way #include does in C, i.e., verbatim, so at top level of the result is the SOAP envelope schema. The <import> allows another NS’s schema to be helping out with the full schema.

It’s better to avoid a relative URL in the schemaLocation value (here or anywhere), unless you are sure the validator knows what the base URL is. It should be safe if the schemas are in the same directory.

For more examples, if interested: see the W3C Schema Primer, linked to class web page.

Servlets for SOAP

We can do a lot with Servlets + SOAP -> basic SOAP Web Service

SOAP Client, pg. 142, like AdminTool of pa3: just uses JDK.

SOAP Servlet, pg. 146, also very primitive, but obviously we could do better.

The basic Web Service works OK between the two cooperating sites. (their developers in communication -> share DTDs, schemas etc).

Java EE support: SAAJ and JAX-WS

JAX-WS is layered on SAAJ and does more for us: specifically, it generates to and from WSDL (Web Services Description Language), which gives a standardized declaration of message formats, among other things.

Let’s look at JAX-WS.

Like JAX-RS, JAX-WS provides a servlet, so we don’t have to code doPost ourselves, just put the right entries in our web.xml to get the servlet going.

We use annotations on the server methods that are called for SOAP requests. We use JAXB markup on POJO classes related to the XML messages.

Unlike JAX-RS and WADL, it is easy and commonplace to generate almost a whole service or client from WSDL, so WSDL is much more important than WADL.

Like WADL, WSDL includes or points to XML schema(s) describing the XML messages needed for the service, that is, the payloads inside the SOAP envelope.

Almost the whole service, including JAXB POJOs for business objects, can be generated from the WSDL file. Of course we have to program the actual actions done by the service, by filling in code in provided stubs.

Similarly, client stubs can be generated from the WSDL.

WSDL doc contains

--XML Schemas for message bodies

--abstract SEI: service endpoint interface, i.e., what the calls look like

--spec. of SOAP, HTTP, endpoint address for services

Usually have one service URL for multiple WSs, usually discriminated by top-level element name in request body XML.
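Discriminating requests by the top-level element in the Body can be sketched with a namespace-aware DOM. This is only an illustration of the idea, not what a JAX-WS runtime actually does internally; the class name SoapBodyDispatch, the helper requestName, and the app namespace urn:example:trading are made up.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SoapBodyDispatch {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Sample request; the app namespace is a hypothetical placeholder.
    static final String SAMPLE =
        "<SOAP-ENV:Envelope xmlns:SOAP-ENV='" + SOAP_NS + "'>"
      + "<SOAP-ENV:Body>"
      + "<getQuote xmlns='urn:example:trading'><symbol>RHAT</symbol></getQuote>"
      + "</SOAP-ENV:Body>"
      + "</SOAP-ENV:Envelope>";

    // Returns the qualified name of the first element inside the SOAP Body.
    static QName requestName(String soapXml) {
        Document doc;
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            doc = dbf.newDocumentBuilder()
                     .parse(new InputSource(new StringReader(soapXml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        Element body = (Element) doc.getElementsByTagNameNS(SOAP_NS, "Body").item(0);
        if (body == null) {
            throw new IllegalArgumentException("no SOAP Body element");
        }
        // Skip whitespace text nodes and comments; take the first element child.
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return new QName(n.getNamespaceURI(), n.getLocalName());
            }
        }
        throw new IllegalArgumentException("SOAP Body is empty");
    }

    public static void main(String[] args) {
        QName name = requestName(SAMPLE);
        // A service could switch on the local name to pick the operation.
        System.out.println(name.getLocalPart()); // prints getQuote
    }
}
```

Matching on the full QName rather than just the local name keeps two operations with the same element name in different namespaces apart.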

Once we have the WSDL doc, we can use tools to help build server-side and client code.

The tools allow the client and server to use the SEI as a Java API.

XML in DBs and XML Generated from Relational DBs

We have been working with 3 kinds of data: Draw pic with three areas, arrows between representing translation of data between them:

XML: triangles, Java objects: ovals, DB tables: rectangles

All three are considered very different from each other, with an “impedance mismatch” between them, i.e. it’s hard to translate data from one form to another.

We have studied the transition XML data <-> Java objects the most: parsing into our own objects with SAX, parsing into a DOM tree and back out (via XSL, recall trick), and more recently data binding with JAXB and its annotations in the POJOs.

CS636 studies Java objects <-> DB data using JPA-2, the Java Persistence API. It uses annotations in the POJOs too.

Web app layers:

· Presentation: handling XML to/from web services

· Service: doing core app programming with POJOs

· Data access: getting data to/from DB

Here we need the POJOs to keep the programmers happy. We can use JPA-2 to turn DB data into POJOs, and JAXB to turn POJOs into XML. It’s even possible to annotate the same POJO two ways, once for JAXB and once for JPA-2, since each annotation processor only looks for its own annotations.

Data-driven Web Services

But what if we don’t need Java programming for our web services? In some pure data-driven cases, the desired web service data is determined completely by the database data.

Publishing Database Data Directly in XML, with no intervening application

On pages 203-208 of the text there is code that manually produces XML from database tables, but this is ugly

Today, many databases know about XML and can generate XML from relational tables

The SQL/XML standard is supported by Oracle and DB2 UDB, among others, but not by SQL Server, which has SQLXML, something similar.

SQL/XML was first published as Part 14 of SQL:2003; its integration with XQuery came later, in the 2006 revision.

Consider the following table structure

Dept (table name)

-----------------------------------------
dname      | location    (column names)
-----------------------------------------
Accounting | New York    (row data)
Operations | Boston
-----------------------------------------

Easy to set up on our Oracle running on dbs2:

SQL> create table dept(dname char(20), location char(20));

Table created.

SQL> insert into dept values ('Accounting', 'New York');

1 row created.

SQL> insert into dept values ('Operations', 'Boston');

1 row created.

Here is an example SQL/XML query:

select XMLElement ("Department",

XMLElement("DeptName", dname),

XMLElement("Location", location))

from dept

The above SQL statement would generate the following XML:

<Department>
  <DeptName>Accounting</DeptName>
  <Location>New York</Location>
</Department>
<Department>
  <DeptName>Operations</DeptName>
  <Location>Boston</Location>
</Department>

Actual output from SQLPlus—truncated, need better settings in SQLPlus to see everything

<Department><DeptName>Accounting </DeptName><Location>New York

<Department><DeptName>Operations </DeptName><Location>Boston

Of course there is more to this—see Oracle docs.

XML stored in a relational database

It is also possible to hold XML *in* the database, specifically in column values, using a new datatype for XML documents. Oracle calls this new datatype XMLType, DB2 just XML. We can have ordinary columns, like varchar and float and int, and then one or more columns that hold whole XML documents.

Then ordinary SQL can’t be used on the XMLType column value, but XPath can.

From notes15--

Another example of XPath use: in databases

Databases can now handle XML along with their ordinary relational data. A very useful way is by defining an XML datatype, so that a single column value can hold a whole XML document. Then a certain order, for example, can have a row in the orders table with relational columns order_id, qty, price, etc. , and also (say in column “req”) the XML document that came in from the web to make the order.

Oracle has good XML support in this way (since v 9.2 I think). Once an XML document is held in a column value, it can be queried by the Oracle SQL function

extractValue(tab.col, 'xpathexpr')

where the xpathexpr selects a certain element or attribute node. There are other related functions as well.

For example the following could get the qty from the XML in the req column of orders:

select extractValue(o.req, '/order/qty') from orders o
where order_id = 100;

This kind of extraction of information from the XML can be done to populate relational columns in the database from the incoming XML, while also preserving that original XML.

Using Namespace case

Oracle case: just use an optional third argument for extractValue:

select extractValue(o.req, '/ord:order/ord:qty', 'xmlns:ord="http:/…"') from orders o
where order_id = 100;

The third argument sets up ord: as the prefix for the xpath expression. Multiple prefixes can be set up by separating them by spaces in the string.

Now to flesh that example out, at least the no-namespace case. The following example is from

Getting into SQL/XML by Tim Quinlan

Available at http://www.oracle.com/technetwork/articles/quinlan-xml-095823.html

1. Create a table with an XML column.

create table invoiceXML_col (

inv_id number primary key,

inv_doc XMLType);

2. or, Create an XML table.

create table invoiceXML_tbl of XMLtype;

We’ll stick to case 1. No reason to give up the option of having ordinary columns as well as XML ones. I’ve converted the examples from case 2 to case 1.

Add XML from a string:

Insert into invoicexml_col values (1, XMLType('<Invoice> … </Invoice>'));   --specify XML with string

Or add XML from a file (XMLDIR is a directory alias set up previously, see tutorial)

Insert into invoicexml_col values (1,

XMLType(bfilename('XMLDIR', 'invoicexml.txt'),

nls_charset_id('AL32UTF8') )); --I think this is the default charset, a UTF-8 descendent

Here is the file (no <?xml prolog, but it works)

<Invoice>

    <MailAddressTo id="PA">

        <Person>Joe Smith</Person>

        <Street>10 Apple Tree Lane</Street>

        <City>New York</City>

        <State>NY</State>

        <Zipcode>12345</Zipcode>

    </MailAddressTo>

    <MailAddressFrom id="PA">

        <Person>Ed Jones</Person>

        <Street>11 Cherry Lane</Street>

        <City>Newark</City>

        <State>NJ</State>

        <Zipcode>67890</Zipcode>

    </MailAddressFrom>

    <Details id="2006Sept1to30PA">

        <FromTo>Sept 1, 2006 to Sept 30, 2006</FromTo>

        <Hours>70</Hours>

        <Rate>30</Rate>

        <Taxes>210</Taxes>

        <TotalDue>2310</TotalDue>

        <InvDate>Oct 1, 2006</InvDate>

        <Contractor>Ed Jones</Contractor>

    </Details>

</Invoice>

Using XPath on this data—

select extract(inv_doc, '/Invoice/MailAddressTo') from invoicexml_col;

EXTRACT(INV_DOC,'/INVOICE/MAILADDRESSTO')

<MailAddressTo id="PA"><Person>Joe Smith</Person><Street>10

Apple Tree Lane</Street><City>New York</City><State>NY</Stat

e><Zipcode>12345</Zipcode></MailAddressTo>

Select count(*) from invoicexml_col

where existsNode(

inv_doc, '/Invoice/MailAddressTo[Person="Joe Smith"]') = 1;

COUNT(*)

With XML in the database, you can register XSD for it, that is, in our case associate a schema with invoicexml_col.  This XML has no namespace, but it is possible to work with namespaces too, even multiple schemas and namespaces.

An XMLType column can only hold well-formed XML in each column value: single root element, etc. With a registered schema, it can only hold valid documents.

Oracle docs point out that XML Schema and database schema can work together to keep data conforming to rules.  XML Schemas can’t by themselves enforce unique keys or relational integrity, but the database can.

DB2 and MS SQLServer have similar capabilities.

MySQL: behind the others for XML, but free and wildly popular. Can do boilerplate XML output:

sf08$ mysql -u eoneil1 -D eoneil1db -p --xml -e 'select * from customers'

Enter password:

<?xml version="1.0"?>

<resultset statement="select * from customers

" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <row>

        <field name="cid">c001</field>

        <field name="cname">Tiptop</field>

        <field name="city">Duluth</field>

        <field name="discnt">12</field>

  </row>

  <row>

        <field name="cid">c002</field>

        <field name="cname">Basics</field>

        <field name="city">Dallas</field>

        <field name="discnt">12</field>

  </row>

</resultset>

MySQL has an ExtractValue function that works on string data using XPath:

mysql> SELECT ExtractValue('<a><b>cc</b></a>', '/a/b');

+------------------------------------------+

| ExtractValue('<a><b>cc</b></a>', '/a/b') |

+------------------------------------------+

| cc |

+------------------------------------------+

1 row in set (0.00 sec)

Similarly, put this XML in a string column named col1 and

mysql> SELECT ExtractValue(col1, '/a/b') from T;

to see the same result

So you can store XML in string columns and query it with XPath in mysql too.

There is also an UpdateXML function that can replace parts of XML.

Oracle can update XML with its UPDATEXML function: example from Oracle docs--

SELECT warehouse_name,

   EXTRACT(warehouse_spec, '/Warehouse/Docks')

   "Number of Docks"

   FROM warehouses

   WHERE warehouse_name = 'San Francisco';

WAREHOUSE_NAME       Number of Docks

-------------------- --------------------

San Francisco        <Docks>1</Docks>

UPDATE warehouses SET warehouse_spec =

   UPDATEXML(warehouse_spec,

   '/Warehouse/Docks/text()',4)

   WHERE warehouse_name = 'San Francisco';

1 row updated.

SELECT warehouse_name,

   EXTRACT(warehouse_spec, '/Warehouse/Docks')

   "Number of Docks"

   FROM warehouses

   WHERE warehouse_name = 'San Francisco';

WAREHOUSE_NAME       Number of Docks

-------------------- --------------------

San Francisco        <Docks>4</Docks>

Last topic: XML from DBs to Java via JDBC

Java 6 adds the java.sql.SQLXML interface as part of the newer JDBC (JDBC 4.0), for accessing DBs that store XML in column values.

Supported in DB2 (DB2 v 9.5 (2007) or later), and MS SqlServer (2005 or later) for several years and more recently in Oracle 11.2 (2009)—we have v 10.1, unfortunately.

MySQL: can store XML in string types and use SQLXML in Java.