Review for Midterm 

 

 

 

Core setup:

In the simple case (at most one schema), an XML document can optionally have any of the following:

XML  -- DTD

     -- XML Schema

     -- NS (namespace)

 

We are assuming no NS if we have a DTD, since they play together poorly.

If there are both an NS and an XML Schema, they must be related by targetNamespace = NS URI in the XML schema to be useful. We also assume the schema has elementFormDefault="qualified".

 

Does it make sense to have an NS but no XML Schema? Yes, for marking the XML as proprietary, and for making sure there are no name clashes with other XML vocabularies. Even though there’s no schema document, there is usually an understood schema: what goes where in the XML.

 

Although the XML doc can specify the location of the DTD or XML schema, that is not required for their usefulness. We can specify the schema file locations externally.  In fact, specifying them internally (in the XML doc) is “brittle” (easily breaks when files are reorganized).  In the case of NS+XML schema, the NS itself identifies the relevant schema. The NS URI is not “brittle”, since it is not tied to a file location.
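
A minimal sketch of specifying the schema externally, using the JDK's javax.xml.validation package: the XML document carries only its NS, and the program supplies the schema. The namespace URI, the weather/temp vocabulary, and the class name are made up for illustration; they are not from the book or the handouts.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class ExternalSchemaDemo {
    // Hypothetical namespace URI, used only for this illustration.
    static final String NS = "http://example.com/weather";

    // The schema's targetNamespace equals the NS URI, and local elements are qualified.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
        + "<xs:element name='weather'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='temp' type='xs:int'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    /** Returns true if xml is valid against XSD; the XML itself names no schema location. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, a validity error is thrown as a SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<weather xmlns='" + NS + "'><temp>72</temp></weather>"));
        System.out.println(isValid("<weather xmlns='" + NS + "'><temp>hot</temp></weather>"));
    }
}
```

In a real program the schema would come from a file whose location the program knows, so reorganizing files means changing one place in the program rather than every XML document.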

 

But this is assuming the XML is all described by one schema, or none at all. With multiple NS’s, we can assemble an XML doc with parts described by multiple schemas. However, we have not yet covered this properly, so don’t worry about it for the midterm exam. Our main example of this was the SOAP message, pg. 97.

 

 

See reading guide for book coverage

Additional notes as we went through the sections in the book:

 

pg. 40: this shows how to attach an XML schema when not using namespaces. We also covered the case with namespaces, but note the “brittle” argument above.

 

pg. 70: replace the loop over bytes with the loop over Strings of EchoHtml.java of servlet2. This will properly handle multi-byte UTF-8 codes.
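
A sketch of why the byte loop fails, not the actual EchoHtml.java code (the class and method names here are invented). Treating each byte as a character splits a multi-byte UTF-8 code into several wrong characters, while reading Strings through a Reader with an explicit UTF-8 charset decodes it correctly.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class Utf8EchoDemo {
    /** Broken for non-ASCII input: each byte becomes its own char. */
    static String copyByBytes(InputStream in) {
        StringBuilder out = new StringBuilder();
        try {
            int b;
            while ((b = in.read()) != -1) {
                out.append((char) b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Correct: the Reader decodes UTF-8, and we loop over whole lines (Strings). */
    static String copyByLines(InputStream in) {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "caf" followed by e-acute, which takes two bytes in UTF-8.
        byte[] utf8 = "caf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(copyByBytes(new ByteArrayInputStream(utf8)).length());
        System.out.println(copyByLines(new ByteArrayInputStream(utf8)).length());
    }
}
```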

 

Chap. 5: Intro to SAX, DOM, and JAXP

JAXP is the package that manages both SAX and DOM support. It provides top-level factory classes like DocumentBuilderFactory (DOM) and SAXParserFactory (SAX); the XMLReaderFactory used in the book’s SAX examples is strictly part of SAX itself (org.xml.sax.helpers). Once the XMLReader (SAX) or DOM Document is created, further processing is done in the SAX or DOM package.
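
A small sketch of the division of labor, assuming the javax.xml.parsers factories (the class and method names are invented for illustration). JAXP is used only to obtain the XMLReader or the Document; everything after that is plain SAX (org.xml.sax) or DOM (org.w3c.dom).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class JaxpFactoriesDemo {
    /** SAX: JAXP hands us an XMLReader; the callbacks are pure SAX. */
    static int countElementsWithSax(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            int[] count = {0};
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes atts) {
                    count[0]++; // called once per start tag
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
            return count[0];
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** DOM: JAXP hands us a Document; navigation is pure DOM. */
    static String rootNameWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTagName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b/><c/></a>";
        System.out.println(countElementsWithSax(xml));
        System.out.println(rootNameWithDom(xml));
    }
}
```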

 

Note that this chapter has the important basic examples for SAX and DOM: Ex. 5.3/5.4 for SAX, Ex. 5.6 for DOM.

 

pg. 231: drop the argument to createXMLReader: we are using the default JDK SAX parser.
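
A sketch of the no-argument call, not the book's listing (the class and handler here are invented). With no parser class name given, createXMLReader() falls back to the SAX parser bundled with the JDK. Note that XMLReaderFactory is marked deprecated in newer JDKs (9 and later), hence the annotation; it still works.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

final class DefaultSaxParserDemo {
    /** Returns the local names of all elements, in the order their start tags appear. */
    @SuppressWarnings("deprecation")
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            // No argument: use the default JDK SAX parser.
            XMLReader reader = XMLReaderFactory.createXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes atts) {
                    names.add(localName);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c/></a>"));
    }
}
```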

 

Chap. 7: Advanced SAX. You can skip this if you’re short on time.

 

Chap. 9: DOM. The notion of the DOM tree of Nodes is important, and is closely related to the XPath tree of nodes. You can start at pg. 440 and read to pg. 451 to get the basics on this. See the DOM Node/ XPath node table in class14.html.
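
To make the tree of Nodes concrete, here is a hedged sketch (not from the book; the names are invented) that walks a DOM tree recursively and prints one line per node. Note that attributes are reached through getAttributes(), not as children.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DomTreeWalkDemo {
    /** Parses xml and returns an indented outline of its DOM tree. */
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc, 0, out);
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    private static void walk(Node node, int depth, StringBuilder out) {
        indent(out, depth);
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                out.append("Document");
                break;
            case Node.ELEMENT_NODE:
                out.append("Element ").append(node.getNodeName());
                break;
            case Node.TEXT_NODE:
                out.append("Text ").append(node.getNodeValue());
                break;
            default:
                out.append("Other ").append(node.getNodeName());
        }
        out.append('\n');

        // Attributes hang off the element; getAttributes() is null for non-elements.
        NamedNodeMap attrs = node.getAttributes();
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                indent(out, depth + 1);
                out.append("Attr ").append(attr.getNodeName())
                   .append('=').append(attr.getNodeValue()).append('\n');
            }
        }

        // Recurse over the children.
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    private static void indent(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    public static void main(String[] args) {
        System.out.print(outline("<a x='1'><b>hi</b></a>"));
    }
}
```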

 

We also looked at a DOM Example with namespaces. See DOMWithNamespaces.html
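
A sketch in the same spirit, not the code from DOMWithNamespaces.html (the namespace URI and names are hypothetical). The key step is setNamespaceAware(true); after that, elements are identified by NS URI plus local name, whatever prefix the document happens to use.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DomNamespaceDemo {
    // Hypothetical namespace URI, used only for this illustration.
    static final String NS = "http://example.com/weather";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the root element's name as {namespace-URI}local-name. */
    static String expandedRootName(String xml) {
        Element root = parse(xml).getDocumentElement();
        return "{" + root.getNamespaceURI() + "}" + root.getLocalName();
    }

    /** Counts elements by NS URI and local name, ignoring prefixes. */
    static int countInNamespace(String xml, String nsUri, String localName) {
        return parse(xml).getElementsByTagNameNS(nsUri, localName).getLength();
    }

    public static void main(String[] args) {
        String prefixed = "<w:weather xmlns:w='" + NS + "'><w:temp>72</w:temp></w:weather>";
        String defaulted = "<weather xmlns='" + NS + "'><temp>72</temp></weather>";
        System.out.println(expandedRootName(prefixed));
        System.out.println(expandedRootName(defaulted));
        System.out.println(countInNamespace(prefixed, NS, "temp"));
    }
}
```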

 

Chap. 16: XPath. You should know how to do XPath queries using the abbreviated syntax, like //method, but you can ignore the syntax with double colons, like /child::weather.
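
For practice, abbreviated-syntax queries can be tried from Java with the JDK's javax.xml.xpath package. This is a sketch only; the class name and the sample class/method document are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathAbbrevDemo {
    /** Evaluates an XPath expression and returns the text of each selected node. */
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath or XML: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<class name='Counter'>"
                   + "<method name='increment'/><method name='get'/>"
                   + "</class>";
        System.out.println(select(xml, "//method").size());   // all method elements, any depth
        System.out.println(select(xml, "//method/@name"));    // their name attributes
        System.out.println(select(xml, "/class/@name"));      // attribute of the root element
    }
}
```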

 

 

Web Technologies

 

 

TCP / IP stream connection

 

The server is a long-lived process listening on a certain port. Each system has a large number of ports (port numbers are 16 bits).

 

The client does a CONNECT action, which it must direct to a certain system and a certain port.
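
The listen/connect pattern can be sketched with java.net sockets. This is an illustration under simplifying assumptions (one connection, loopback only, an OS-chosen port, invented class name), not code from the course.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class TcpConnectDemo {
    /** Starts a one-shot echo server, connects to it, and returns the server's reply. */
    static String sendAndReceive(String message) {
        InetAddress host = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; a real server uses a fixed, known port.
        try (ServerSocket server = new ServerSocket(0, 1, host)) {
            Thread serverThread = new Thread(() -> serveOne(server));
            serverThread.start();

            // The client must name both the system (host) and the port.
            try (Socket client = new Socket(host, server.getLocalPort());
                 PrintWriter out = writer(client);
                 BufferedReader in = reader(client)) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Server side: accept one connection, read one line, echo it back. */
    private static void serveOne(ServerSocket server) {
        try (Socket conn = server.accept();
             BufferedReader in = reader(conn);
             PrintWriter out = writer(conn)) {
            out.println("echo: " + in.readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        // true = autoflush on println, so each line is sent immediately
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("hello"));
    }
}
```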

 

HTTP is a simple protocol built on a TCP stream connection; it relies on TCP for reliability and flow control.

HTTP: we studied GET/POST.

Request and response both have HTTP headers.

Content-Type is the most important header for the GET response and for the POST request and response; it specifies the form of the body:

Content-Type: text/xml => UTF-8 by default

Content-Type: text/html => Latin-1  by default

We can override the default and specify charset here explicitly (example, pg. 99, Example 2.18)

 

The first line of the response has the HTTP response code:

            200 – OK

            404 – NOT FOUND

            500 – SERVER PROBLEM (including servlet failures)
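
To see a response code and a Content-Type header from Java, here is a sketch that uses the JDK's built-in com.sun.net.httpserver.HttpServer as a throwaway stand-in for Tomcat (so it runs with no setup). The /hello path and the class name are invented for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;

final class HttpResponseDemo {
    /** Does a GET on the given path and returns "<response code> <Content-Type>". */
    static String statusAndType(String path) {
        HttpServer server = null;
        try {
            InetAddress loopback = InetAddress.getByName("127.0.0.1");
            server = HttpServer.create(new InetSocketAddress(loopback, 0), 0); // port 0: any free port
            server.createContext("/hello", exchange -> {
                byte[] body = "<greeting>hi</greeting>".getBytes(StandardCharsets.UTF_8);
                // Explicit charset, overriding whatever the default would be.
                exchange.getResponseHeaders().set("Content-Type", "text/xml; charset=UTF-8");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();

            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + path);
            HttpURLConnection conn =
                    (HttpURLConnection) uri.toURL().openConnection(Proxy.NO_PROXY);
            try {
                return conn.getResponseCode() + " " + conn.getContentType();
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(statusAndType("/hello"));   // a path the server knows
        System.out.println(statusAndType("/missing")); // a path it does not know
    }
}
```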

 

You can find some info in the text, pg 64-68, 81-84.

Idea of stateless HTTP server.

  

URLs for HTTP

http://hostname:port/uri?queryString

 

Query strings: in the URL for GET, in the body for POST.
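
A sketch of taking a URL apart with java.net.URI. The sample URL and the class name are hypothetical, and the query-string splitting deliberately ignores URL encoding.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

final class UrlPartsDemo {
    /** Splits http://hostname:port/uri?queryString into its named parts. */
    static Map<String, String> parts(String url) {
        URI uri = URI.create(url);
        Map<String, String> parts = new LinkedHashMap<>();
        parts.put("hostname", uri.getHost());
        parts.put("port", String.valueOf(uri.getPort())); // -1 if the URL gives no port
        parts.put("uri", uri.getPath());
        parts.put("queryString", uri.getQuery());
        return parts;
    }

    /** Splits name=value pairs joined by '&'. No URL decoding is attempted. */
    static Map<String, String> queryParams(String queryString) {
        Map<String, String> params = new LinkedHashMap<>();
        if (queryString == null || queryString.isEmpty()) {
            return params;
        }
        for (String pair : queryString.split("&")) {
            String[] nameValue = pair.split("=", 2);
            params.put(nameValue[0], nameValue.length > 1 ? nameValue[1] : "");
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> p = parts("http://localhost:8080/xxx/echo?name=Ann&n=3");
        System.out.println(p);
        System.out.println(queryParams(p.get("queryString")));
    }
}
```

For a POST of form data, the same name=value&name=value string travels in the request body instead, so queryParams would be applied to the body text.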

 

URL encoding--you do not need details.

 

HTML: understand links, images, and the basic idea of forms.

  

Important: SERVLETS and how they work. See servlet1 handout.

 

Tomcat is a Java program that runs in its own JVM and acts as a Web Container. The container holds multiple servlets.

 

How does TOMCAT execute?

When a request comes in, Tomcat, which is listening on the port, detects the new request and sets up the Request and Response objects for it. It creates a thread for this request.

 

This thread calls our servlet code, with req & resp objects as parameters. We can use the request object to access info from the query string.

 

We set up the response object by setting the content type, which determines the default encoding; XML implies UTF-8. This encoding applies to output written to the “out” object obtained from the response object.

To read a local file, we needed a ServletContext object; here we are seeing some of the web-container support.

 

How to set up the servlet DEPLOYMENT – handout on servlet1 example.

 

If we have Project xxx, we use xxx as the webapp name (our convention).

Deploy it with build.xml to the webapps/xxx area under tomcat. This is the deployment directory.

 

The webapps/xxx directory is the root of the Tomcat website for our app.

We typically have xxx/welcome.html as an entry point to the app: this gets served out by Tomcat acting as a simple web server.

Can also have xxx/yyy/something.html, or image files, etc., just like an ordinary website

 

Special subdirectory: WEB-INF and all its subdirs are hidden by tomcat from direct web service, so our servlet files cannot be accessed directly from the network.

Our servlet classes go in webapps/xxx/WEB-INF/classes, and the app’s web.xml goes in webapps/xxx/WEB-INF.

 

Also, our app owns a slice of this tomcat’s URL-space:   http://host:port/xxx/...

 

web.xml provides the mappings we need for our app: from URLs within our URL space to our servlets.

In web.xml, we have the servlet-name and the servlet-class in the servlet element, and then in the servlet-mapping element we have the servlet-name again with the url-pattern.

The servlet-name is the key that correlates the servlet-class information with the servlet’s url-pattern.
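
The correlation can be sketched as a small DOM program that reads a web.xml and joins the two element types on servlet-name. The Echo/EchoServlet//echo entries and the class name are hypothetical, and this is only an illustration of the lookup, not how Tomcat itself is implemented.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class WebXmlMappingDemo {
    // Hypothetical web.xml content, trimmed to the parts discussed here.
    static final String SAMPLE =
          "<web-app>"
        + "<servlet>"
        + "<servlet-name>Echo</servlet-name>"
        + "<servlet-class>EchoServlet</servlet-class>"
        + "</servlet>"
        + "<servlet-mapping>"
        + "<servlet-name>Echo</servlet-name>"
        + "<url-pattern>/echo</url-pattern>"
        + "</servlet-mapping>"
        + "</web-app>";

    /** Returns url-pattern -> servlet-class, joined on servlet-name. */
    static Map<String, String> urlToClass(String webXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(webXml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }

        // Step 1: servlet-name -> servlet-class, from the <servlet> elements.
        Map<String, String> classByName = new HashMap<>();
        NodeList servlets = doc.getElementsByTagName("servlet");
        for (int i = 0; i < servlets.getLength(); i++) {
            Element s = (Element) servlets.item(i);
            classByName.put(childText(s, "servlet-name"), childText(s, "servlet-class"));
        }

        // Step 2: url-pattern -> servlet-class, looking up each mapping's servlet-name.
        Map<String, String> classByUrl = new LinkedHashMap<>();
        NodeList mappings = doc.getElementsByTagName("servlet-mapping");
        for (int i = 0; i < mappings.getLength(); i++) {
            Element m = (Element) mappings.item(i);
            classByUrl.put(childText(m, "url-pattern"),
                           classByName.get(childText(m, "servlet-name")));
        }
        return classByUrl;
    }

    private static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        System.out.println(urlToClass(SAMPLE));
    }
}
```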

 

build.xml is also important here. Note that there is no schema for it!

 

In this class, we also talked about PAs and HW, which are included in the midterm.

 

PA1: using reflection to extract info about a Java source and writing XML using that info.

Don’t worry about the complicated extraction of type info in the solution.

Do understand the class and method properties extraction and XML representation, and how the recursion works.

Also understand how we were able to run Counter under our program.
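
As a reminder of the idea only (this is not the PA1 solution, and the Counter below is a made-up stand-in), here is a sketch that uses reflection to list a class's methods as XML and then runs the class by invoking its methods reflectively. The XML element and attribute names are assumptions.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

final class ReflectToXmlDemo {
    // Hypothetical stand-in for the class being inspected.
    static class Counter {
        private int count;
        public void increment() { count++; }
        public int get() { return count; }
    }

    /** Writes the class name and its declared methods as XML. */
    static String toXml(Class<?> c) {
        StringBuilder xml = new StringBuilder();
        xml.append("<class name=\"").append(c.getSimpleName()).append("\">\n");
        Method[] methods = c.getDeclaredMethods();
        // getDeclaredMethods() has no guaranteed order, so sort for stable output.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        for (Method m : methods) {
            if (m.isSynthetic()) {
                continue; // skip compiler-generated methods
            }
            // Java identifiers need no XML escaping.
            xml.append("  <method name=\"").append(m.getName())
               .append("\" returns=\"").append(m.getReturnType().getSimpleName())
               .append("\" modifiers=\"").append(Modifier.toString(m.getModifiers()))
               .append("\"/>\n");
        }
        xml.append("</class>\n");
        return xml.toString();
    }

    /** Runs Counter purely through reflection: create it, increment twice, read it. */
    static int runCounter() {
        try {
            Class<?> c = Counter.class;
            Object counter = c.getDeclaredConstructor().newInstance();
            Method increment = c.getMethod("increment");
            increment.invoke(counter);
            increment.invoke(counter);
            return (Integer) c.getMethod("get").invoke(counter);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(toXml(Counter.class));
        System.out.println(runCounter());
    }
}
```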

 

Servlet1 and 2: basic servlet technology

 

PA2: XML servlet.