CS639 Syllabus, Spring 2013

XML and Semi-structured Data

Professor: Betty O’Neil
Class meets MW 4:00-5:15 in M-2-214

Office Hours: MW 2:30-3:30, 6:15-6:45 in S/3/169

Prerequisites
Significant Java experience, including use of the Java Collection classes, and one of CS451/651, CS636, or CS437/637 (or other compiler-related or database application experience, or CS420, CS450, or CS430/630).

Textbooks:
1. Mostly for the first part of the term:

Processing XML with Java, by Elliotte Rusty Harold, Addison-Wesley, ISBN 0-201-77186-1. Available free at http://www.cafeconleche.org/books/xmljava/, but worth paying for in hardcopy (1071 pages!)

2. Mostly for the later part of the term, but has useful intro topics:

REST in Practice, by Jim Webber, Savas Parastatidis, and Ian Robinson, O'Reilly, ISBN 978-0-596-80582-1 (at Amazon)


NOTE: Get a UNIX account for cs636 by running apply, even if you already have a UNIX account here.  See the class web page at www.cs.umb.edu/cs639 and follow the link to the Software Development Setup for UNIX and your home PC.

NOTE: Although the course title mentions semi-structured data, we will not cover it explicitly. XML can represent semi-structured data, so knowing how to use XML means being able to handle semi-structured data when you need to.

Since there is no web technology prerequisite, we will study web technology basics along with the XML coverage.

Topics
We will be following the Harold book (text #1) as much as possible, with additional coverage of Web Services (mostly "RESTful web services"), using Webber et al (text #2).

0. Introduction. Basic ideas of XML for portable data, Web services (both classic SOAP and lighter-weight REST), loosely coupled distributed computation.  Network and web programming basics: client-server, server-side vs. client-side programming.  The browser as the universal client for web users.  Data storage and transport: XML vs relational database tabular data vs JSON (now commonly used with Ajax)

1. Basic XML: XML Documents (Chap. 1 of Harold), validity, DTDs and XML schemas.  Example of build.xml of ant. Simple XPaths.  Web technology: URLs, URIs (Chap 1 of Webber et al), HTML links and forms, web servers providing HTML pages via HTTP GETs from a browser.

2. XML Protocols (Chap. 2 of Harold): RSS, Atom (not in Harold, see Chap 7 of Webber et al), SOAP, the underpinning of classic Web Services. REST has no protocol of its own, i.e., it uses the HTTP protocol directly. Writing clients for provided web services. Web tech: HTTP GETs, POSTs, PUTs, etc. The idea of web servers providing dynamic HTML and XML as well as static HTML, specifically servlets, the J2EE way to provide dynamic web output, using tomcat (a web server that can host servlets).

3. Writing XML with Java (Chap. 3, 4 of Harold), character encodings.  Web tech: installing and running your own tomcat server. Basics of RESTful web services (Webber et al Chap. 2). Using POX, plain old XML delivered by servlets (Webber et al Chap. 3).

4. Reading XML: SAX (Chap. 5-8 of Harold).  Web tech: CRUD Web Services and WADL (Webber et al Chap. 4), intro to hypermedia protocols (Webber et al, Chap. 5).

5. XML as trees of objects in memory: DOM and JDOM (Chap. 9-15 of Harold).  Web tech: implementing web services (SOAP and RESTful) running on tomcat, using JAX-WS (by example) and JAX-RS (Webber et al, Chap. 5), and their clients.

6. Querying XML: More on XPath (Chap. 16 of Harold).  Web tech: Describing Web services with WSDL.
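
As a rough preview of topics 4-6, here is a minimal sketch of the three ways of reading XML named above (SAX events, a DOM tree, and an XPath query), using only the JAXP classes that ship with the JDK. The orders document, the class name XmlReadingSketch, and the method names are invented for illustration; they are assumptions of this sketch and are not taken from either textbook or from the course assignments.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class XmlReadingSketch {
    // Hypothetical sample document, made up for this sketch.
    static final String ORDERS =
        "<orders>"
      + "<order id=\"1\"><item>latte</item><cost>3.50</cost></order>"
      + "<order id=\"2\"><item>mocha</item><cost>4.00</cost></order>"
      + "</orders>";

    // SAX: the parser pushes events at a handler as it reads; nothing is
    // kept in memory unless the handler chooses to save it.
    static List<String> orderIdsWithSax(String xml) throws Exception {
        final List<String> ids = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("order".equals(qName)) {
                    ids.add(atts.getValue("id"));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return ids;
    }

    // Helper: parse the whole document into an in-memory DOM tree.
    static Document parseToDom(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // DOM: walk the tree of nodes after the whole document is loaded.
    static List<String> itemsWithDom(String xml) throws Exception {
        NodeList items = parseToDom(xml).getElementsByTagName("item");
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < items.getLength(); i++) {
            names.add(items.item(i).getTextContent());
        }
        return names;
    }

    // XPath: a path expression selects a value from the DOM tree.
    static String xpathString(String xml, String expression) throws Exception {
        return XPathFactory.newInstance().newXPath()
            .evaluate(expression, parseToDom(xml));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(orderIdsWithSax(ORDERS));   // prints [1, 2]
        System.out.println(itemsWithDom(ORDERS));      // prints [latte, mocha]
        System.out.println(
            xpathString(ORDERS, "/orders/order[@id='2']/cost")); // prints 4.00
    }
}
```

The trade-off to notice: SAX never holds the whole document, so it suits large inputs, while DOM and XPath need the full tree in memory but make navigation and querying much shorter to write.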

Grading: simple point system

Midterm: 100 points, Final: 150 points, Assignments: various, about 150 points total

ACCOMMODATIONS:
Section 504 of the Americans with Disabilities Act of 1990 offers guidelines for curriculum modifications and adaptations for students with documented disabilities. If applicable, students may obtain adaptation recommendations from the Ross Center for Disability Services, M-1-401, (617-287-7430). The student must present these recommendations and discuss them with each professor within a reasonable period, preferably by the end of Drop/Add period.

STUDENT CONDUCT:
Students are required to adhere to the University Policy on Academic Standards and Cheating, to the University Statement on Plagiarism and the Documentation of Written Work, and to the Code of Student Conduct as delineated in the catalog of Undergraduate Programs, pp. 44-45, and 48-52. The Code is available online at: http://www.umb.edu/life_on_campus/policies/code/