CS639 Homework 1: Web basics, XML Validation, Intro to ant, tomcat

Due Thurs., Feb. 7, in class, on paper

1. (Optional: just do this if you need the review.) Basic HTML and HTML links.

· Learn about absolute and relative URLs, say by looking at this tutorial on URLs.

· Study the HTML tutorials linked to the class web page.  We are not studying "presentation", that is, the details of how a page looks to a user.

Use a plain editor like emacs to compose a web page test1.html with page title "Mytitle", contents entitled "Important links", and a relative link to a copy of this file hw1.html in the same directory as test1.html, and an absolute link to the root of our departmental website, with appropriate descriptive text for the user to see. Then have a link to Google labeled "latest Java XML news" that searches for "java XML news".  The easy way to do this is to use Google interactively and copy the URL from the browser's address window.  But simplify the query string in it (after the ?) down to the minimum that has these keywords and still works.   Include the text of test1.html in your homework submission.
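
As background for the relative and absolute links in problem 1, here is a minimal Java sketch of how a relative reference is resolved against the URL of the page that contains it. The base URL (www.example.edu) and the class and method names are hypothetical, chosen only for illustration; java.net.URI is the standard JDK class.

```java
import java.net.URI;

public class UrlResolveSketch {
    // Resolve a link reference against the URL of the page that contains it.
    static String resolve(String base, String ref) {
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        // Hypothetical location of test1.html, for illustration only.
        String base = "http://www.example.edu/cs639/test1.html";

        // A relative reference replaces the last path segment of the base.
        System.out.println(resolve(base, "hw1.html"));

        // A reference starting with "/" is resolved from the site root.
        System.out.println(resolve(base, "/"));

        // An absolute URL ignores the base entirely.
        System.out.println(resolve(base, "http://www.example.org/index.html"));
    }
}
```

A browser does the same kind of resolution for every href and src it finds in a page.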

2. (optional—just do this if you need the review) Review Java Collection classes.  Look at the Collection Framework home page at Sun, and from there the Collections Framework Overview and Collections Framework Annotated Outline docs at Sun, for Java 1.6.1.  If you have been using Java 1.4, note that with Java 1.5/1.6/1.7, aka Java 5/6/7, we can (and should) use generics such as
List<Integer> numbers = new ArrayList<Integer>();  In Java 1.4, we previously wrote List numbers = new ArrayList(); and hoped we didn't stick something other than an Integer in numbers by mistake.  If you are new to Java 5/6/7, or need a brush-up, read the Tutorial linked to that same Collection Framework home page.
a.  What are the two most important concrete classes that are available in the JDK for the Set interface?  For the Map interface?
b.  What is the immediate superclass to HashMap?  Can it be used with "... x =  new ..." to create a new object?  Explain your answer.
c.  Explain how you can find all the elements of a given Set object.  Does your answer also apply to Lists?  other things?  what class of objects?
d.  Explain how you can find all the keys of a given Map.
e.  Consider a certain Set object s, with elements e1 and e2, and another object x of the same type as e1 and e2. What tests on x vs. e1 and e2 determine whether x is considered to be in set s or not? In particular, what element-class methods are called by the Set implementation code to make this determination?
f.  Write a Java fragment that creates a Map from String to Integer. Add the association "x" -> 1.
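
The generics remark at the start of problem 2 can be illustrated with a short sketch. It contrasts a typed List<Integer> with a Java 1.4-style raw List; the class and method names are made up for this example, and it deliberately uses only List so that it does not stand in for any of the answers above.

```java
import java.util.ArrayList;
import java.util.List;

public class GenericsSketch {
    // With generics the compiler guarantees every element is an Integer.
    static int sum(List<Integer> numbers) {
        int total = 0;
        for (Integer n : numbers) {
            total += n;
        }
        return total;
    }

    // Java 1.4 style: a raw List accepts anything, and the mistake only
    // shows up later as a ClassCastException when an element is read back.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsOnRead() {
        List raw = new ArrayList();
        raw.add(Integer.valueOf(1));
        raw.add("oops"); // compiles, but is not an Integer
        try {
            for (Object o : raw) {
                Integer n = (Integer) o; // the cast fails on "oops"
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<Integer>();
        numbers.add(1);
        numbers.add(2);
        // numbers.add("oops"); // would be rejected at compile time
        System.out.println("sum = " + sum(numbers));
        System.out.println("raw list failed on read: " + rawListFailsOnRead());
    }
}
```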

3. XML Well-formedness. Find the error in $cs639/campus-not-well-formed.xml and describe it.  Hint: try to display it in a browser.  Fix the error and show a snippet of XML around your fix in your homework paper.
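
Besides a browser, a non-validating parser will also report well-formedness errors. The following is a minimal sketch using the standard JAXP SAX API in the JDK; it is not the course's Counter program, and the class name, method name, and the two tiny test documents are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Parse the text with a non-validating SAX parser and report the result.
    static String check(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return "well-formed";
        } catch (SAXParseException e) {
            // Well-formedness violations are fatal errors with a line number.
            return "not well-formed at line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (Exception e) {
            // Parser configuration or I/O trouble, not expected for in-memory input.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up documents: the second one never closes <b>.
        System.out.println(check("<a><b/></a>"));
        System.out.println(check("<a><b></a>"));
    }
}
```

The exact wording of the error message depends on the parser implementation, so the sketch only relies on the exception type and line number.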

4. XML Validation. Log in to our Linux host and copy everything from $cs639/validate to your own cs639/hw1/validate ("cp -r $cs639/validate ." while cd'd to hw1). Recompile Counter.java as a check of your Java setup. See README there for some useful info.
a. Run Counter on each of the 6 *greeting*.xml files in validate, using appropriate flags for each, and report the flags used and the output. "Appropriate flags" means none if the XML has no linkage to DTD or XML Schema, "-v" if there is linkage to DTD, and "-s -v" if there is linkage to XML Schema.
b. Run Counter on the other invalid greetings.xml files displayed in Chap. 20 of the XML Bible; call them invalid_greetings1.xml and invalid_greetings2.xml.
c. What does this validator report for campus-not-well-formed.xml?

5.  (optional—just do this if you need the review) Start learning or reviewing ant. First make sure you understand command-line use of javac and java with packages, by reading this Packages tutorial. Read the ant tutorials linked from the class web page.  
a.  In the Hello World tutorial, the javac and java commands are set up for the case that the user is cd'd to the project base directory, where src and build appear as subdirectories. Suppose the user cd's to the src directory: what are the corresponding javac and java commands in that case?
b.  Find and read the details on the ant delete task at the site where the tutorial resides.  What does the line <delete dir="build"/> do?  What is the corresponding command on UNIX?  on Windows?
c.  Modify the build.xml of the first example in the Hello World tutorial for "oata.HelloWorld" to be for the same java file (except for the package declaration) but now made to be in package "com.oata", following the usual convention that the package name is the site name in reverse order.  (Only one tiny change is needed, showing the ease of refactoring this way with ant.)  What is the new location for the source file?

6. Give a quick report on your software installation for this course.
a. Have you done the setup for Linux, or optionally Linux/UNIX? (get cs639 account, test java and ant, define $cs639 and test it, get rid of any lingering CLASSPATH definitions) Report any problems.
b. Have you installed the Java6/7 JDK on your home PC? Any problems?
c. Have you installed eclipse JEE on your home PC? Any problems?
d. Have you set your user environment variables JAVA_HOME and ANT_HOME, and added their bin directories to Path?
e. Did all the tests pass?

7. (Optional, if needed…) HTTP.
a. Use a browser to look at the tiny HTML test page at www.cs.umb.edu/cs639/test.html. Give the connection (server, port) and GET command that was issued by your browser.
b. At the command line in UNIX or Windows, do the command "telnet www.cs.umb.edu 80" to connect your keyboard and screen to our departmental web server, which runs on host www.cs.umb.edu on TCP port 80, the normal HTTP port.  You will get no output from it immediately. Instead, it is waiting for your request.  Type "GET /cs639/test.html HTTP/1.0" followed by two carriage returns. (Use HTTP 1.0 even though browsers use HTTP 1.1, so the web server expects less from you.) You may have to type this without seeing anything on the screen; after all, this is set up to talk to programs, not real users.  The second carriage return (making a blank line) tells the web server that you are done with the request.  Then it will return the HTTP response, consisting of the header followed by the contents of the test page, and then drop the connection.  Capture the output and record it in your homework paper, including the message about the connection going away.  Indicate the header and contents.
c.  Note that HTTP is stateless.  Once the HTTP response is sent off, the web server forgets all about that request and goes on to the next one.  List the sequence of server, host connections and HTTP requests and responses that happen when a browser goes to access a static web page (imagined) at www.cs.umb.edu/cs639/test2.html with two images with relative URLs image1.jpg and image2.jpg.
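
The telnet exchange in 7b can also be sketched in Java, to show what "talking to programs" looks like from the program's side. This is only an illustration under stated assumptions, not a replacement for doing the telnet session: the class and method names are invented, the request is sent with CRLF line endings as telnet would send them, and the response is split at the first blank line into header and contents.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class RawHttpGet {
    // The request line followed by a blank line, which ends the request.
    static String buildRequest(String path) {
        return "GET " + path + " HTTP/1.0\r\n\r\n";
    }

    // Split a response at the first blank line into {header, contents}.
    static String[] splitResponse(String response) {
        int i = response.indexOf("\r\n\r\n");
        if (i < 0) {
            return new String[] { response, "" };
        }
        return new String[] { response.substring(0, i), response.substring(i + 4) };
    }

    // Open one TCP connection, send one request, read until the server closes.
    static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) { // -1 means the connection was dropped
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.out.println("usage: java RawHttpGet <host> <port> <path>");
            return;
        }
        String[] parts = splitResponse(fetch(args[0], Integer.parseInt(args[1]), args[2]));
        System.out.println("--- header ---");
        System.out.println(parts[0]);
        System.out.println("--- contents ---");
        System.out.println(parts[1]);
    }
}
```

For example, "java RawHttpGet www.cs.umb.edu 80 /cs639/test.html" would issue the same request as the telnet session, assuming the server is reachable from where you run it. Because HTTP/1.0 drops the connection after each response, each further resource on a page needs its own call to fetch.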

Tomcat and needed tunnels

Tomcat is the servlet-capable web server that we will be using to execute web applications.  I'll leave my installation of tomcat running on users2.cs.umb.edu (on port 11600) for your experimentation. To access this port from home, you need to set up a putty tunnel: see Setting up putty tunnels. You can look at this tomcat's files at ~eoneil/cs639/tomcat-6.0 in the UNIX/Linux filesystem. My directory ~eoneil/cs639 has the tomcat installation, made following the student instructions. In other words, I'm pretending to be a student in the class, but letting you see the results, whereas when you put subdirectories under your own cs639 directory, no one else in the class can see them.  IMPORTANT: Don't ever change permissions on your cs639 directory.

8.  Try out tomcat, accessing just HTML pages to start.
a.  Browse (via tunnel) to http://users2.cs.umb.edu:11600 to see the "root" page. This means using URL http://localhost:11600 in your browser, with the tunnel set up to deliver this TCP stream to port 11600 on users2.cs.umb.edu. You should see a picture of a tomcat and some text about the Apache Tomcat project, along with links, including some to JSP examples of interest.  Then browse to http://users2.cs.umb.edu:11600/cs639/index.html to see my little index.html page; you will be making a similar one for yourself.  This file index.html is located at file path ~eoneil/cs639/tomcat-6.0/webapps/cs639/index.html.  You are welcome to look at it in the UNIX/Linux filesystem.  The webapps directory is the root directory of the website served by my tomcat. That's why the URL http://users2.cs.umb.edu:11600/cs639/index.html uses the part of the file path after webapps, the local path.  Give the UNIX command you used to display this file while cd'd to your own login directory.  Also give the UNIX/Linux command to display the test page in the class home page discussed in problem 7.
b. Use telnet to access the same file: run "telnet localhost 11600", then type "GET /cs639/index.html HTTP/1.0" followed by two carriage returns, and record the response.