CS634 – Homework 6

Docker containers, Crash Recovery, Multidimensional Data

Instructions: The homework is due in class on Monday, May 7. Please prepare paper copies (typeset or handwritten copies are both fine, as long as the handwriting is clear). 30 points (5 points each).

Part 1 Cloud Technology: Docker containers

First watch Jenny Fong's video Docker 101, at least to minute 32, paying particular attention to what the developer needs to do. As director of product marketing at Docker, Inc., she is trying to sell you, but she does know the tech aspects. Note these topics:
--virtualization vs. componentization
--what's in a container (everything an app needs except the kernel), examples
--idea of Docker images as files holding enough data to create a container
--running docker as a developer (on her Mac, we'll use our Linux cloud VMs): pull image, use it to create a container, etc.
--how images can be moved from developer environment to production environment, including cloud

Then study https://cloud.google.com/compute/docs/containers/ but don't worry about Kubernetes (it's for production use of containers) or LXD; just concentrate on Docker. Follow the link about installing Docker on Linux and install it on your Linux VM, using the stable community edition. You can go directly to https://docs.docker.com/install/linux/docker-ce/debian/ for this. Step 4 runs hello-world. Here is my output for that:

eoneil@eonvm:~$ sudo docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
ca4f61b1923c: Pull complete
Digest: sha256:97ce6fa4b6cdc0790cda65fe7290b74cfebd9fa0c9b8c38e979330d547d22ce1
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

1.a. Report on any problems you had in installing Docker on your VM. Show that it worked by showing the output from running hello-world in step 4, like the above output. Note that you need to use sudo with the docker commands on these VMs.

b. Create a directory ~/docker/hello on your cloud VM for this work. For a Java Hello World example, we can put a JDK in a container and use it. See https://hub.docker.com/r/frolvlad/alpine-oraclejdk8/ for a slimmed-down JDK for containers and its two-line Hello World example. The first line just writes Main.java as a hello-world program in the current directory. The second line runs the container (which gets created from a downloaded image on the way) with "--rm" to clean up the container at exit and -v to give the container file access to the current directory (with the path given by the output of the pwd command). Inside the container the current directory is seen as "/mnt", and because of --workdir, it is also the current directory of the in-container process. The second line goes on to run the shell sh inside the container (as the top-level in-container process) with the command "javac Main.java && java Main" to compile and run Main.java. Again, add "sudo" to this command and show the output on your VM. Then use "sudo docker images" to list the images: you should see "hello-world" and "java". Rerun the java image by its local name "java".

Here is a command I used to use the JDK image (local name java) to just compile Main.java:

sudo docker run --rm -v "/home/eoneil/docker/hello":/mnt --workdir /mnt java javac Main.java

Modify this for your setup and show that it works by deleting Main.class before running it and displaying it with ls afterward. Then compose a similar container command to run Main given Main.class, and show it in your paper.

Note that when we use sh along with javac and java as specified by the container command above, we are actually using two processes at once inside the container, the shell as top-level process and one of the java tools as a child process. Normally there is only one process active in a container. In a sense the shell is so lightweight as not to count as a serious process. It's not that a container can't do multiple processes, it just normally has a single job to do.

c. It's not usual to simply containerize the JDK, as done in part b. More commonly, a Java app is containerized along with the JDK it needs. The usual way to do this is with a Dockerfile. See this Docker tutorial for the Dockerfile for another single-file Java app named PingPong.java. Copy that Dockerfile to your hello directory and change all the PingPong strings to Main in the Dockerfile. You can also delete the EXPOSE line, since we are not using TCP connections here. Then build the new containerized hello app with "sudo docker build -t hello ." The dot is for the current directory, where the Dockerfile resides. The option -t gives the tag "hello" to the image, so it can be run with "sudo docker run hello", or, to clean up the exited container, with "sudo docker run --rm hello". We no longer need the -v option because the Dockerfile arranges to copy Main.java into the container. Also, we are now using a full JDK, not the slimmed-down one. In your homework paper, note the sizes listed by "docker images", and include your Dockerfile and the output of a run.

2. a. Create a directory ~/docker/pingpong and put PingPong.java and its Dockerfile there. Build and run it as explained in the tutorial. You may need to install curl to do this. Show your commands and their output in your paper. This example is a step closer to a "real" app because it communicates with a TCP connection instead of a terminal. Most real containerized apps take input and send output via TCP stream connections. Only debug output might go to the terminal.

b. Change PingPong.java to accept requests to /ping as originally and also to /ping2, for which it sends back "pongpong". You see it's pretty easy to have one server handle multiple URLs. Show your code and its testing in your homework paper.
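As a rough illustration of how one server can answer on several URLs, here is a minimal sketch using the JDK's built-in com.sun.net.httpserver package. It is not the tutorial's PingPong.java: the class name RouteDemo, the paths /status and /version, and the reply strings are all hypothetical, chosen only to show the mechanism of registering one context per path.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class RouteDemo {
    // Start one server and register a separate handler for each URL path.
    // The paths and replies are hypothetical placeholders.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/status", exchange -> reply(exchange, "ok\n"));
        server.createContext("/version", exchange -> reply(exchange, "demo-1\n"));
        server.setExecutor(null); // null selects the default executor
        server.start();
        return server;
    }

    // Send a 200 response with the given text body.
    static void reply(HttpExchange exchange, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(200, bytes.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(bytes);
        }
    }

    // Small client helper (plays the role of curl): GET a path, return the body.
    static String fetch(int port, String path) throws IOException {
        URL url = new URL("http://127.0.0.1:" + port + path);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

One detail worth knowing with this API: a request goes to the context whose path is the longest matching prefix of the request path, so a context registered at one path also receives requests for longer paths that begin with it unless a more specific context is registered as well.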

3. We know that mysql uses a TCP connection to communicate with its client, so it can be usefully containerized. Docker images for mysql are available for easy download. In fact, we can try "sudo docker run --name=test-mysql mysql" and the image download will happen as well as the container build, but the run will fail because no root password was given. See https://severalnines.com/blog/mysql-docker-containers-understanding-basics for how to do it successfully.

a. Try out the steps in that tutorial, at least through the line "mysql -uroot -pmypassword -h 172.17.0.20 -P 3306" that connects to the containerized mysql. Sentence added 4/25 (and see more below): But specify version 5.7 of mysql by using "sudo docker run --name=test-mysql mysql/mysql-server:5.7", to avoid compatibility problems with the mysql command we already have on our systems. Note that you shouldn't have to install mysql-client, because you already have a working mysql client command. It's OK to let the mysql/MariaDB server on your system keep running, because the container's mysql is in its own world, on a different network. Sentence added 4/24: As shown in the slides for class 19, you can find the IPAddress of your own container by first finding its id with "sudo docker ps" and then running "sudo docker inspect <container-id>". The local Linux system knows about this network (that's why mysql -h 172.17.0.20 works in the tutorial) but doesn't let other systems know about it unless you arrange it via docker commands. The commands to make port 6603 work externally are shown in the tutorial (after the mysql command), but we don't need this capability, and we would also have to open up the port in Google's firewall to use it (possible, but a nuisance). While the container is still executing, try "ping <ContainerIPaddress>" from a shell on your VM.

Added 4/26: Using "mysql/mysql-server:5.7" in the first command makes "mysql/mysql-server:5.7" the image name for the commands after that. If you want a shorter name, you can use the docker tag command to attach a second name to the image, for example "sudo docker tag mysql/mysql-server:5.7 mysql57". Then the second run command in the tutorial is "sudo docker run --name=test-mysql --env="MYSQL_ROOT_PASSWORD=mypassword" mysql57". This should work as discussed, but it similarly hangs up the terminal. The third command uses --detach (or -d for short) to run mysql as a daemon. Then find its IP address as shown. However, this mysql server does not allow root to log in from a remote host (note that the main host is remote to the containerized mysql), and shows an error such as "ERROR 1130 (HY000): Host '172.17.0.1' is not allowed to connect to this MySQL server". IP address 172.17.0.1 is the main host's address on the Docker network (you can see all the host's IPs using the command /sbin/ifconfig). A search on this error string yields the solution at StackOverflow. The problem is that in this version of mysql the built-in root user is only allowed to log in from local mysql clients (a tighter security measure). To fix this, we need to log in once and allow root to log in from anywhere. We can log in with the mysql command inside the container, by way of the docker exec command, and fix up the root user account. After this, you should be able to log in using the host's mysql. You can save the fixed-up container as an image with "sudo docker commit <container-id> mysql57-fixed". Here is a typescript (use your own container id, of course):

eoneil@eonvm:~/docker/util$ sudo docker exec -it 983cb5845af3 bash
bash-4.2# mysql -u root
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)
bash-4.2# mysql -u root -p
Enter password: (typed mypassword here)
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 28
Server version: 5.7.22 MySQL Community Server (GPL)   <--Check this: we don't want version 8!
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use mysql

exit
eoneil@eonvm:

Now you should be able to run "mysql -h 172.17.0.x -u root -D mysql -p" from the host's shell, where -D mysql is there to override the database specified in your ~/.my.cnf file, if any, and x is found by inspecting the container for its IPAddress.

b. In the same tutorial, skip down to the section on Data Storage. As set up above, the data is trapped inside the container, so if the system crashes, so does the data. Read about how to use -v (or --volume, same option) to get the containerized mysql to use files on the host for its data. In the tutorial example, the internal /var/lib/mysql is referencing the external /storage/docker/mysql-datadir. In the homework, give the command to run a containerized mysql using /data/mysql/datadir for data on the host, with no external port exposed, and using the default configuration.

4. Optional challenge. What we would like to see is a containerized Java program using JDBC to connect to a containerized mysql. There is a tutorial showing how to do this. See https://www.linkedin.com/pulse/running-java-application-mysql-linked-docker-deepak-sureshkumar/. Among other things, it shows how to get the mysql jar file onto the command line for a container (recall that we need to run a JDBC app with "java -cp mysql-connector-version.jar:. JdbcCheckup"). Added 4/26: It turns out you need a LinkedIn account to view this. Email me for a (rough) copy if you aren't a member and want to see it.
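For orientation, the JDBC side of such a program might look like the following minimal sketch. It is not the code from the LinkedIn tutorial: the class name JdbcContainerCheck and the helper jdbcUrl are hypothetical, the default host 172.17.0.2 is only a placeholder for whatever IPAddress "docker inspect" reports for your mysql container, and the password assumes the tutorial's "mypassword". It also assumes the MySQL connector jar is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class JdbcContainerCheck {
    // Build a MySQL JDBC URL from the container's address, port, and database.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder defaults: pass your own container IP and root password.
        String host = args.length > 0 ? args[0] : "172.17.0.2";
        String password = args.length > 1 ? args[1] : "mypassword";
        String url = jdbcUrl(host, 3306, "mysql");

        // Connect as root and ask the server for its version string.
        try (Connection conn = DriverManager.getConnection(url, "root", password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT VERSION()")) {
            if (rs.next()) {
                System.out.println("Connected to MySQL " + rs.getString(1));
            }
        }
    }
}
```

Following the command form above, this would be run as "java -cp mysql-connector-version.jar:. JdbcContainerCheck <ContainerIPaddress>"; without the connector jar on the classpath, DriverManager reports that no suitable driver was found.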

Part 2 Textbook Problems

Question 1 Transaction Rollback

Exercise 18.4 in the textbook. Read the solution to 18.3 to provide a model for your solution of part 2.

Question 2 Crash Recovery

Exercise 18.5 in the textbook. Read the solution manual for the most complicated case (two crashes during recovery) for format and getting started.

a. First show the recovery that would occur if there were no crashes during recovery.

b. Then show the recovery that would occur if there is one crash during recovery, after the first recovery wrote two records.

Question 3 Multidimensional Data: Cross-tabs

a. Exercise 25.2. This pivot is to a cross-tabs display on pid and timeid, represented by their pnames and year numbers. Assume timeid=1 is 1995, timeid=2 is 1996, and timeid=3 is 1997, to be consistent with Figure 25.5, which is your model of the right format for the answer here. (For some reason, the time dimension table is not given in the book.)

b. Exercise 25.3 (parts 1,3)