CS451/CS651 Compilers
Bill Campbell
Spring 2008

This document is available at http://www.cs.umb.edu/~wrc/cs651/s08/.  On the department's Unix file system you can find the documents for this course at ~wrc/cs651.

Hi, my name is Bill Campbell. This course is about compilers: programs that translate programs written in a high-level programming language such as Java into a low-level target machine language, such as that of the Java Virtual Machine (JVM) or the Intel 386 family of computers.

Our study of compilers will involve both theory and practice. There will be many exercises similar to those found in CS 420. When it comes to theory we shall be following the textbook closely and we'll take many of our exercises from there. We shall also write many programs; indeed we shall write ourselves a compiler, albeit in pieces. We will start from a compiler for a subset of Java called j-- and extend that to add new language features.

Our compiler will translate programs written in a non-trivial subset of Java to .class files, which can be executed on the JVM. So that we don't have to deal with the intricacies of properly constructing .class files, we'll use a tool that Swami Iyer has implemented to create the class files; we call this the emitter.

An old friend, Bruce Knobe, has always argued that Compilers is really a software engineering course. In a sense it is, since it's probably the first course in which each programming assignment must make use of the programs written in all previous programming assignments. Moreover, we end up writing a program composed of many components. It is assumed that you are proficient in Java.

Like many programming courses, this one requires time. Indeed, this course has a reputation for eating up lots of time. After all, we're doing lots of theoretical exercises and writing a compiler! Do not be surprised if you find yourself spending 20 hours per week on this course. If you cannot spend a lot of time here, then I suggest you consider withdrawing right now. If you stick it out, I think you will have a lot of fun.

Syllabus

 

  • Introduction: compilers vs. interpreters, compiler structure. [Campbell, Iyer and Akbal, 1]
  • The Java Virtual Machine and its run-time environment.
  • A map of the j-- compiler.
  • Lexical analysis (scanning). Regular expressions, FSAs and DFAs. [Campbell, Iyer and Akbal, 3, Java Spec 3]
  • Context-free grammars and parsing. [Campbell, Iyer and Akbal, 4, Java Spec 2, 18]
  • Top-down parsing. Recursive descent. [Campbell, Iyer and Akbal, 4]
  • Bottom-up parsing. LR parsers. [Campbell, Iyer and Akbal, 4]
  • Using a parser generator. JavaCC. [Campbell, Iyer and Akbal, 4, https://javacc.dev.java.net]
  • Semantic analysis. [Campbell, Iyer and Akbal, 5, Java Spec]
  • Code generation. [Campbell, Iyer and Akbal, 6, CLEmitter]
  • Code optimization. [Campbell, Iyer and Akbal, 8]
  • Industrial-strength compilers. [Campbell, Iyer and Akbal, 9]
  • Wrap-up.

Text Books

Readings and exercises will be assigned from the following required text, which I will distribute in class.

  • William Campbell, Swaminathan Iyer and Ayse Bahar Akbal. Compiler Construction in a Java™ World.

Another book, which you may find useful, is

Sun Microsystems has a good set of online documents about Java, specifically

  • An index to the Java 2 Platform, Standard Edition documentation.
  • Java Language Specification, Third Edition. This will be our Bible when it comes to deciding how the various components of our compiler should deal with Java.
  • Java 2 Platform, Standard Edition, 6.0 API Specification
  • The Java Virtual Machine (JVM) Specification

Chapters of the text will be distributed in class.  These include:

  • Chapter 1: Compilation
  • Chapter 2: j--, the CLEmitter and the JVM Instruction Set
  • Chapter 3: Lexical Analysis
  • Chapter 4: Parsing
  • Chapter 5: Type Analysis
  • Chapter 6: Code Generation
  • Appendix A: Setting up and running j--
  • Appendix B: j-- Syntax
  • Appendix C: Java Syntax
  • Appendix D: The CLEmitter and the JVM instruction set

The j-- code tree is available at http://www.cs.umb.edu/~wrc/j--/j--.zip

Assignments and Grading

Your grade will be determined as follows:

  • Programming Exercises (the compiler) 60%

Two important things here:

            Every assignment should be accompanied by a narrative describing how you went about designing and writing your program, the problems you faced, the alternative solutions you considered, and why you chose the solutions you did. I should have a pretty good idea of what your program looks like from reading your narrative.

            Your work must be your own. If someone gives you an idea, acknowledge it in your narrative.

            Assignment 1: Compiling Additional Operators

 

Examples of good p1 narratives:

dusenbury-p1-narrative.rtf

ozual-p1-narrative.txt

peri-p1-narrative.doc

ward-p1-narrative.doc

            Assignment 2: Scanning Java Tokens

            Assignment 3: Parsing

            Assignment 4: Analysis and Code Generation I

            Assignment 5: Analysis and Code Generation II

             

  • Midterm Examination (Wednesday, March 26) (sample) 15%
  • Final Examination 25%

Reaching Me

My office is S-3-183; it is just behind the department office. My office hours are in my .plan. My office telephone number is 617-287-6449; my home telephone number is 617-547-2738. Please do not telephone me at home before 9am or after 10pm. My email address is william.campbell@umb.edu; I read my email regularly.

If, outside of class, you have a question about something you don’t understand, post it to the cs451/cs651 group in the CS Forums at http://forums.cs.umb.edu/forums/.  Perhaps one of your classmates will answer it; perhaps one of us (Swami or I) will.  If your question is of a strictly personal nature, e.g. about a grade or a request for a face-to-face meeting, by all means use email.  But beware: if your question is one that the entire class may want answered, I’ll ask you to repost it on the CS Forums site.  Get used to this site; it’s a good way to hold discussions over the internet.

A Few Important Notes

  • I do not accept late programming assignments, nor do I schedule make-up examinations.
  • I do not give incompletes, except for real emergencies allowed for by University policy.
  • I have no patience for plagiarism. Attempting to pass the work of others off as your own is a violation of Academic Standards; anyone found doing so will fail.
  • And please turn your cell phones off.  If you must leave yours on because you have children in childcare, put it on silent mode and go outside to answer it.

Should you have any problems or questions, contact me early; don't let small problems become big ones! Telephone me, visit me at my office or stop by after class to set up an appointment. Also, I encourage questions in class; if you don't understand something, there is a good chance that others don't either. I like questions.