# The Course Syllabus

Lectures are split between traditional presentations, in which the instructor explains some topic of interest, and paper discussions, in which all the students (plus the instructor) debate some recent paper. The slides used in the traditional presentations are available on this page. The course is divided into two parts simply because that is the easiest way to evaluate the students: there are two exams, one at the end of each part.

#### Calendar 2017

- First paper discussion: April 24th, 2017
- First review class: April 26th, 2017
- First exam: May 3rd, 2017
- Second paper discussion: June 19th, 2017
- Second review class: June 21st, 2017
- Second exam: June 26th, 2017

## First Part

**Introduction**: a bit of everything: why study compilers, the history behind them, how compiler optimizations work, etc. This class also gives a general overview of the entire course. In addition to the exercises, this time we have a few slides from the ACE group, a company that sells compiler technology!

**Control Flow Graphs**: in this class we describe the most fundamental program representation used by the compiler: the control flow graph. We define basic blocks and go over local optimizations: code optimization techniques that work exclusively within the boundaries of single basic blocks. Additionally, we show how to use the LLVM compiler to carry out simple optimizations.

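As a toy illustration of how basic blocks are identified, here is a small Python sketch (not part of the course material; the pseudo-instruction syntax is made up for the example). It splits a straight-line list of instructions at the leaders: the first instruction, every branch target, and every instruction that follows a branch.

```python
def basic_blocks(instrs, branch_targets):
    """Split a list of pseudo-instructions into basic blocks.
    Leaders: instruction 0, every branch target, and every
    instruction that immediately follows a branch."""
    leaders = {0} | set(branch_targets)
    for i, instr in enumerate(instrs):
        if instr.startswith("br") and i + 1 < len(instrs):
            leaders.add(i + 1)
    bounds = sorted(leaders) + [len(instrs)]
    return [instrs[a:b] for a, b in zip(bounds, bounds[1:])]

blocks = basic_blocks(["x = 1", "br 3", "y = 2", "z = x"], [3])
# three blocks: ["x = 1", "br 3"], ["y = 2"], ["z = x"]
```

Once the blocks are known, the edges of the control flow graph follow from the branches and the fall-through between consecutive blocks.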
**Introduction to Dataflow Analyses**: in this class we review the basic principles that support dataflow analyses. We look into the different types of analyses, e.g., may/must and forward/backward. The class is mostly based on examples, which illustrate these principles and differences.

**Algorithms to Solve Dataflow Analyses**: in this class we go over different strategies to solve dataflow analyses efficiently. We study the worklist algorithm, which is a baseline design that can deal with general dataflow analyses, and then refine this design, observing how a better ordering of the constraints to be solved can have a positive influence on the performance of the algorithm.

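The worklist strategy fits in a few lines of Python. The sketch below is an illustration, not the course's code: it solves liveness, a backward "may" analysis, over a toy CFG given as successor, use and def sets, revisiting a block's predecessors whenever its solution changes.

```python
def liveness(succ, use, defs):
    """Worklist solver for liveness, a backward 'may' analysis:
    in[b] = use[b] | (out[b] - defs[b]);  out[b] = union of in[s], s in succ[b]."""
    pred = {b: [p for p in succ if b in succ[p]] for b in succ}
    live_in = {b: set() for b in succ}
    work = list(succ)
    while work:
        b = work.pop()
        out = set().union(*(live_in[s] for s in succ[b])) if succ[b] else set()
        new_in = use[b] | (out - defs[b])
        if new_in != live_in[b]:
            live_in[b] = new_in
            work.extend(pred[b])  # a change here may affect the predecessors

    return live_in

# b0: x = 1;  b1: y = x;  b2: return y
succ = {0: [1], 1: [2], 2: []}
use  = {0: set(), 1: {"x"}, 2: {"y"}}
defs = {0: {"x"}, 1: {"y"}, 2: set()}
live = liveness(succ, use, defs)
# live: {} at b0, {"x"} at b1, {"y"} at b2
```

A better ordering of the worklist (e.g., reverse post-order for backward problems) reduces how often blocks are reprocessed, which is exactly the refinement the class discusses.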
**Introduction to Lattice Theory**: in this class we take a look at lattices, which constitute the core of the theory supporting dataflow analyses. We discuss why the worklist algorithm seen previously terminates for many different types of dataflow analyses. We introduce and explain the notion of fixed points, and show why it is so fundamental to static program analyses.

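The fixed-point idea can be made concrete with a minimal Kleene iteration, sketched below in Python under the usual assumptions: a monotone function over a lattice of finite height (here, a small powerset).

```python
def least_fixed_point(f, bottom):
    """Kleene iteration: bottom, f(bottom), f(f(bottom)), ...
    Terminates whenever f is monotone and the lattice has finite height."""
    x = bottom
    while f(x) != x:
        x = f(x)
    return x

# A monotone function on the powerset lattice of {"a", "b"}:
# it always adds "a", and adds "b" once "a" is present.
f = lambda s: s | {"a"} | ({"b"} if "a" in s else set())
# least_fixed_point(f, set()) == {"a", "b"}
```

Every dataflow analysis seen in the previous classes is an instance of this loop: the transfer functions are monotone, the lattice has finite height, so the iteration must stop.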
**Partial Redundancy Elimination**: in this class we go over this classic compiler optimization, focusing on lazy code motion. Partial Redundancy Elimination (PRE) has the virtue of subsuming many different dataflow analyses, joining in a single algorithm the effects of different compiler optimizations.

**Constraint-Based Analyses**: in this class we discuss a different type of static program analysis: the constraint-based approach. Analyses that fit into this category can be used to deal with problems that do not assume any ordering between the statements of a program. A typical example of such a problem is the inference of a control flow graph for a program coded in a language with dynamic dispatch.

**Pointer Analysis**: this class presents one of the most important and well-known types of constraint-based analyses: pointer analysis. We illustrate the need for information about pointers via examples, and show why it is so hard to solve pointer analysis using traditional dataflow approaches. We then move on to explain two different algorithms to solve this kind of problem, paying particular attention to the need to find cycles between constraints.

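To give a taste of the constraint-solving flavor of this problem, here is a naive inclusion-based (Andersen-style) solver, sketched in Python for just two kinds of constraints, `p = &x` and `p = q`. Real solvers also handle loads and stores, and collapse constraint cycles for efficiency, which this toy does not.

```python
def points_to(addr_of, copies):
    """Naive inclusion-based (Andersen-style) solver.
    addr_of: (p, x) pairs for 'p = &x'  ->  x is in pts(p)
    copies:  (p, q) pairs for 'p = q'   ->  pts(q) is a subset of pts(p)
    Iterates the subset constraints to a fixed point."""
    pts = {}
    for p, x in addr_of:
        pts.setdefault(p, set()).add(x)
    changed = True
    while changed:
        changed = False
        for p, q in copies:
            before = len(pts.setdefault(p, set()))
            pts[p] |= pts.get(q, set())
            changed |= len(pts[p]) != before
    return pts

# a = &x; b = &y; c = a; c = b   =>   pts(c) = {x, y}
pts = points_to([("a", "x"), ("b", "y")], [("c", "a"), ("c", "b")])
```

Note that no ordering between the statements is assumed: the constraints could be listed in any order and the fixed point would be the same.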
**Loop Optimizations**: in this class we go over some techniques used to speed up loops, notoriously the most critical part of any program. We show how to identify loops, including nested loops, and introduce the notion of dominance between basic blocks. We then go over simple optimizations, such as strength reduction, loop unrolling and code hoisting.

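The dominance relation mentioned above can be computed with a simple iterative algorithm, sketched below in Python (an illustration, not the class's material): a block's dominator set is itself plus the intersection of its predecessors' sets.

```python
def dominators(succ, entry):
    """Iterative dominator computation on a CFG given as a successor map:
    dom(entry) = {entry};  dom(b) = {b} | intersection of dom(p), p in pred(b)."""
    nodes = list(succ)
    pred = {b: [p for p in nodes if b in succ[p]] for b in nodes}
    dom = {b: set(nodes) for b in nodes}  # start from the top of the lattice
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in nodes:
            if b == entry or not pred[b]:
                continue
            new = {b} | set.intersection(*(dom[p] for p in pred[b]))
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom

# 0 -> 1, 1 -> {2, 3}, 2 -> 1: the edge 2 -> 1 is a back edge,
# since its target dominates its source -- that is how loops are found.
dom = dominators({0: [1], 1: [2, 3], 2: [1], 3: []}, 0)
```

An edge whose target dominates its source is a back edge, and the blocks it encloses form a natural loop: this is the identification step the class builds on.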
**Static Single Assignment Form**: this class introduces the most important intermediate program representation used in compilers today: the SSA form. Virtually every relevant compiler implemented in the last 20 years uses this intermediate representation. It is so important because it considerably simplifies the design and implementation of many compiler analyses and transformations, such as dead-code elimination and constant propagation.

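To give a first taste of the representation, the Python sketch below (a toy, not the class's algorithm) renames a straight-line sequence of statements into SSA form. Join points in the CFG, which require phi-functions, are deliberately out of scope here.

```python
def to_ssa(stmts):
    """Rename a straight-line sequence into SSA form.
    Each statement is (target, [operands]); every assignment creates a
    fresh version of its target, and each use refers to the latest one."""
    version, out = {}, []
    for tgt, ops in stmts:
        new_ops = [f"{v}{version.get(v, 0)}" for v in ops]  # v0 = live-in value
        version[tgt] = version.get(tgt, 0) + 1
        out.append((f"{tgt}{version[tgt]}", new_ops))
    return out

# x = ...; x = f(x); y = g(x)
ssa = to_ssa([("x", []), ("x", ["x"]), ("y", ["x"])])
# [("x1", []), ("x2", ["x1"]), ("y1", ["x2"])]
```

After renaming, each variable has exactly one definition, so def-use chains are trivial to build, which is precisely why analyses such as constant propagation become so much simpler.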
**Sparse Abstract Interpretation**: in this class we generalize the concept of Static Single Assignment to Static Single Information. Our goal is to show the student how to convert a program to an intermediate representation that facilitates backward and forward flow analyses. These representations provide the property that the information associated with a variable is invariant along that variable's entire live range.

**Tainted Flow Analysis**: this class introduces the students to the notion of tainted flow attacks, which include a vast range of program attacks, such as SQL injection, cross-site scripting and remote command execution. This class lets us show how compilers can be used to secure programs against malicious users. Additionally, it allows us to compare dense and sparse approaches to static program analysis.

## Second Part

**Range Analysis**: this class describes range analysis, a technique to estimate the lower and upper limits that integer variables may assume during the execution of a program. Range analysis is useful in several ways: it helps to eliminate array bounds checks, to perform dead-code elimination, and to find security vulnerabilities in programs. We describe a complete algorithm, which uses state-of-the-art techniques to implement the analysis.

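A tiny example of the interval reasoning involved, hard-wired in Python to one specific loop and using the classic widening/narrowing combination. This is a toy to convey the flavor of the technique, not the complete algorithm from the class.

```python
INF = float("inf")

def widen(old, new):
    """Widening: any bound that keeps growing jumps straight to infinity,
    which guarantees termination over the infinite lattice of intervals."""
    lo = old[0] if new[0] >= old[0] else -INF
    hi = old[1] if new[1] <= old[1] else INF
    return (lo, hi)

def analyze():
    """Interval analysis of:  i = 0; while (i < 100) i = i + 1;"""
    init = (0, 0)
    def transfer(r):  # join of the initial interval with one trip through the body
        body = (r[0] + 1, min(r[1], 99) + 1)  # guard i < 100, then i += 1
        return (min(init[0], body[0]), max(init[1], body[1]))
    r = init
    while True:
        w = widen(r, transfer(r))
        if w == r:
            break
        r = w  # r jumps to (0, inf) after one widening step
    return transfer(r)  # one narrowing pass recovers the exact answer
```

Here widening over-shoots to `(0, inf)` to terminate quickly, and a single narrowing pass with the loop guard recovers the precise interval `(0, 100)`.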
**Program Slicing**: this time we will talk a bit about program slicing, a technique that helps software engineers find bugs and understand programs. We shall see what a program slice is, and how to build it. We will also cover a few notions, such as data and control dependences, paying particular attention to a fast way to detect the latter. A great deal of time will be spent proving the correctness of a core algorithm. We will also discuss an SML implementation of this algorithm.

**Operational Semantics**: in this class we introduce the student to the notion of operational semantics, one of the key tools that the compiler writer has to reason about program behavior. We discuss the notation of inference rules, and show how simple proofs of properties can be mechanically constructed from formal specifications.

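A small-step semantics can be turned directly into an interpreter. The Python sketch below (an illustration made up for this page) evaluates addition expressions with two congruence rules, one per operand, and one computation rule.

```python
def step(e):
    """One small-step of evaluation. Terms are ints (values)
    or ('+', e1, e2). Rules: reduce the left operand first,
    then the right, then apply n1 + n2 -> n."""
    op, e1, e2 = e
    if isinstance(e1, tuple):
        return (op, step(e1), e2)   # congruence rule, left operand
    if isinstance(e2, tuple):
        return (op, e1, step(e2))   # congruence rule, right operand
    return e1 + e2                  # computation rule

def evaluate(e):
    """Take small steps until a value (a plain int) is reached."""
    while isinstance(e, tuple):
        e = step(e)
    return e

# evaluate(("+", ("+", 1, 2), 4)) == 7
```

Each `if` branch corresponds to one inference rule; proofs about the language proceed by case analysis on exactly these rules.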
**Type Systems**: in this class we introduce a different kind of technology used in program analysis: type systems. We explain the concepts of progress and preservation, which, together, can be used to prove that a programming language is free of certain forms of runtime errors. We pay special attention to the basic techniques used to prove properties about type systems, and how these techniques can be applied mechanically.

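As a toy illustration, the Python checker below types a minimal expression language. Any term it accepts evaluates without getting stuck, which is exactly the guarantee that progress and preservation combine to give.

```python
def typeof(e):
    """Type checker for a tiny language: bool/int literals,
    ('+', e1, e2) and ('if', cond, then, else). Raises TypeError
    on ill-typed terms; accepted terms never get stuck at runtime."""
    if isinstance(e, bool):  # test bool before int: True is an int in Python
        return "bool"
    if isinstance(e, int):
        return "int"
    if e[0] == "+":
        if typeof(e[1]) == "int" and typeof(e[2]) == "int":
            return "int"
        raise TypeError("'+' expects two ints")
    if e[0] == "if":
        if typeof(e[1]) == "bool" and typeof(e[2]) == typeof(e[3]):
            return typeof(e[2])
        raise TypeError("ill-typed conditional")
    raise TypeError("unknown term")

# typeof(("+", 1, ("if", True, 2, 3))) == "int"
```

The term `("+", 1, True)` is rejected statically, and indeed it is precisely the kind of term that would get stuck during evaluation.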
**Proving Metatheorems with Twelf**: this class introduces the students to Twelf, a framework that supports the development of logical systems. We will go over basic techniques to describe the syntax and the semantics of a few basic deductive systems. We will also see some basic proof techniques that one can use in Twelf. In particular, we will mechanize the proof of preservation, seen in the previous class.

**Just-In-Time Compilers**: here we introduce the student to the concept of just-in-time compilation. A JIT compiler generates code for a program while this program is being executed. In this class we go over speculation and specialization, two key techniques used to optimize code at runtime. We analyze two speculative techniques, class inference and parameter specialization. We close the class by discussing the notion of a trace compiler.

**Register Allocation**: this is likely the most important classic compiler optimization. In this class we will see two different register allocation algorithms: linear scan and iterated register coalescing. We will see some NP-completeness results, and will also discuss coalescing, an optimization that is tightly coupled with register allocation.

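The linear scan algorithm is compact enough to sketch here. The Python below follows the classic design (in the spirit of Poletto and Sarkar): walk live intervals by start point, expire dead ones, grab a free register, and when none is free spill the interval that ends furthest away. It is an illustration, not the course's implementation.

```python
def linear_scan(intervals, num_regs):
    """Linear-scan register allocation over live intervals.
    intervals: {var: (start, end)}.  Returns {var: register or 'spill'}."""
    alloc, active, free = {}, [], list(range(num_regs))
    for v, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        for e2, v2 in list(active):      # expire intervals dead at 'start'
            if e2 < start:
                active.remove((e2, v2))
                free.append(alloc[v2])
        if free:
            alloc[v] = free.pop()
            active.append((end, v))
        else:
            far_end, far_v = max(active)  # active interval ending furthest
            if far_end > end:             # steal its register, spill it
                alloc[v], alloc[far_v] = alloc[far_v], "spill"
                active.remove((far_end, far_v))
                active.append((end, v))
            else:
                alloc[v] = "spill"        # the new interval ends furthest
    return alloc

# three pairwise-overlapping variables, two registers: one must spill
alloc = linear_scan({"a": (0, 4), "b": (1, 3), "c": (2, 5)}, 2)
```

Linear scan trades some allocation quality for speed, which is why it shows up in JIT compilers, while iterated register coalescing spends more time to produce better assignments.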
**SSA-Based Register Allocation**: in this class we go over one of the latest developments in compiler theory: the fact that the interference graphs of SSA-form programs are chordal. We will see why this property always holds, and will take a look at a few algorithms that can find the chromatic number of chordal graphs in polynomial time. We will also revisit spilling and coalescing in the context of chordality.

**Correctness**: this class combines register allocation and type systems. Although the two seem unrelated, type systems can be used to verify the correctness of the output of the register allocator, in a process usually called translation validation. We will go over the definition of a correct register allocator, and will show how progress and preservation are key to proving the core property of semantics preservation.

**Divergence Analysis**: this class introduces the student to general-purpose graphics processing units, and describes a static code analysis that we call Divergence Analysis. Divergence Analysis proves that certain variables always hold the same value for every thread in a SIMD group. The class then moves on to describe a suite of divergence-aware compiler optimizations, with emphasis on Divergence-Aware Register Allocation.

**Instruction Level Parallelism**: the goal of this class is to introduce the student to the algorithms used to schedule instructions in high-performance architectures, which provide long pipelines and the ability to issue multiple instructions each cycle. We will go over basic block scheduling, superblock formation and static profiling. We also analyze a few characteristics of high-performance machines, such as the VLIW design and predicated instructions.

**Automatic Parallelization**: in this class we go over a few techniques to transform sequential C-like loops into parallel CUDA loops. We analyze two main notions: data reuse and data dependences within loop iterations. We then move on to see how we can map points in the loop's iteration space to processor identifiers. Throughout the class we will see a number of actual CUDA kernels, derived from C programs.

## LLVM Course

**Introduction to LLVM**: what is LLVM, how to use it, which tools are available, how to understand its intermediate representation, how to generate machine code, how to invoke common optimizations, etc. You can watch this class online (in Portuguese).

**Compiling a language**: how to use LLVM to implement a very simple programming language, how to generate an IR, how to optimize the IR using LLVM passes, how to execute the IR via the just-in-time compiler, etc.

**Writing an LLVM pass**: what is a pass, how to write a simple LLVM pass, how to register the pass, how to invoke it, how to time it, how to invoke a pass from another pass, how to use LoopInfo to analyze loops, etc.

**Using the LLVM Test Infrastructure**: how to install the LLVM test infrastructure, how the test suite is organized, how to run simple tests, how to gather statistics, how to write custom tests, etc.

**Understanding the LLVM API**: how to iterate through instructions, how to check the type of an instruction, how to eliminate instructions from the program, how to replace the uses of an instruction, how to use phi-functions, etc.

**Interprocedural Analyses**: how to write a module pass, how to use alias analysis, how to implement a real-world optimization, how to play with attributes of function arguments, etc.

**Longer LLVM Course**: this is a one-hour course about LLVM, which I gave at UFMG's summer school. It is mostly a compilation of the first three slide decks in this collection (see above).

Last update: January 21st, 2015.