Talking Compilers with ChatGPT (PDF)

Lecture Notes for a 60-Hour Course on Compiler Construction



This PDF contains a collection of lecture notes used in UFMG's Compiler Construction course. This undergraduate-level class covers the basics of compiler construction, including the design and implementation of the front end, middle end, and back end of a compiler. The course generally follows the guidelines outlined in the ACM Computer Science Curricula for programming languages.

The course is project-oriented: students are tasked with building an interpreter and a compiler for a purely functional subset of SML/NJ. The course structure closely follows the approach presented by Robert Nystrom in his book, Crafting Interpreters. However, instead of using an object-oriented language, we use a functional one. Additionally, instead of targeting a stack-based virtual machine, we target RISC-V.

The material in this book was curated with the assistance of two LLMs: ChatGPT and Gemini. The original lecture notes were designed to be question-oriented. To leverage the LLMs, each question was given to them together with a draft answer, and their responses were then carefully edited for accuracy and clarity. The content of the book is largely derived from publicly available examples, including those found in research papers and forums.

This book is a work in progress, so typos and errors may be present. If you have suggestions, comments, or spot any mistakes, please contact Fernando Pereira at pronesto@gmail.com.

Table of Contents
  1. Introduction
  2. Lexical Analysis
  3. Tree-Like Program Representation
  4. Recursive-Descent Parsing
  5. Bottom-Up Parsing
  6. Parser Generators and Parser Combinators
  7. Variables and Bindings
  8. The Visitor Design Pattern
  9. Type Systems
  10. Type Checking
  11. Type Inference
  12. Anonymous Functions
  13. Recursive Functions
  14. Introduction to Code Generation
  15. Code Generation for Expressions
  16. Code Generation for Statements
  17. Code Generation for Functions
  18. Memory Allocation
  19. Pointers and Aggregate Types
  20. Code Generation for Object-Oriented Features
  21. Heap Allocation
  22. Introduction to Code Optimizations
  23. Data-Flow Analyses
  24. Static Single-Assignment Form
  25. Register Allocation