Debugging register allocators

Introduction

Implementing register allocators is a daring job. The algorithms tend to be very complex, and worst: once the implementation starts producing the first running programs, most likely there will be bugs. Bugs just happen... and finding bugs... oh, man. That is quite something on its own...

We here in the compilers lab have been using an automatic debugger to find errors in the output of a register allocator. Notice that this is not the same as finding errors in the register allocator itself. If the register allocator produces incorrect code, our debugger most likely will catch the error; however, if the register allocation algorithm contains errors, in general our debugger cannot tell us anything about it. Nevertheless, the debugger was extremely useful in the development of our register allocators. In this page I will be explaining how to use our debugger. To download it, just click here.

What is this debugger like?

Our debugger uses typing rules to find errors in the output of register allocators. A detailed description of this mechanism can be found in this paper. The debugger reads programs written in a language called smira. If file.sm is a smira program, you can use the debugger with the following command line:

java -classpath TypeChecker.jar typeChecker.TypeCheckApp file.sm

If file.sm does not contain any errors, the debugger will output the message: The program type-checks!. On the other hand, if errors are found, the debugger will output a list with all the typing errors found:

There are typing errors:
...

More details about the types of errors is given later on in this webpage.

The debugger can also be used to draw the control flow graph of smira programs. These graphs are outputted in the dot format, which is a standard for describing graphs. A good application to visualize the graph encoded in a dot file is Graphviz. This is a wonderful piece of open software and one only gains for knowing more about it! Anyway, to produce a dot description of a smira program, just type:

java -classpath TypeChecker.jar typeChecker.PrintGraphApp file.sm

And a textual description of the control flow graph will be printed in the standard output. Copy this description and paste it on a dot file so that it can be opened with Graphviz.

The smira language

Smira is a very simple language that describes how variables are bound to registers. Its grammar, in javacc format, is available here.

A program in smira is a set of basic blocks, and each basic block is a sequence of instructions. Instructions come in two types: copies and jumps. The last, and only the last instruction of any basic block must be a jump. Each basic block has a label, which is always in the form L. Targets of jumps are also given in this form. The most ubiquitous type of instruction are the copies. Copies have no operands, only uses and definitions. The basic value in smira is a binding. A binding is a pair, where the first element represents a physical location, and the second represents a program variable, e.g (R0, P1) means that variable P1 is bound to variable R0. In smira, physical locations are always represented as R(int). Variables are always represented as P(int). This 'R' stands for register, and that 'P' stands for pseudo register. We give the example of an smira program and its control flow graph below:

/*
 * Function name: Example
 * Number of basic blocks: 4
 * A basic block is always marked
 * by a label. This program has
 * four basic blocks: L0, L1,
 * L2 and L3.
 */

L0
(R0, P0) = ;
(R1, P1) = ;
 = (R0, P0) ;
JMP L1 L2 ;


L1
// Store P1 in address R100001.
// We use big integer numbers to
// represent the memory addresses.
(R100001, P1) = (R1, P1) ;
(R1, P2) = (R0, P0) ;
(R1, P0) = (R1, P2) ;

// Load P1 from address R100001:
(R0, P1) = (R100001, P1) ;

// Swaps the contents of R0 and R1:
(R0, P0) (R1, P1) = (R1, P0) (R0, P1) ;
JMP L2  ;


L2
 = (R0, P0) (R1, P1) ;
JMP L3  ;

/*
 * In smira, a basic block always
 * ends with a jump instruction.
 * In order to simulate the end of
 * the program, a common trick is
 * to create a basic block that
 * only contains one instruction:
 * a jump to itself.
 * In this example program, this
 * final basic block is called L3.
 * Every final basic block can then
 * jump to this basic block.
 */
L3
JMP L3  ;

A Short Tutorial

This section gives some clues about how to use the debugger to speed up the development of register allocation algorithms.

Understanding Errors

It may be illustrative to use the program above to test the debugger. First, download the jar file that contain the implementation of the debugger, and save the smira program above in a file such as ex.sm. The command:

java -classpath TypeChecker.jar typeChecker.TypeCheckApp ex.sm

should produce the output:

Typechecking ex.sm
The program type-checks!

1) inst_ovt

Now, try adding some "sample errors" into the program. For instance, replace the second instruction in L0 by (R0, P1). This will cause variable P1 to be defined into register R0. However, this register is already in use, holding variable P0. If the type-checker is invoked, an error should be reported!

Typechecking ex.sm
There are typing errors:
INST_OVT: (R0, P0) and (R0, P1) at
instruction 1: ((R0, P1)  = )
 Preds: 0
 Succs: 2
 In: (R0, P0) (R1, P1)
 Out: (R0, P0) (R1, P1)
 from BB_0

Errors of type INST_OVT imply that a register is being overwritten while it is still alive. That is exactly what is happening in this case. The error message points the instruction where the error was detected. It was detected at the second instruction (we start counting from zero) of basic block BB_0, the first basic block. The fields Preds and Succs mark the predecessor and successor instructions of instruction 1. The fields In and Out give liveness information for that instruction.

2) Undefined variables

INST_OVT is one of the most common types of errors. Another one that the debugger is very good at finding is due to undefined variables. It happens when a variable is defined in a register, but is used in other. This is an error, even if no live register is being overwritten. To see how it happens, replace the first instruction of the example program with (R2, P0) = ;. Whereas before variable P0 was defined in R0, now it is being defined in register R2, which is different from the register where P0 is expected to be found, when used in the last instruction in block L0. Your new basic block should look like this:

L0
(R2, P0) = ; // Change this binding here.
(R1, P1) = ; // Remember to set this binding back.
 = (R0, P0) ;
JMP L1 L2 ;

If you run our type-checker again, you will end up with something like:

Typechecking ex.sm
The program type-checks!
There are undefined variables:
(R0, P0)

The program still type-checks, because no live register is being overwritten, but the undefined variable is pointed by the debugger! Notice that not all undefined variables are errors. Sometimes programs assume that some registers will be always defined, like the stack pointer for instance. The stack pointer normally is not defined explicitly in the assembly code.

3) bad_alloc

The last type of error happens when a register is expected to contain one pseudo variable, but indeed it contains another. To see this, replace the first basic block of the example program with:

L0
(R0, P0) = ;
 = (R0, P3) ; // Ah, R0 contains P0, not P3!
(R1, P1) = ;
 = (R0, P0) ;
JMP L1 L2 ;

If you run the debugger, e.g:

java -classpath TypeChecker.jar typeChecker.TypeCheckApp ex.sm

You should get something like:

Typechecking ex.sm
BAD_ALLOC: (R0, P3) and (R0, P0) at
 instruction 1: ( = (R0, P3) )
 Preds: 0
 Succs: 2
 In: (R0, P3) (R0, P0)
 Out: (R0, P0)
 from BB_0

There are typing errors:
INST_OVT: (R0, P3) and (R0, P0) at
 instruction 0: ((R0, P0)  = )
 Preds:
 Succs: 1
 In: (R0, P3)
 Out: (R0, P3) (R0, P0)
 from BB_0

There are undefined variables:
(R0, P3)

Quite a lot of things, isn't it? The bad allocation happens at instruction 1, but because the type-checker assumes that variable P3 is a live-in value, it also reports the overwriting at instruction 0.

Modeling different features

In spite of its simplicity, smira allows to model a wide range of features normally found in register allocation. We describe some of our "tricks" here.

1) Pre-colored registers

Architectural constraints normally force some variables to be pre-assigned to some registers. For instance, in the PowerPC, function calls use registers to pass variables into function, and these registers must be always pre-defined. I normally use variables with the same number as the register to represent function calls, and I save higher numbers for ordinary variables. For instance, consider the program in the left below, which represents a function call in PowerPC. This program is represented by the smira program in the right:

Function call
PowerPC

SMIRA representation

a := 1          ;
b := 2          ;
c := "%d, %d\n" ;
R1 := a         ;
R2 := b         ;
R3 := c         ;
bl printf       ;

(R2, P1024) =                        ;
(R3, P1025) =                        ;
(R4, P1026) =                        ;
(R1, P1) = (R2, P1024)               ;
(R2, P2) = (R3, P1025)               ;
(R3, P3) = (R4, P1026)               ;
(R0, P0) (R1, P1) ... (R128, P128) = ;

The variable a is represented, in the smira program, by the pseudo register P1024. In the same way, we have that variable b corresponds to P1025 and c corresponds to P1026. The function call may overwrite all the caller save registers, and so we represent the call as an instruction that defines all these registers.

2) Loads/Stores

I normally use small integers to represent the physical registers, and big integers to represent memory addresses. For instance, in x86 there are in total 56 registers, including alias names and those weird MMX registers. Thus, I leave the numbers 0 to 55 to represent registers, and use numbers from 100000 and bigger to represent memory addresses. In our example program, the first instruction of block L1 is a store, and the fourth instruction of that block is a load:

L1
(R100001, P1) = (R1, P1) ; // store
(R1, P2) = (R0, P0) ;
(R1, P0) = (R1, P2) ;
(R0, P1) = (R100001, P1) ; // load
...

3) Swaps

Many register allocators use swap instructions. Swaps allow to split live ranges while still keeping the number of registers low. Swaps can be implemented in real assembly code in a number of ways. For instance, with three xor or with two additions and one subtraction. Some architectures even have instructions for swapping two registers, like xchg in x86. Describing swaps in smira is very simple. Just use instructions that define two or more registers. For instance, below I am swapping the locations of three variables:

(R0, P1) (R1, P2) (R2, P0) =  (R0, P0) (R1, P1) (R2, P2) ;

4) Aliasing

In some architectures, some registers may alias. We say that two registers alias when an assignment to one modifies the value in the other. A classical example is found in the register bank of x86. For instance, the following registers are aliases: AH, AX, EAX. Also AL, AX, EAX, but not AH, AL, because an assignment to AH does not have any influence in AL and vice-versa. In order to represent aliasing in smira, every time a register is defined, I assume that all its aliases are also being defined. For instance, assume that registers are represented in the following way in the x86: AL = R1, AH = R2, AX = R3 and EAX = R15. In this case, instead of writing instruction such as (R1, P0) which defined P0 in the register AL, I would write something like:

(R1, P0) (R3, P0) (R15, P0) = ; // writes P0 into AL, AX and EAX

In the same way, if variable P0 were defined in register AX then I would have to overwrite both AH and AL, without forgetting EAX:

(R1, P0) (R2, P0) (R3, P0) (R15, P0) = ;

5) Undefined Registers

Undefined bindings may be a sign that something is wrong in the register allocation directives. However, undefined variables may also be live-in in the target function. For instance, normally the stack pointer is live-in at the beginning of a function. Assume that the stack pointer, e.g ESP is represented by variable P0, and that this variable must be always stored in the register R0. In order to avoid always getting the error message:

There are undefined variables:
(R0, P0)

You can simply defined the pair (R0, P0) as the first instruction of the first basic block:

L0
(R0, P0) = ;
...

Normally you can safely define all the pairs live-in at the beginning of the function in the top of the first basic block of your smira program.

To LLVM Users

I have coded a spiller to print the output of LLVM's register allocator in smira format. The code for the spiller is available here. Add this code to VirtRegMap.cpp. To produce smira output, you must use something like:

llc -f -regalloc=linearscan -spiller=print -disable-spill-fusing exec.bc -o exec.s

It is necessary to disable spill fusion, otherwise stores will not be printed, and many undefined variables will be reported by the type-checker.