Language systems
================

* How to run a program?
* Which programming languages are usually compiled, virtualized or interpreted?
* Why is a language compiled or interpreted (virtualized)?
* What is the life cycle of a program?
  - Write the program using a text editor.
  - Compile the program to machine code:
    - Compile the program to assembly.
    - Translate assembly to machine code.
  - Link with libraries.
  - Load program in memory, and replace names with addresses.
  - Debug the program.

The classical sequence:

  [editor] --> source file -->
  [preprocessor] --> preprocessed source file -->
  [compiler] --> assembly language file -->
  [assembler] --> object file -->
  [linker] --> executable file -->
  [loader] --> running program in memory

* What is assembly code?
* What is machine code?
* What does the gcc command really do?

In the example below, let's consider the following C file:

  // cube.c
  #include <stdio.h>
  #define CUBE(x) (x)*(x)*(x)
  int main() {
    int i = 0;
    int x = 2;
    int sum = 0;
    while (i++ < 100) {
      sum += CUBE(x);
    }
    printf("The sum is %d\n", sum);
  }

* How to produce the preprocessed code from a C file?
  > clang -E cube.c -o cube.p.c
* How to visualize the AST of this program?
  > setllvm; $LLVM/clang -c -Xclang -ast-view cube.p.c
* Why does it have so many lines?
* Which declarations will be in these lines?
* What if I remove the #include from the program?
* How to produce an assembly program from a C file?
  > $LLVM/clang -S -emit-llvm cube.p.c -o cube.p.ll
  > $LLVM/llc cube.p.ll -o cube.arm
* Where is "The sum is %d"?
* Where is the loop?
* Is this program efficient?
* How to produce an object file?
  > as cube.arm -o cube.o
* Where is "The sum is %d"?
* Where is printf?
* How to link with the external libraries?
  > ld -syslibroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk cube.o -lSystem /Library/Developer/CommandLineTools/usr/lib/clang/12.0.5/lib/darwin/libclang_rt.osx.a -o cube.exe
* Why is the executable so much larger than the object?
* What are the differences between the object and the executable?
* Could you optimize that program?
  > clang -O1 -S cube.c
* Is there any disadvantage in doing some optimization?
  - It makes it harder to see what assembly is produced for each statement.
* There are many levels of optimization. What are the differences between these levels?
  > $LLVM/opt -O1 --print-pipeline-passes -disable-output cube.p.ll
* How to get an idea of how many optimization passes are used?
  > python3
  >>> x = '...'   # paste the pass list printed by opt here
  >>> x.count(',')
* Which data structures does the compiler use to optimize a program?
* Are all the assembly programs the same?
* Does a program written in C compile to the same assembly as a program written in SML?
* When are assembly programs different?
  - When we compile to different computer architectures.
* What is a computer architecture?
  - Hardware specification.
  - Instruction set.
* Give examples of computer architectures.
  > clang -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -Xclang -disable-O0-optnone -c -emit-llvm cube.c -o cube.bc
  > $LLVM/llc cube.bc -march=arm -o cube.arm.s
  > $LLVM/llc cube.bc -march=ppc32 -o cube.ppc32.s
  > $LLVM/llc cube.bc -march=mips -o cube.mips.s
* What is a way to execute a program other than compiling it?
  - With an interpreter.
* Can you write a bash script to list all the files in a directory?

  #!/bin/bash
  # Iterate over the glob rather than the output of ls, so that
  # file names containing spaces are handled correctly.
  for f in *; do
    echo "File -> $f"
  done

* How can you run this script?
  > bash b.bash
* Any other way?
  - With a virtual machine.
* What is the difference between an interpreter and a virtual machine?
* What are the advantages of virtual machines?
  - portability.
  - security.
  - profiling.
* What is a famous virtual machine that you know?
  - The Java virtual machine (JVM), which browsers once embedded to run applets.
* How to compile a simple Java program?
  > cat Cube.java
  public class Cube {
    public static void main(String args[]) {
      int x = 2;
      int sum = 0;
      for (int i = 0; i < 100; i++) {
        sum += x * x * x;
      }
      System.out.println("The sum is " + sum);
    }
  }
  > javac Cube.java
* How to view the bytecodes?
  > javap -c Cube.class
* What is a '.class'?
* What is a just-in-time compiler?
  - It compiles code to machine code while the program is being interpreted.
* After linking, the size of a program grows considerably. How to avoid this problem?
  - Use dynamic linking.
* How does dynamic linking work?
* What is the name of dynamic link libraries in Windows? .dll
* What about in Unix? .so
* How is dynamic linking in Java?
* What are the advantages of dynamic libraries?
  - multiple programs can share code in memory.
  - library code can be updated separately.
  - avoids loading code that is never used.
* Where is the implementation of printf in the executable of cube.c?
  - Dynamic linking:
    > clang cube.c -o dyn.exe
    > objdump -t dyn.exe | grep printf
  - Static linking:
    > gcc -static cube.c -o static.exe
    > objdump -x static.exe
    > objdump -d static.exe
    > ls *.exe
    7140 dyn.exe
    577945 static.exe

* In our example program:

  int i;
  void main() {
    for (i=1; i<=100; i++)
      fred(i);
  }

  * What set of values is associated with int?
    - Language implementation time in C, or language definition time in Java.
  * What is the type of fred?
    - Compile time.
  * What is the address of the object code for main?
    - Load time.
  * What is the implementation of fred?
    - Link time.
  * What is the value of i?
    - Runtime.
* The binding times are:
  - Language definition time
  - Language implementation time
  - Compile time
  - Link time
  - Load time
  - Runtime
* What is defined during language definition time?
  - Meaning of keywords.
* What is defined during language implementation?
  - Range of values of int in C (but not in Java).
* What is defined during compilation time?
  - Type of variables.
* What is defined during link time?
  - Code of external functions.
* What is defined during load time?
  - Memory location of code, data, etc.
* What is defined during run time?
  - Value of variables.
  - Type of variables in Perl, JavaScript, Lisp, etc.
* What is a debugger?
* What is good debugging information?
  - where the program is executing,
  - the trace of execution,
  - the value of variables.

Example: LLDB Tutorial
======================

  // Find the problem with the program below:
  //
  #include <assert.h>
  #define ARRAY_SIZE 4
  int main() {
    int x[4] = {2, 3, 5, 7};
    int sum = 0;
    for (int i = ARRAY_SIZE - 1; i > 0; --i) {
      sum += x[i];
    }
    assert(sum == 17);
    return 0;
  }

  $> clang -g ch0.c
  $> lldb a.out
  (lldb) b main
  (lldb) run
  (lldb) next
  (lldb) display sum
  (lldb) next
  (lldb) display i
  (lldb) next
  ...

===============================================================================
Examples of execution of different programming languages
===============================================================================

* Using Rust with LLVM:
=======================
  > vim fact.rs
  fn fact(n: u32) -> u32 {
    if n < 2 {
      1
    } else {
      n * fact(n - 1)
    }
  }
  fn main() {
    println!("Fact(10) = {}", fact(10));
  }
  // Which programming language is this one?
  > rustc fact.rs
  > ./fact
  > rustc --emit llvm-ir fact.rs
  > $LLVM/opt -dot-cfg fact.ll
  > ls -la .*.dot | grep fact
  > dot -Tpdf .*fact*fact*.dot -o fact.pdf
  > open fact.pdf

* Using Julia with LLVM
=======================
  > julia
  julia> function fact(x::Int)
           if x < 2
             return 1
           else
             return x * fact(x-1)
           end
         end
  julia> fact(10)
  3628800
  julia> code_llvm(fact, (Int,))
  define i64 @julia_fact_195(i64 signext %0) #0 {
  ...
  }
  julia> code_native(fact, (Int,))

* Using Java and the JVM
========================
  > vim T.java
  public class T {
    public static void main(String args[]) {
      System.out.println("Hello, World!");
    }
  }
  > javac T.java
  > java T

* Using Kotlin and the JVM
==========================
  > mkdir kt
  > cd kt
  > vim hello.kt
  fun main(args: Array<String>) {
    println("Hello, World!")
  }
  > kotlinc hello.kt -include-runtime -d hello.jar
  > java -jar hello.jar
  * How to view the classes that are created?
    > jar xf hello.jar
    > find . -name "*.class"
  * Can you view the bytecodes?
    > javap -c HelloKt.class

* Using Scala and the JVM
=========================
  > vim HelloWorld.scala
  object HelloWorld {
    def main(args: Array[String]): Unit = {
      println("Hello, world!")
    }
  }
  > scalac HelloWorld.scala
  > scala HelloWorld

* Using Kotlin and LLVM
=======================
  > cat fact.kt
  fun fact(n: Int): Int {
    if (n < 2)
      return 1
    else
      return n * fact(n-1)
  }
  fun main() {
    val f = fact(10)
    println("Fact of 10 = $f")
  }
  > kotlinc-native fact.kt -o fact
  > ./fact.kexe
  Fact of 10 = 3628800
  > kotlinc-native -Xprint-bitcode fact.kt -o fact 2> fact.ll
  > vim fact.ll