Administrative stuff:

1) Did everybody get my e-mail?
2) Are there questions from the last class?
   - One point that may not have been made clear is that functional and
     imperative are not absolute paradigms. Most languages provide
     characteristics of both, e.g., ML has references...
3) Did anybody get into Euler's?
4) Did anybody get the course slides?
5) Did anybody read the material for today's class?

--------------------------------------------------------------------------------

0) The sentence "I jump" makes perfect sense and sounds good to me.
   The sentence "The chair jumps" does not make sense, but sounds good.
   The sentence "I of" neither makes sense nor sounds good, but I can read it.
   The sentence "oIf" is unreadable.
   The sentence "o#u@f" is unspellable.
   What is the difference in each case?

1) How would I describe a programming language?
   - syntax
   - semantics

2) Has anybody heard of Noam Chomsky? What about John Backus?

A grammar is a tuple (T, N, P, S), where
   - T is a set of tokens,
   - N is a set of nonterminals,
   - P is a set of productions,
   - S is a start symbol.

   - Tokens form the vocabulary of the language;
   - Nonterminals are not part of the language; they are enclosed in angle
     brackets, e.g. <exp>;
   - A production has the form <nonterminal> ::= ...
     Productions may use '|' to separate alternatives.
   - The start symbol forms the root of any parse tree for the grammar.

2.1) Have you ever seen the grammar of a real programming language?

Example:

    <exp> ::= <exp> "+" <exp>
            | <exp> "*" <exp>
            | "(" <exp> ")"
            | "a" | "b" | "c"

Parsing is the problem of building parse trees for strings.

3) Example: show parse trees for each of these strings:

    a + b
    (a + b)
    (a + (b))

Ambiguity

4.1) Build a parse tree for the following sentence: a * b + c
4.2) How many parse trees are there?
4.3) Is there a "correct" parse tree?

A grammar is _ambiguous_ if there exists a sentence for which the grammar allows
the derivation of two different parse trees.

5.1) Does anybody see the ambiguity in this grammar?

    <stmt> ::= if <expr> then <stmt>
             | if <expr> then <stmt> else <stmt>

5.2) How should we read:

    if (a > b) then if (c > d) then print(1) else print(2)

--------------------------------------------------------------------------------

Applications

We will be dealing only with context-free grammars: the left side of every
production must be a single nonterminal.

6) What would a grammar for the assembly language that we saw last class look
   like?

    P ::= I | I P
    I ::= mov A A | add A A | store A A | load A A | jump A
    A ::= $num | eax | ebx | ecx | edx | ebp | esi | edi | esp

7) Try to come up with a grammar for Portuguese:

    <sentenca>           ::= <expr_nominal> <predicado>
    <expr_nominal>       ::= <artigo> <nome>                          ''o menino
                           | <artigo> <nome> <expr_preposicional>     ''o pato com a pena
    <predicado>          ::= <verbo>                                  ''canta
                           | <verbo> <expr_nominal>                   ''surpreende o menino
                           | <verbo> <expr_nominal> <expr_preposicional>
                                                                      ''toca o piano com amor
    <expr_preposicional> ::= <preposicao> <expr_nominal>              ''ate o telescopio
    <nome>       ::= "menino" | "menina" | "pato" | "telescopio" | "musica" | "pena"
    <preposicao> ::= "com" | "ate"
    <verbo>      ::= "viu" | "esta" | "e" | "canta" | "surpreende" | "toca"
    <artigo>     ::= "um" | "uma" | "o" | "a"

8) Can you build ambiguous sentences using our Portuguese grammar?
   Ex.: A menina toca o pato com a pena.
        ("The girl touches the duck with the feather.")
   Does pena belong to pato, or is it the tool used by menina?

9) How can this grammar produce arbitrarily long sentences?

--------------------------------------------------------------------------------

We recognize languages using programs called parsers. Prolog is particularly
good for writing parsers.

10) Write the parser for the Portuguese grammar in Prolog.
    - portug.pl

11) Show that 'a,menina,toca,o,pato,com,a,pena' has two derivations, whereas
    'a,menina,toca,o,pato' has only one.

12) Play a little bit with the grammar, using variables in some queries, e.g.:

"""
% Example query (Sujeito is left unbound, so Prolog enumerates the subjects):
% ?- sentenca([a,Sujeito,toca,o,pato],[]).
sentenca --> expr_nominal, predicado.
expr_nominal --> artigo, nome.
expr_nominal --> artigo, nome, expr_preposicional.
predicado --> verbo.
predicado --> verbo, expr_nominal.
predicado --> verbo, expr_nominal, expr_preposicional.
expr_preposicional --> preposicao, expr_nominal.
nome --> [menino]; [menina]; [pato]; [telescopio]; [musica]; [pena].
preposicao --> [ate]; [com].
verbo --> [e]; [viu]; [esta]; [toca]; [canta]; [surpreende].
artigo --> [a]; [o]; [uma]; [um].
"""

In Prolog, ',' means AND (sequence), and ';' means OR (alternative).
Terminals are enclosed in []'s.

--------------------------------------------------------------------------------

14) Let's write a grammar for numbers in Prolog.
    - number.pl

    <number> ::= <digit> <number> | <digit>
    <digit>  ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

In Prolog, this grammar looks like:

    number --> digit, number.
    number --> digit.
    digit --> [0] ; [1] ; [2] ; [3] ; [4] ; [5] ; [6] ; [7] ; [8] ; [9].

15) How do we add a sign to these numbers?
    E.g.: ?- sigNum([-, 3, 4, 5], []).

    sigNum --> signal, number.
    sigNum --> number.
    signal --> [-]; [+].

16) How do we add a decimal dot to the signed numbers?
    E.g.: ?- decNum([-, 2, 3, 4, '.', 4, 5], []).
    (The dot must be quoted; a bare '.' ends a Prolog clause.)

    decNum --> sigNum, dot, number.
    decNum --> sigNum.
    dot --> ['.'].

--------------------------------------------------------------------------------

If every production of a grammar is in the form

    <A> ::= t <B>, or

    <A> ::= t, or

    <A> ::= \epsilon (the empty string),

then the grammar is said to be regular. Here t is a token and <A>, <B> are
nonterminals.

Tokens are normally described by regular grammars.

- Phrase structure: how tokens are combined into programs.
- Lexical structure: how characters are combined into tokens.

A compiler: scanner (lexical structure) and parser (phrase structure).

--------------------------------------------------------------------------------

There is a hierarchy of grammars.

17) Example: Could you come up with a grammar to recognize strings having equal
    numbers of a's, b's, and c's, in that order? Namely, the set
    { abc, aabbcc, aaabbbccc, ... }?

a) Regular grammars.
   - Finite state automaton.
b) Context-free grammars.
   - Non-deterministic pushdown automaton.
c) Context-sensitive grammars.
   - Linear bounded automaton: non-deterministic Turing machine with bounded
     tape.
d) Unrestricted grammars.
   - Turing machine.

18) So, in the end, what are grammars good for?

18.1) Could you write a script to count the number of functions in C++ files?
      Would it work with the file below?

    class X {
    public:
      X(int a): _a(a) {}
    private:
      int _a;
    };

    int foo (int i, int j) {
      X x(2);
      return i + j;
    }

    int main() {
      return foo(1, 2);
    }
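
One way to explore question 18.1 is to try the obvious regex-based script on the
C++ file above and see where it goes wrong. The sketch below is in Python. The
heuristic itself ("a word, a name, then a parenthesized list") and the names
CPP_SOURCE, NAIVE_FUNCTION and naive_function_names are assumptions made for
illustration; they are not part of the lecture material, and other naive
patterns would fail in different ways.

```python
import re

# The C++ file from question 18.1, copied as test input.
CPP_SOURCE = """\
class X {
public:
  X(int a): _a(a) {}
private:
  int _a;
};

int foo (int i, int j) {
  X x(2);
  return i + j;
}

int main() {
  return foo(1, 2);
}
"""

# Hypothetical heuristic (an assumption, not the lecture's method):
# "a word, whitespace, a name, then a parenthesized list" is a function.
NAIVE_FUNCTION = re.compile(r"\b\w+\s+(\w+)\s*\([^)]*\)")


def naive_function_names(source: str) -> list:
    """Return the names that the naive pattern believes are functions."""
    return NAIVE_FUNCTION.findall(source)


if __name__ == "__main__":
    names = naive_function_names(CPP_SOURCE)
    print(len(names), names)  # 4 ['foo', 'x', 'main', 'foo']
```

On this file the heuristic reports four matches, although the file defines
three functions (the constructor X, foo, and main). It counts the variable
declaration "X x(2)" and the call "return foo(1, 2)" as functions, and it misses
the constructor, which has no return type. Telling a definition apart from a
declaration or a call depends on the phrase structure around it, which is the
kind of information a grammar-based parser recovers and a flat pattern does not.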