Introduction to Programming Languages/Parsing

Parsing edit

Parsing is the problem of transforming a linear sequence of characters into a syntax tree. Nowadays we are very good at parsing. In other words, we have many tools, such as lex and yacc, for instance, that helps us in this task. However, in the early days of computer science parsing was a very difficult problem. This was one of the first, and most fundamental challenges that the first compiler writers had to face. All of that must be dealt as an example.

If the program text describes a syntactically valid program, then it is possible to convert this text into a syntax tree. As an example, the figure below contains different parsing trees for three different programs written in our grammar of arithmetic expressions:

 

There are many algorithms to build a parsing tree from a sequence of characters. Some are more powerful, others are more practical. Basically, these algorithms try to find a sequence of applications of the production rules that end up generating the target string. For instance, let's consider the grammar below, which specifies a very small subset of the English grammar:

<sentence> ::= <noun phrase> <verb phrase> .
<noun phrase> ::= <determiner> <noun> | <determiner> <noun> <prepositional phrase>
<verb phrase> ::= <verb> | <verb> <noun phrase> | <verb> <noun phrase> <prepositional phrase>
<prepositional phrase> ::= <preposition> <noun phrase>
<noun> ::= student | professor | book | university | lesson | programming language | glasses
<determiner> ::= a | the
<verb> ::= taught | learned | read | studied | saw
<preposition> ::= by | with | about

Below we have a sequence of derivations showing that the sentence "the student learned the programming language with the professor" is a valid program in this language:

<sentence> ⇒ <noun phrase> <verb phrase> .
           ⇒ <determiner> <noun> <verb phrase> .
           ⇒ the <noun> <verb phrase> .
           ⇒ the student <verb phrase> .
           ⇒ the student <verb> <noun phrase> <prepositional phrase> .
           ⇒ the student learned <noun phrase> <prepositional phrase> .
           ⇒ the student learned <determiner> <noun> <prepositional phrase> .
           ⇒ the student learned the <noun> <prepositional phrase> .
           ⇒ the student learned the programming language <prepositional phrase> .
           ⇒ the student learned the programming language <preposition> <noun phrase> .
           ⇒ the student learned the programming language with <noun phrase> .
           ⇒ the student learned the programming language with <determiner> <noun> .
           ⇒ the student learned the programming language with the <noun> .
           ⇒ the student learned the programming language with the professor .

Grammars · Ambiguity