Introduction to Programming Languages/Logic Grammars
In this chapter we will explore how grammars are used in practice, by compilers and interpreters. We will be using definite clause grammars (DCG), a feature of the Prolog programming language to demonstrate our examples. Henceforth, we shall call DCGs Logic Grammars. Prolog is particularly good at grammars. As we will see in this chapter, this programming language provides many abstractions that help the developer to parse and process languages.
Logic Grammars
editProlog equips developers with a special syntax to implement grammars. This notation is very similar to the BNF formalism that we had seen before. As an example, the English grammar from the last chapter could be rewritten in the following way using prolog:
sentence --> noun_phrase, verb_phrase . noun_phrase --> determiner, noun . noun_phrase --> determiner, noun, prepositional_phrase . verb_phrase --> verb . verb_phrase --> verb, noun_phrase . verb_phrase --> verb, noun_phrase, prepositional_phrase . prepositional_phrase --> preposition, noun_phrase . noun --> [student] ; [professor] ; [book] ; [university] ; [lesson] ; [programming language] ; [glasses]. determiner --> [a] ; [the] . verb --> [taught] ; [learned] ; [read] ; [studied] ; [saw]. preposition --> [by] ; [with] ; [about] .
If we copy the text above, and save it into a file, such as grammar.pl
, then we can parse sentences. Below we give a screenshot of a typical section of the Prolog Swipl interpreter, showing how we can use the grammar. Notice that a query consists of a non-terminal symbol, such as sentence
, a list containing the sentence to be parsed, plus an empty list. We will not explain this seemingly mysterious syntax in this book, but the interested reader can find more information on-line:
1 ?- consult(grammar). 2 % ex1 compiled 0.00 sec, 0 bytes 3 true. 4 5 ?- sentence([the, professor, saw, the, student], []). 6 true ; 7 false. 8 9 ?- sentence([the, professor, saw, the, student, with, the, glasses], []). 10 true ; 11 true ; 12 false. 13 14 ?- sentence([the, professor, saw, the, bird], []). 15 false.
Every time Prolog finds a derivation tree for a sentence it outputs the value true
for that query. If the same sentence has more than one derivation tree, then it succeeds for each and all of them. In the above example, we got two positive answers for the sentence "The professor saw the student with the glasses", which, as we had seen in the previous chapter, has two different parsing trees. If Prolog cannot find a parsing tree for the sentence, then it outputs the value false
. This happened in line 15 of the above example. It also happened in lines 7 and 12. Prolog tries to find every possible way to parse a sentence. If it cannot, even after having found a few successful derivations, then it will give back false
to the user.
Attribute Grammars
editIt is possible to embedded attributes into logic grammars. In this way, we can use Prolog to build attribute grammars. We can use attributes for many different purposes. For instance, below we have modified our English grammar to count the number of words in a sentence. Some non-terminals are now associated with an attribute W, an integer that represents how many words are derived from that non-terminal. In the compiler jargon we say that W is an synthesized attribute, because it is built as a function of attributes taken from child nodes.
sentence(W) --> noun_phrase(W1), verb_phrase(W2), {W is W1 + W2} . noun_phrase(2) --> determiner, noun . noun_phrase(W) --> determiner, noun, prepositional_phrase(W1), {W is W1 + 1} . verb_phrase(1) --> verb . verb_phrase(W) --> verb, noun_phrase(W1), {W is W1 + 1} . verb_phrase(W) --> verb, noun_phrase(W1), prepositional_phrase(W2), {W is W1 + W2} . prepositional_phrase(W) --> preposition, noun_phrase(W1), {W is W1 + 1} . noun --> [student] ; [professor] ; [book] ; [university] ; [lesson] ; [glasses]. determiner --> [a] ; [the] . verb --> [taught] ; [learned] ; [saw] ; [studied] . preposition --> [by] ; [with] ; [about] .
The queries that use the attribute grammar must have a parameter that will be replaced by the final value that the Prolog's execution environment finds for the attribute. Below we have a Prolog section with three different queries. Ambiguities still lead us to two answers in the second query.
?- consult(grammar). % ex1 compiled 0.00 sec, 0 bytes true. ?- sentence(W, [the, professor, saw, the, student], []). W = 5 ; false. ?- sentence(W, [the, professor, saw, the, student, with, the, glasses], []). W = 7 ; W = 7 ; false. ?- sentence(W, [the, professor, saw, the, bird], []). false.
Attributes can increase the computational power of grammars. A context free grammar cannot, for instance, recognize the sentence a^nb^nc^n of strings having the same number of a's, b's and c's in sequence. We say that this language is not context-free. However, an attribute grammar can easily parse this language:
abc --> as(N), bs(N), cs(N). as(0) --> []. as(M) --> [a], as(N), {M is N + 1}. bs(0) --> []. bs(M) --> [b], bs(N), {M is N + 1}. cs(0) --> []. cs(M) --> [c], cs(N), {M is N + 1}.