Government and Binding Theory/Printable version

Government and Binding Theory

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

Principles and Parameters


No discussion of GB theory can begin without addressing its two core concepts: principles and parameters. In fact, GB theory is often said to be a misnomer, because government and binding are simply two of the many modules in the theory. Thus, the theory is more accurately named Principles and Parameters (P&P) Theory.

The purpose of P&P theory: Generalisation


X-bar Theory


The X-bar theory is one of the core modules of GB theory, and as such, it will be the first we learn. Before we embark on our X-bar journey, though, it is necessary to look at why we need X-bar in the first place.



In our introductory book on linguistics, we have devised a mini-grammar using rewrite rules:

A mini-grammar of English

NP → {PN, Pr, Det (Adj) N}
VP → V (NP) (Adv)
S → NP (Aux) VP
PN → {Chomsky, Jackendoff, Pinker}
Det → {this, that, the, a, my, some}
Adj → {happy, green, lucky, colourless}
N → {computer, book, homework, idea}
V → {defended, attacked, do, eat, slept, poisoned}
Adv → {furiously, happily, noisily}
Aux → {will, may, might, do}
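Rewrite rules like these are mechanical enough to run. Below is a minimal Python sketch of the mini-grammar as a random sentence generator; the encoding (a `?` suffix for optional constituents) and the pronoun entries under Pr are our own assumptions, not part of the book, and the grammar happily overgenerates.

```python
import random

# A hypothetical encoding of the mini-grammar's rewrite rules.
# Curly-brace alternatives become lists; optional constituents
# (parenthesised in the text) are marked with a trailing '?'.
RULES = {
    "S":   [["NP", "Aux?", "VP"]],
    "NP":  [["PN"], ["Pr"], ["Det", "Adj?", "N"]],
    "VP":  [["V", "NP?", "Adv?"]],
    "PN":  [["Chomsky"], ["Jackendoff"], ["Pinker"]],
    "Pr":  [["he"], ["she"]],          # assumed entries; Pr is not listed in the text
    "Det": [["this"], ["that"], ["the"], ["a"], ["my"], ["some"]],
    "Adj": [["happy"], ["green"], ["lucky"], ["colourless"]],
    "N":   [["computer"], ["book"], ["homework"], ["idea"]],
    "V":   [["defended"], ["attacked"], ["slept"], ["poisoned"]],
    "Adv": [["furiously"], ["happily"], ["noisily"]],
    "Aux": [["will"], ["may"], ["might"]],
}

def generate(symbol="S"):
    """Expand a symbol by picking one alternative at random."""
    if symbol.endswith("?"):               # optional constituent: maybe omit
        if random.random() < 0.5:
            return []
        symbol = symbol[:-1]
    if symbol not in RULES:                # terminal word
        return [symbol]
    words = []
    for child in random.choice(RULES[symbol]):
        words.extend(generate(child))
    return words

print(" ".join(generate()))
```

Running this a few times produces grammatical sentences and also nonsense like *Chomsky slept a computer*, which is precisely the problem the text turns to next.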

However, our mini-grammar can be very problematic. Refer to this sentence:

(1) *Chomsky slept a computer.

This sentence is syntactically unsound because slept is an intransitive verb. Whether a verb is transitive or intransitive is, of course, an idiosyncratic property of the verb itself. Thus we may conclude that a subcategorisation frame in the lexicon tells us whether a verb is transitive.

Subcategorisation frames in verbs

jump [__ Ø]
eat [__ (NP)]
wait [__ (PPfor)]

jump is intransitive, so nothing goes after it. eat can be both transitive and intransitive, so we added an optional (NP). wait can take a prepositional complement headed by for, so we added an optional (PPfor).
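The filtering work that subcategorisation frames do can be sketched in a few lines of Python. The dictionary encoding and the `licenses` helper below are hypothetical, invented for illustration; a `?` suffix marks an optional slot.

```python
# A toy model of the subcategorisation frames from the text:
# each verb maps to the sequence of complement slots it allows.
SUBCAT = {
    "jump":  [],          # [__ Ø]        : intransitive
    "sleep": [],          # intransitive, ruling out (1)
    "eat":   ["NP?"],     # [__ (NP)]     : optionally transitive
    "wait":  ["PP_for?"], # [__ (PPfor)]
}

def licenses(verb, complements):
    """Check whether a verb's frame licenses the given complement list."""
    frame = SUBCAT[verb]
    allowed  = [slot.rstrip("?") for slot in frame]
    required = [slot.rstrip("?") for slot in frame if not slot.endswith("?")]
    if len(complements) > len(allowed):
        return False
    # complements must fill the slots in order, and every
    # non-optional slot must actually be filled
    return complements == allowed[:len(complements)] and \
           len(complements) >= len(required)

print(licenses("sleep", ["NP"]))  # False: *Chomsky slept a computer
print(licenses("eat", []))        # True: eat is optionally transitive
```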

This phenomenon is not restricted to VPs by any means. We can similarly construct subcategorisation frames for APs:

Subcategorisation frames in adjectives

doubtful [__ (PPof)]
satisfied [__ (PPwith)]
obvious [__ (PPto)]

Now we encounter a dilemma. We need both the subcategorisation and phrase structure rules to generate sentences, but is it not redundant to have a rewrite rule that says VP → V (NP) (Adv) and individual subcategorisation rules? To solve this problem, linguists have proposed the X-bar theory.



Instead of having both phrase structure and subcategorisation rules, linguists have suggested that the structure of a phrase is derived from the lexicon by a process called projection. When we take a lexical constituent and plug it into our structure, it will take the subcategorisation information with it to project a structure.

To understand this concept, we first need to know that there are only two phrase structure rules in X-bar theory. (Yes, all our old ones will be discarded. In a moment, we'll get a boot to kick them away.)

Phrase structure rules in X-bar theory

X′ → X YP
X″ → spec X′

X is known as the head of the phrase, and is always a lexical category. Every phrase is named after the head. X itself is known as zero projection and can be written as X0.

YP is the complement of the phrase. It is always a phrase, and it is selected by the head with lexical information, i.e. it is subcategorised by the head. X′ is known as the intermediate projection.

X″ is just another way of writing XP, and spec is the specifier of the phrase. X″ is the maximal projection.
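The two X-bar rules can be captured as a tiny tree-building routine. The following Python sketch is our own illustration, assuming a simple `Node` class; `project` builds X0 → X′ → X″ from a lexical head, an optional complement and an optional specifier.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

def project(head_cat, word, complement=None, specifier=None):
    """Project X0 -> X' -> X'' from a lexical head, per the two
    X-bar rules: X' -> X YP and X'' -> spec X'."""
    x0 = Node(head_cat, [Node(word)])        # zero projection
    xbar = Node(head_cat + "'", [x0])        # intermediate projection
    if complement is not None:               # complement is a sister of X
        xbar.children.append(complement)
    xp = Node(head_cat + "''")               # maximal projection
    if specifier is not None:
        xp.children.append(specifier)
    xp.children.append(xbar)
    return xp

np = project("N", "homework")
vp = project("V", "do", complement=np)
print(vp.label)  # V''
```

Note that the same routine builds NPs, VPs and APs alike; this uniformity is exactly the generalisation the chapter is after.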

Let's return to the cheesy sentence we discussed in the introductory book. This time, we include the auxiliary in our verb phrase as the specifier of the VP:


Let's do the same for an NP. Since we're obsessed with homework, our NP will be the completion of homework:


Just to prove that it always works, let's draw bang on time (which, presumably, is how you will hand in your homework):


Cool with that? Good. Lastly, let's do the same with an AP, really obsessed with homework:


This gives rise to our first principle, in the words of Chomsky:

Projection Principle

Representations at each level of syntax are projected from the lexicon in that they observe the subcategorisation properties of lexical items.

Binary branching revisited


Recall that in our introductory book, we met the Binary Branching Condition. Now we can make sense of this condition!

Binary Branching Condition

Each node must have at most two branches.

Terminal nodes have no branches at all. X″ without a specifier and X′ without a complement have one branch. X″ with a specifier and X′ with a complement have two branches. We cannot get any more.

Note that every phrase must have a head.
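The condition above can be checked mechanically. Here is a minimal Python sketch; the `(label, children)` tuple encoding of trees is our own assumption, with leaves as plain strings.

```python
# A minimal check of the Binary Branching Condition on trees encoded
# as (label, [children]) tuples; leaves are plain strings.
def obeys_binary_branching(tree):
    if isinstance(tree, str):          # terminal node: no branches
        return True
    label, children = tree
    if len(children) > 2:              # at most two branches per node
        return False
    return all(obeys_binary_branching(c) for c in children)

good = ("V''", [("spec", ["will"]),
                ("V'", [("V", ["do"]),
                        ("N''", [("N'", [("N", ["homework"])])])])])
bad  = ("V'", [("V", ["gave"]), "Mary", "a pen"])   # ternary branching

print(obeys_binary_branching(good))  # True
print(obeys_binary_branching(bad))   # False
```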

Phrase structure rules get the boot


Seeing the great generalisation powers of X-bar theory, we can get rid of phrase structure rules now. Just for fun, we will use a boot from the 1600s:


You may be thinking that our booting is quite premature because our new rewrite rule would result in nonsensical phrases:

(2a) *the computer Chomsky
(2b) *happy the idea

Our response to that is, well, deal with it. There is another module of GB theory related to this phenomenon, and we'll come back to it soon enough. For now, though, we need to deal with a rather more pressing problem. What the heck do we do with the residue of our old phrase structure days... the sentence? The rewrite rule is still here, and it is not in X-bar format:

(3) S → NP VP

What do we do with it?

IP and CP


In our last chapter, we discovered that auxiliaries are part of the VP, and thus reverted to this phrase structure rule for sentences:

(1a) S → NP VP


Still, this is highly problematic as it is not an X-bar rule!

You could, of course, treat NP as a head, and S as a third-level projection of VP.

(1b) VP‴ → NP V″


This analysis has two problems, as it violates two things we learnt in the last chapter:

  • Heads must be lexical categories.
  • We only have two levels of projection.

We will now look for a new phrase to replace the sentence.

Inflection as a head


There must be some mysterious lexical category that serves as the head of the sentence. What could it be? Firstly, we should look at the word to. You might be tempted to think that to is simply a specifier:


Yet there is good reason to believe that this is not the case. A specifier cannot choose its head. The behaviour of to indicates that it is pretty capable of choosing what comes after it:

(2a) to do homework
(2b) *to does homework
(2c) *to did homework

Recall that in our introductory book on linguistics, we learnt that inflection occurs when we mark a word with one or more grammatical categories. Clearly, to only precedes verbs without inflection. Thus we can reach this subcategorisation frame:

Subcategorisation frame of to

to [__ VP[+bare]]

Since to is obviously not a specifier and can choose what comes after it, we are forced to conclude that it is a head.

Yet to applies only to non-finite clauses, i.e. clauses that are not inflected. Then what could possibly apply to finite clauses? Yep, you guessed it - the finite inflection.

To is not alone in requiring uninflected verbs after it. Modal auxiliaries are perfectly capable of doing this as well:

(3a) will do homework
(3b) *will does homework
(3c) *will did homework

Subcategorisation frames of modal auxiliaries

will [__ VP[+bare]]
may [__ VP[+bare]]
can [__ VP[+bare]]

This gives rise to our new lexical category, inflection (INFL/I), which projects inflection phrases (IP).

Inflection Phrase

The sentence is not a further projection of the VP, but another phrase with the INFL (I) as the head, the VP as the complement and the subject as the specifier.

As the inflection is a functional category, this type of structure is called a functional projection, contrasted with lexical projection. Here is an example of IP:


Unlike last time, the specifier is not optional. This gives rise to a new principle:

Extended Projection Principle

Each clause must have a noun phrase (or, as we will soon see, a determiner phrase) in its subject position.

Complement Clauses


In our introductory book, we mentioned complement clauses. These allow us to form sentences in a recursive manner, and are introduced by complementisers such as that, if, whether and for. Since we've already dealt with IPs, let's save ourselves the trouble of the trial-and-error process and form the complementiser phrase (CP) by analogy.

Complementisers can also choose the type of clause that can come after them. Consider:

(4a) I think that you are a genius.
(4b) *I think that you to be a genius.
(5a) It is strange of you to think that I am a genius.
(5b) *It is strange of you think that I am a genius.

(4b) and (5b) are wrong, which allows us to construct subcategorisation frames for complementisers:

Subcategorisation frames of complementisers

that [__ IP[+fin]]
if [__ IP[+fin]]
for [__ IP[+inf]]
of [__ IP[+inf]]

This allows us to construct the complementiser phrase.

Complementiser Phrase

The complement clause is not a further projection of the IP, but another phrase with the Complementiser (C) as the head and the IP as the complement.

You may have noticed that we copied the Inflection Phrase definition above and pasted it here. This is the beauty of the X-bar theory: The same structure applies to all elements. (OK, we deleted the specifier of the complementiser phrase as we are not in a position to discuss this yet.)

Here are some examples of CP trees:


One final note on CPs: complementisers have the semantic function of determining the illocutionary force of the clause. The complement of that must be a declarative, while that of if must be an interrogative, for example.

Speaking of recursion, there's another type of recursion that we covered in our introductory book, but have so far failed to address within the GB framework: adjuncts.


Recursion

Last chapter, we covered one type of recursion we have met before: the complement clause. There is another type of recursion we have thus far failed to explain: the adjunct.

Adjuncts are recursive beings


Adjuncts occur after the phrase, so we might be tempted to think that they are complements. This is wrong on many levels:

  • Unlike complements, adjuncts are recursive, and we can append lots of these guys.
  • Unlike complements, adjuncts are not subcategorised by the head.
  • Unlike complements, adjuncts can occur before the head: You are really, really annoying.

This leads us to believe that there is another system concerning adjuncts. So, without further ado, let's introduce it:

Rewrite rule for adjuncts

X′ → X′ YP

Note that the projection level remains unchanged after the addition of the adjunct. This is the beauty of the system: It allows for recursion.
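The level-preserving character of the rule is easy to see in code. The sketch below uses an assumed `(label, children)` tuple encoding; `adjoin` implements X′ → X′ YP, and no matter how many adjuncts we stack, the result is still an X′.

```python
# Adjunction sketch: X' -> X' YP returns another X', so the
# projection level never changes and adjuncts can stack freely.
def adjoin(xbar, adjunct):
    """Return a new X' whose daughters are the old X' and the adjunct."""
    label, _children = xbar
    return (label, [xbar, adjunct])

vbar = ("V'", [("V", ["sleep"])])
for adverb in ["furiously", "noisily", "happily"]:
    vbar = adjoin(vbar, ("Adv''", [adverb]))

print(vbar[0])  # still V' after three adjunctions
```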

Let's look at a few examples of this type of recursion in action:



This also explains why adjuncts are not as close to the head as the complement:


This schema would not allow *I talk sadly about the incident.

Of course, this raises the problem of adjuncts that do occur closer to the head than the complement, as in this French example:

(1) Je parle tristement de l'affaire.
(I talk sadly about the incident)

de l'affaire is subcategorised by parler and is thus not regarded as an adjunct, yet it is closer to the head than tristement, an adjunct. This can be solved by changing our rewrite rule:

Rewrite rule for adjuncts

Xn → Xn Ym, where n ∈ {0, 1, 2} and m ∈ {0, 2}

In a few chapters, we will discuss an alternative solution to this phenomenon.

DP Hypothesis


Although we have so far treated the determiner and possessor as the possible specifiers of the noun phrase, there are several problems with this:

  • Determiners are functional, but possessors are lexical.
  • Determiners are words, but possessors are phrases.
  • The two are not in complementary distribution in some languages, like Hungarian. Since the two can co-exist, they cannot both fill up the specifier slot.
  • There is some evidence suggesting that the determiner is a head. For example, incorporation is a process whereby two heads join to form a complex head. In French, the prepositions à and de can join with the determiners le and les through incorporation:

(1a) *Je suis à le parc → Je suis au parc
(1b) *Je demande à les étudiants → Je demande aux étudiants
(1c) *la plupart de les étudiants → la plupart des étudiants

Since the determiner seems to be a head, the determiner phrase has been suggested as its maximal projection.

The Determiner Phrase


The determiner phrase suggested by Abney has an X-bar structure like any other phrase:


This gives us a 'bare-bones' structure of the DP. Let's flesh it out by filling up the specifier slots of the DP and the NP.

Phrases with possessives


Let's consider the case of DPs with possessives. Under the DP hypothesis, the possessive phrase is assumed to be the specifier of the DP. This leads to an important question, though: What sits in the head position of the DP?


For now, we will insert here an empty possessive determiner:


Phrases with post-determiners


The DP hypothesis also provides a home for post-determiners, like many and several. They have determiner-like properties, but are not exactly determiners, and they have adjunct-like properties, but aren't exactly adjuncts, as we can see here:

(2a) The many people
(2b) *The many several people

(2a) shows that they aren't in complementary distribution with determiners, but (2b) shows that they aren't adjuncts. The specifier position of the NP can now accommodate these determiner-like entities:





Split INFL Hypothesis


In languages like English or French, tense and agreement are often represented by a single morpheme. Consider:

(1a) He drinks milk.
(1b) Ils sautaient.
(They jump-imperfect)

In (1a), -s indicates both third-person singular agreement and present tense, while in (1b), -aient indicates third-person plural agreement and the imperfect (past tense + imperfective aspect).

However, not all languages are like this. Some languages have no agreement at all, such as Mandarin (Putonghua):

(2) Wo/ni/ta/women/nimen/tamen qu-le caiguan.
(I/you-sing/(s)he/we/you-pl./they go-asp. restaurant)

Others have separate morphemes for tense and agreement, like Finnish:

(3) otin

In the Finnish example, -i- is the imperfect (tense) suffix and -n is the person suffix.

To account for multiple inflectional morphemes, the Split INFL Hypothesis has been proposed. It suggests that IPs are actually split into agreement phrases (AgrP) and tense phrases (TP).

Finnish structure


As we have seen in the Finnish example above, the tense morpheme is attached before the person morpheme. This suggests that the tense is closer to the verb than the person, i.e. this structure:


The Agr and T morphemes move and stick to the VP later, as we will soon see.

Pollock's structure


Jean-Yves Pollock famously proposed the Split INFL Hypothesis in 1989 through a comparison of French and English syntax. As this is just a beginner's book, we will offer a simplified account of his proposal without going into marginal cases and exceptions involving auxiliary verbs. Refer to the following sentences:

(4a) Je mange toujours des pommes.
(I eat always art. apples)
(4b) I always eat apples.
(4c) Je ne mange pas de pommes.
(I ne eat not art. apples)
(4d) I do not eat apples.

This gives us the (tentative) analysis below, in which English finite inflections move down to the V position and stick there, whereas French verbs move up to the I position and stick there, indicating a parametric difference (we have omitted the ne, as in colloquial French, to keep things simple):



Is this analysis sufficient? Obviously, it isn't, or it wouldn't have led Pollock to the hypothesis. This was the reason:

(5a) Ne pas manger de pommes
(ne not to-eat art. apples)
(5b) Not to eat apples

Interestingly, in French infinitives, the verb stays to the right of pas, not much different from English. What's going on? One could potentially suggest that French verbs don't move out of the VP in infinitives:


Yet this leads to another problem:

(5c) Ne pas manger toujours de pommes
(ne not to-eat always art. apples)

We're now cornered. It's a 'damned if you do, damned if you don't' situation: If the V moves, we cannot accommodate pas, which is left of the verb, and if the I moves, we cannot accommodate the adverb, which is right of the verb. There has to be something between pas and the adverb, but what is it? Pollock suggests that there are actually three layers of stuff involved: the agreement phrase (AgrP), the negative phrase (NegP), consisting of the ne head and the pas specifier, and the tense phrase (TP). The details of how he came to this conclusion are too complex to be presented here, but the following diagram should show the relationship between the phrases:


Chomsky's structure


If we look carefully at French, it may in fact appear that the tense and agreement in French aren't a single morpheme after all. Refer to the following:

Conjugations of the imperfect indicative of the French verb prendre

Je prenais
Tu prenais
Il prenait
Nous prenions
Vous preniez
Ils prenaient

Conjugations of the present indicative of the French verb connaître

Je connais
Tu connais
Il connaît
Nous connaissons
Vous connaissez
Ils connaissent

The above data suggest that -s, -s, -t, -ons, -ez and -ent are actually typical person suffixes of French verbs, while -ai- and -i- act as imperfect suffixes. In fact, as we can see from prendre, the tense morpheme precedes the person morpheme, just like in Finnish, suggesting that the AgrP sits above the TP!

VP-Shell Hypothesis


So far, in our analysis of X-bar theory, we have only looked at verbs with one complement. However, a problem arises: what about verbs with two complements? Here are some examples:

(1a) I gave Mary a pen.
(1b) He got a warning from the teacher.

In (1), both sentences have two complements. It may be tempting to give them this structure:


However, that violates the Binary Branching Condition. Now, it may again be tempting to try this structure, where the second complement is placed in adjunct position:


Indeed, for some time, these were the standard treatments of double complement constructions in GB theory. But do they work?

Asymmetries of double objects


Treating VPs as ternary branching structures, or treating the second complement as an adjunct, is not without problems. While ternary branching works somewhat for examples like (1b), both analyses fail for V + NP + NP constructions like (1a), as we will demonstrate here. However, before we proceed, we need to look at the concept of c-command.



α c-commands β iff neither dominates the other and the node immediately dominating α also dominates β.

In simpler terms, α c-commands β iff β is the sister of α or is dominated by the sister of α.

You can find c-commanded nodes visually by following a simple rule: Go up one then down one or more, and any node you land at will be c-commanded by the node you started at:
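The sisterhood formulation of c-command translates directly into a tree-walking routine. The Python sketch below is our own illustration, assuming `(label, children)` tuples and identity-based node comparison (labels can repeat); the example tree is a simplified shell for gave nobody anything.

```python
# C-command sketch: alpha c-commands beta iff beta is alpha's
# sister or is dominated by alpha's sister.
def parent_of(tree, target, parent=None):
    """Find the node immediately dominating target (by identity)."""
    if tree is target:
        return parent
    if isinstance(tree, tuple):
        for child in tree[1]:
            found = parent_of(child, target, tree)
            if found is not None:
                return found
    return None

def dominates(tree, target):
    if tree is target:
        return True
    if isinstance(tree, tuple):
        return any(dominates(c, target) for c in tree[1])
    return False

def c_commands(root, alpha, beta):
    parent = parent_of(root, alpha)
    if parent is None:                 # alpha is the root: commands nothing
        return False
    sisters = [c for c in parent[1] if c is not alpha]
    return any(s is beta or dominates(s, beta) for s in sisters)

nobody   = ("D''", ["nobody"])
anything = ("D''", ["anything"])
vp = ("V'", [nobody, ("V'", [("V", ["gave"]), anything])])

print(c_commands(vp, nobody, anything))  # True: first object commands second
print(c_commands(vp, anything, nobody))  # False: but not vice versa
```

This asymmetry is exactly what the three problems below exploit.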


Problem 1: Reflexives and Reciprocals


Reflexive and reciprocal pronouns can only be referentially dependent on expressions that c-command them, as you can see below (we will expand on this many chapters later):

(2a) I_i introduced myself_i
(2b) *Myself_i is so proud of me.


(2a) works as I c-commands myself, as you can see. (2b) doesn't work. As nothing c-commands myself, it cannot refer to anything, and the sentence is thus ungrammatical. In double-object constructions, the second object can be referentially dependent on the first, but not vice versa:

(3a) I showed Mary_i herself_i
(3b) *I showed herself_i Mary_i

This suggests that Mary c-commands herself in (3a).

Problem 2: Quantifiers


A very similar situation arises with quantifiers. Refer to the following:

(4a) Every good boy_i does his_i homework.
(4b) *His_i homework is done by every good boy_i.


(4b) doesn't work. As every good boy does not c-command his, his does not refer to anything. Now look at double-object constructions:

(5a) He gave every student her essay.
(5b) *He gave its writer every essay.

Again, analogously, every student should c-command her essay in (5a). Note that if (5b) were reformulated as He gave every essay to its writer, the sentence would be grammatical.

Problem 3: Negative Polarity Items


Yet another similar problem occurs when we use negative polarity items, such as any:

(6a) Nobody knows anything about linguistics there.
(6b) *Anybody knows nothing about linguistics there.


As you can guess, double object constructions also show that the first object c-commands the second:

(7a) I gave nobody anything.
(7b) *I gave anybody nothing.



There are other cases where the two objects are not created equal, but this will do for now. With the above data, we can prove the ternary and adjunct hypotheses false:

  • If the adjunct hypothesis were true, the second object would c-command the first object, rather than vice versa, which is the actual situation. The adjunct hypothesis is thus false.
  • If the ternary branching hypothesis were true, the two objects should c-command each other, but it is clear that the second object doesn't c-command the first. Thus the ternary branching hypothesis isn't true, either.

This led to a third hypothesis, the VP shell.

VP-Shell Hypothesis


In light of the evidence above, a new structure for double-object constructions has been introduced, in which the verb originates between the two objects and moves to the front (the structure for V + NP + NP constructions is actually slightly more complicated than has been shown here, but we're not in a position to reveal the whole story):


With this new construction, we can explain all our problems above. We will give one example, and you can work out the rest on your own:


In gave nobody anything, nobody c-commands anything, so the negative polarity item is grammatical. In gave anybody nothing, anybody is not c-commanded by nothing, so the phrase isn't grammatical.

Further Evidence: Idioms


The VP-Shell Hypothesis is supported by the fact that in some dative constructions, where the PP complement always follows the NP complement, the verb and the PP form an idiom:

(8a) He takes his girlfriend for granted.
(8b) He takes his girlfriend to the cinema.

(8a) and (8b) have the same structure, but different PP complements. The word takes does not mean the same thing in both: take for granted is an idiom in (8a). Under the VP-shell hypothesis, take for granted was originally a single phrase before takes was moved to the front.


Ergative Verbs


The VP-shell hypothesis has been further developed to account for an interesting phenomenon known as ergative verbs. Refer to the examples below:

(9a) The water evaporated.
(9b) He evaporated the water.
(9c) He made the water evaporate.

The VP-internal subject hypothesis states that the subject originates from inside the VP, and moves to the IP's subject position. For now, we'll take this for granted, but we'll return to the hypothesis soon. By this hypothesis, (9a) can be written thus:


As for our (9c), the water evaporate is a VP complement of made:


How about (9b)? This is where the abstract verbal element e comes in. (9b) originally assumes the same structure as (9c), with e as the head of the outer VP. The head of the inner VP then moves and attaches to e:

(10a) He      e       the water evaporated
(10b) He evaporated-e the water


Theta Theory


In our previous chapters, we have tracked X-bar structures down to the minutest detail. Yet if we look back, there is something fundamental that our grammar has thus far failed to explain. When we utter this sentence:

(1a) I do my homework.

We know that the speaker does his homework. Yet when we utter this:

(1b) My homework does me.

We do not interpret my homework as the agent and me as the patient; rather, (1b) simply makes no sense at all. This implies that there is something else going on, known as theta theory.

θ-grids and θ-roles


We can see that when we use different verbs, the subject and object will take up different semantic roles. For example:

(2a) I saw him.
(2b) I helped him.

In (2a), I is the experiencer and him is the theme; in (2b), I is the agent and him is the patient. Note that in this book, we are more precise with our terminology: Patients actually receive the action, while themes don't receive anything.

As you have probably guessed, semantic roles are determined in the lexicon, next to verb entries. To differentiate between GB roles and the roles we've learnt in basic semantics, we will call semantic roles theta-roles (θ-roles) and the part of the lexicon dealing with theta roles is called the theta grid.

Theta grids

jump [__ Ø]<agent>
eat [__ (DP)]<agent, patient>
see [__ DP]<experiencer, theme>

Back to our (2a), saw is known as the predicate because it expresses the state or process in which the arguments, i.e. the sentence elements with thematic roles, are concerned. (This isn't a book on semantics, so we don't need the technical definitions.)

Assigning θ-roles


When a predicate is introduced into our sentence, it assigns θ-roles to its arguments. For example, in (2a), saw assigns the θ-role of experiencer to I and the θ-role of theme to him. There's a small problem with this: sometimes, more than one θ-role can apply to a subject, and the argument itself also plays a role in θ-role determination:

(3a) The mouse triggered an event.
(3b) The user triggered an event.
(4a) John tripped.
(4b) John tripped David.
(Note: In computing, an event is an action detected by a program to be handled by an event handler. For example, when you click on a hyperlink, you trigger an onClick event.)

In (3a), the mouse is an instrument while in (3b), the user is an agent. An event remains the patient. Likewise, in (4a), John is the patient and in (4b), John is the agent. Thus the object is known as the internal argument and its θ-role is assigned by the verb; the subject is the external argument, its θ-role affected by other factors like the object in (4).

As we have seen in (4), David played a role in determining the θ-role of John, but so did tripped. This suggests that it is V′ that assigns the θ-role to John.

Yet there is another question to be asked. The subject is the specifier of the IP, which is a long way from V′. How does assignment occur over such a long distance? See, for example, how see assigns the role of experiencer in I saw him:


Linguists have thus suggested the VP-Internal Subject Hypothesis:

VP-Internal Subject Hypothesis

The subject is the specifier of the VP.

This is what our VP looks like now:


Movement occurs to move the subject from the VP to the IP when we go from D-structure to S-structure:


With the VP-Internal Subject Hypothesis, we notice that theta roles are only assigned to sisters, i.e. nodes under the same mother. This leads to the Sisterhood Condition:

Sisterhood Condition

An element can only assign θ-roles to its sisters.

Also note that, as you can see in this stage, theta role assignment, like X-bar projection, occurs at D-structure. This is a very important concept, as we will soon see in the analysis of Chinese VPs.

Theta Criterion


Apart from what we have seen above, we must note that it is impossible

  • To have an argument sans θ-role: *He jumps me
  • To have two arguments with the same θ-role: *I gave it to John to David
  • To have an argument with two θ-roles: *I bought a pen was blue.

This leads to the theta criterion, which is a principle. In Chomsky's words:

Theta Criterion

Each argument bears one and only one θ-role, and each θ-role is assigned to one and only one argument.
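The Theta Criterion amounts to requiring a one-to-one mapping between arguments and θ-roles, which is easy to check mechanically. The Python sketch below is our own illustration; the `(argument, role)` pair encoding is an assumption.

```python
# Theta Criterion sketch: an assignment is well-formed iff the
# mapping between arguments and theta-roles is one-to-one.
def satisfies_theta_criterion(assignment):
    """assignment: list of (argument, role) pairs."""
    args  = [a for a, _ in assignment]
    roles = [r for _, r in assignment]
    one_role_each = len(args) == len(set(args))    # no argument with two roles
    one_arg_each  = len(roles) == len(set(roles))  # no role on two arguments
    return one_role_each and one_arg_each

ok  = [("I", "agent"), ("it", "theme"), ("John", "goal")]
bad = [("it", "theme"), ("John", "goal"), ("David", "goal")]  # *I gave it to John to David

print(satisfies_theta_criterion(ok))   # True
print(satisfies_theta_criterion(bad))  # False
```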

Now that we've learnt θ-theory, we can look at movement.


A-Movements

Finally, with the basics of theta theory learnt, we can commence our treatment of movements in the GB framework! Let us begin by examining the sentences we looked at back in our introductory book. We had three sets of sentences:

(1a) It seems that Wikibooks is useful.
(1b) Wikibooks seems to be useful.
(2a) You will try to steal what?
(2b) What will you try to steal?
(3a) Volunteers wrote the Linguistics textbook.
(3b) The Linguistics textbook was written by volunteers.

There is a common denominator between the first and the third sets, one that you might have noticed back then (which probably feels like eons ago by now).

The common denominator is that both involve raising a DP – respectively, Wikibooks and the Linguistics textbook – to subject position. This type of movement is called A-movement (or NP movement). Firstly, though, we need to look at the important principles that guide movement.



No, UTAH is not the US state. It's the Uniformity of Theta Assignment Hypothesis (UTAH):

Uniformity of Theta Assignment Hypothesis (UTAH)

If two arguments have the same θ-role in S-structure, then they must have been generated in the same position at D-structure.

Let's use (3a) and (3b) as examples. Volunteers is the agent and the Linguistics textbook is the patient in both sentences. We can thus conclude that they were generated in the same place at D-structure.

Structural Preservation Principle

Structural Preservation Principle

X-bar structures cannot be altered by movement.

This may sound counter-intuitive, since we're transforming a D-structure into an S-structure. There are a few implications we can derive from this:

  • We cannot add new phrases, projections or other structures when we move constituents around.
  • We cannot change them: IPs remain IPs, V′s remain V′s and so on.
  • In movement, what we are doing is to move a constituent from its original location to an unoccupied location. We might, for example, move something from the head position of XP to the unoccupied head position of YP.

Forget about the syntax trees regarding movement that we met in our introductory book. They have no place here.

Since we've learnt these principles now, let us see them in action.



We will discuss three types of A-movements because they are common in English.

The Passive


Let's use our (3a), deleting by volunteers for simplicity's sake:

(3c) D-structure of (3b):            —             was written the Linguistics textbook
(3d) S-structure of (3b): The Linguistics textbook was written           —


  • The active object starts its life in the passive D-structure as an object.
  • The θ-role, patient in this case, is assigned to the object.
  • It is moved to the unoccupied specifier position of the IP to become the subject, with the θ-role retained.

We can also represent this diagrammatically:




Before we talk about raising, we need to take note of a category of predicates that take a clausal complement but no subject of their own. Consider:

Theta grids

seem [__ CP[+fin]]<theme>
appear [__ (CP[+fin])]<theme>
likely [__ (CP[+fin])]<theme>
possible [__ (CP[+fin])]<theme>

Note that likely is also a predicate, as it is the one expressing a state, not the be that precedes it. As these predicates do not take subjects, we need a pleonastic subject to satisfy the Extended Projection Principle:

(4a) It appears that I have done my homework.
(4b) It is likely that I have done my homework.
(4c) It is possible that I have done my homework.

However, for some of these predicates, the pleonastic subject can be eliminated by movement:

(5a) I appear to have done my homework.
(5b) I am likely to have done my homework.
(5c) *I am possible to have done my homework.

This is known as raising and it is another type of A-movement. Consider:

(6a) D-structure of (5a):    —    appear I to have done my homework
(6b) S-structure of (6a):    I    appear — to have done my homework 

We can note the following points about this type of movement:

  • The subject of the resulting sentence starts life as the subject of the non-finite clause, and is assigned a θ-role by its predicate.
  • The subject then moves to the specifier position of the matrix IP.


Ergative verbs


An ergative verb allows the subject to be omitted and the object to be moved to the subject position, a phenomenon sometimes called the middle voice. Consider:

(7a) I broke the windows.
(7b) The windows broke.

We can model this phenomenon as an A-Movement as well:

(7a)      —      broke the windows
(7b) The windows broke      —

We note that the situation is similar to the passive:

  • The active object starts life in the object position. Its θ-role is assigned by the predicate.
  • The active object moves to the subject position.

Let's also map this diagrammatically:


A-movements are everywhere


In the diagrams above, we have deliberately left out one important part of the puzzle: the specifier of the VP. Recall from the last chapter that the subject originates in the VP according to the VP-internal subject hypothesis. Thus, in the cases above, the subject actually moves to the specifier position of the VP before proceeding to the specifier position of the IP. A-movement thus occurs in practically every sentence.

Defining A-movements


Note that so far, all of our movements deal with DPs that have been assigned θ-roles, not just any DPs. We cannot move adjunct DPs:

(9a) Colourless green ideas sleep furiously all day long.
(9b) *All day long sleep furiously.

Note that although all day long may have a semantic role in our regular definition, it lacks a θ-role in GB theory.

This leads us to our definition of A-movements:


An argument with a θ-role is moved to the subject position.

Yet we know that DPs aren't the only elements that move - adjuncts, for example, are perfectly capable of moving. What happens then?


Ā-Movements

There was a pair we left out just now:

(1a) You will try to steal what?
(1b) What will you try to steal?

There are actually two movements in play. The first is the movement of what to the beginning of the sentence, and the second is the inversion of you and will. If you're a visual person, here's what the D- and S-structures look like:

(2a) D-structure of (1b):  —     —  you will try to steal what
(2b) S-structure of (1b): what will you   —  try to steal  —

As you can tell from the distance travelled by will and what, these are two different types of movement that have to be dealt with separately. We discuss the movement of what in this chapter; it is a form of Ā-movement, or wh-movement.



First, we have to make sure we understand what the wh-elements really are. In English, there are several wh-elements, used in interrogatives and relative clauses:

  • DPs: what, which + noun, who...
  • PPs: at which + noun, by whom + noun...
  • APs: how informed, how silly...

It is important to note that not all of these are arguments. Quite a few of them can actually be adjuncts. Consider the following examples:

(3a) Why are you here?
(3b) When did you come to school?
(3c) This isn't the road by which he came.

Movements which bring the preposition along, as in (3c), are called pied-piping.

Regardless of whether they are arguments or not, wh-elements all move to the beginning of the clause, a non-argument position. This type of movement is the Ā-movement:


A type of movement whereby any phrase moves to a non-argument position.

We know that these movements are to non-argument positions, but there remains a question: where exactly are these positions? We've filled the IP's specifier position with the subject, so it can't be that. Well, remember that above the IP, we still have the CP structure. The head and specifier slots of the CP are open. We can conclude that the wh-element moves to the specifier position of the CP. In the next chapter, we will see that the auxiliary moves to the head position of the CP; let's leave that for now. Diagrammatically:


Note on other languages


Note that some languages do not have wh-movements. The existence of wh-movements is parametric. Consider this Cantonese example:

(6a) Nei heoi-zo bindou?
(You go-asp. where?)
(6b) *Bindou heoi-zo nei?
(Where go-asp. you?)

Focus and Topic


There is another type of Ā-movement worthy of attention:

(7a) This book, I've read a hundred times already.
(7b) Scarcely had I finished my meal when the phone rang.

Yet it would be problematic to move these to complementiser specifier position as they come after the complementiser:

(8a) I told him that this book, I'd read a hundred times already.
(8b) It's unfortunate that scarcely had I finished my meal when the phone rang.

This gave rise to the Articulated CP Hypothesis.




Head Movements


In interrogatives, if we move wh-elements to the beginning of the sentence, we need to move the auxiliary verb or be after it as well. Yes-no questions have a similar requirement. If there is no auxiliary verb, then English uses the do-insertion process to inject one:

(1a) You will go where? ⇒ Where will you go?
(1b) You do want what from me? ⇒ What do you want from me?
(1c) You are crazy? ⇒ Are you crazy?

The subject-verb inversion involved is called head movement as only the head is moved to the beginning: the rest of the phrase is untouched. It is contrasted with phrase movement, which describes the movements we've seen prior to this chapter.

Head movements

The movement of the head of a phrase, rather than the entire phrase.

I and V movements


The movement of English inflectional suffixes is a kind of head movement:


As is the movement of French verbs:


Subject-auxiliary inversion


Subject-auxiliary inversion in English moves the auxiliary to the C position:




In English, only auxiliaries can be moved through head movement. Lexical verbs cannot:

(2a) *Know you how to play football?

Note that other languages, such as French, allow lexical verbs to be moved to the front, and lack do-insertion. Consider this example:

(2b) Savez-vous jouer au football ?
(Know you to play to the football?)

We can explain this easily as the French verb can be moved out of the VP while the English verb cannot:


In our discussion of the head parameter, we discussed how word orders other than SVO can be derived. In our discussion of the articulated CP hypothesis, we covered more fluid word orders, like that of Italian, as well. However, we have yet to touch on the V2 (verb-second) word order found in the Germanic languages:

  • The verb is always the second element in the sentence.
  • The element to be emphasised comes first, and it can be a subject, an object or even an adjunct.

Trace Theory


Refer to the following sentence:

(1a) Who do you think [IP went to the cinema]?


Here, who was moved to an Ā-position and is no longer present in the original IP. Thus the original IP appears to violate the Extended Projection Principle. Since (1a) is grammatical, we can propose that there is a subject in the IP, just not a phonetically realised one. Do you remember it? Yes, we've seen this in the introduction to our introductory book on linguistics!

Evidence for trace theory


Apart from the Extended Projection Principle, other pieces of evidence point to traces too.



Refer to the following:

(2a) They compliment one another.
(2b) *I compliment one another.
(2c) It seems that they always compliment one another.
(2d) They seem [IP to compliment one another].
(2e) Who do you suppose [IP always compliment one another]?

As we can see from (2a) and (2b), the reciprocal pronoun one another only appears in the complement position of a verb when the subject is plural. To account for the grammaticality of (2d) and (2e), a phonetically null subject, namely a trace, is needed.

Here is another example if you are not thoroughly convinced:

(3a) He always compliments himself.
(3b) *They always compliment himself.
(3c) He always seems to compliment himself.

Predicative APs


A predicative adjective modifies the subject, but is the complement of the verb:

(4a) I am growing old.
(4b) The joke is growing old.
(4c) I seem [IP to be growing old].
(4d) Who do you think [IP is growing old]?

In (4a), old modifies I. In (4b), old modifies the joke. How about (4c)? How is it possible that we have a predicative adjective, and yet there is no phonetically realised subject? This is another piece of evidence supporting the existence of traces.

Impossibility of replacement


It is not possible for an element to move into a position left behind by a moved element. Take, for example:

(5a)             he could not have done what
(5b)       could he       not have done what
(5c)  what could he       not have done
(5d) *what could he have  not      done

As (5d) is ungrammatical, have cannot simply have moved into the blank I position left by could. There must have been something there blocking this movement.



The existence of traces poses a problem for theta theory, as demonstrated below:

(6a) Whomi didj you tj see ti at school?

According to the Theta Criterion, the relationship between theta roles and arguments must be bijective. However, the theme role is assigned to both whom and ti, apparently violating the Theta Criterion. The solution is to treat the antecedent and its traces as parts of a single chain, i.e. <Whomi, ti>. We then rewrite the Theta Criterion thus:

Theta Criterion

Each argument chain bears one and only one θ-role, and each θ-role is assigned to one and only one argument chain.

Note that antecedents are always found in θ′-positions, while the foot of the chain, or the base position, is always found in a θ-position.

Note that adjunct wh-elements, while not subject to the Theta Criterion, also leave traces:

(6b) Wheni willj you tj go to school ti?
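The revised Theta Criterion can be sketched in Python. This is an informal model, assuming chains are given as lists of coindexed positions and each θ-role assignment is recorded as a (chain, role) pair; the chain names are illustrative.

```python
# A sketch of the revised Theta Criterion, assuming arguments are grouped
# into chains (lists of coindexed positions) and theta-role assignments
# are recorded as (chain, role) pairs. Chain names are illustrative.

def satisfies_theta_criterion(chains, assignments):
    """Each argument chain bears exactly one theta-role, and each
    theta-role is assigned to exactly one argument chain."""
    roles = {chain: [] for chain in chains}
    for chain, role in assignments:
        roles[chain].append(role)
    one_role_per_chain = all(len(rs) == 1 for rs in roles.values())
    assigned = [role for _, role in assignments]
    one_chain_per_role = len(assigned) == len(set(assigned))
    return one_role_per_chain and one_chain_per_role

# (6a) with chain formation: <whom, t> counts as a single argument chain.
print(satisfies_theta_criterion(
    ["<whom, t>", "<you>"],
    [("<whom, t>", "theme"), ("<you>", "agent")],
))  # True

# Without chains, whom and its trace would each claim the theme role,
# violating the original, bijective Theta Criterion.
print(satisfies_theta_criterion(
    ["whom", "t", "you"],
    [("whom", "theme"), ("t", "theme"), ("you", "agent")],
))  # False
```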

Formal definition of chains


<x1, ... xn> is a chain iff xi is the local binder of xi+1 for all i between 1 and n−1 inclusive.

The definition of binding will be discussed in detail later. For now, we will say that it is a structural relationship based on c-command, as we have seen in our discussion of VP shells:

(2a) Ii introduced myselfi
(2b) *Myselfi is so proud of me.

I binds myself, but me does not bind myself here. We will also define local binders as follows:

Local Binder

α is the local binder of β iff α binds β and there is no γ such that γ binds β but not α.
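The definitions above can be sketched on a toy tree. This is a hypothetical model, assuming the chapter's working definition that binding is c-command plus coindexation; the tree encodes (2a), I introduced myself.

```python
# A toy model of binding and local binding, assuming binding is defined
# as c-command plus coindexation. The tree for (2a) is hypothetical.

class Node:
    def __init__(self, label, children=(), index=None):
        self.label = label
        self.children = list(children)
        self.index = index  # coindexation, e.g. the i in "I_i ... myself_i"

    def dominates(self, other):
        return any(c is other or c.dominates(other) for c in self.children)

def parent_of(node, root):
    for child in root.children:
        if child is node:
            return root
        found = parent_of(node, child)
        if found is not None:
            return found
    return None

def c_commands(a, b, root):
    """a c-commands b iff neither dominates the other and the first
    branching node dominating a also dominates b."""
    if a is b or a.dominates(b) or b.dominates(a):
        return False
    p = parent_of(a, root)
    while p is not None and len(p.children) < 2:
        p = parent_of(p, root)
    return p is not None and p.dominates(b)

def binds(a, b, root):
    return a.index is not None and a.index == b.index and c_commands(a, b, root)

def locally_binds(a, b, root, all_nodes):
    """a binds b, and no g binds b without also binding a."""
    return binds(a, b, root) and all(
        binds(g, a, root) or not binds(g, b, root)
        for g in all_nodes if g is not a
    )

# (2a) I_i introduced myself_i: [IP [DP I]_i [VP introduced [DP myself]_i]]
i_dp = Node("I", index="i")
myself = Node("myself", index="i")
vp = Node("VP", [Node("introduced"), myself])
ip = Node("IP", [i_dp, vp])
nodes = [i_dp, myself, vp, ip]

print(binds(i_dp, myself, ip))             # True
print(binds(myself, i_dp, ip))             # False: myself does not c-command I
print(locally_binds(i_dp, myself, ip, nodes))  # True
```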


Subjacency

Despite what we saw last time, movement is not unrestricted. It is not always possible to move anything anywhere. Consider these examples:

(1a) You did do what? ⇒ What did you do?
(1b) John did think you did what? ⇒ What did John think you did?
(1c) John asked if Mary thought I had really stolen what? ⇒ *What did John ask if Mary thought I had really stolen?

This shows that there seem to be restrictions on how far elements can move, leading to bounding theory.



Although movement is clearly restricted, it is often hard to pinpoint a specific node that prevents movement. As we have seen above, moving a wh-element to the front may be grammatical across one clause boundary, but not across two. Subjacency is a proposed principle that explains why movements of certain lengths cannot occur.

Subjacency Principle

Movements can cross a maximum of one bounding node.

What counts as a bounding node is parameterised across languages. Consider this ungrammatical example:

(2a) [CP   —     —  [IP John did2 ask [CP if [IP Mary thought [CP [IP I had really stolen what1]]]]]]?
(2b) [CP What1 did2 [IP John  t2  ask [CP if [IP Mary thought [CP [IP I had really stolen  t1  ]]]]]]?

In our example, what1 must have jumped across at least two bounding nodes, which is unacceptable. The question remains: is the bounding node CP or IP? We can consider these two examples, one of an interrogative and one of a relative clause:

(3a) [CP   —     —  [IP John did2 ask [CP if [IP I had really stolen what1]]]]?
(3b) [CP What1 did2 [IP John  t2  ask [CP if [IP I had really stolen  t1  ]]]]?
(3c) [CP          —          [IP I wonder [CP whether [IP it is impossible [IP PRO to judge [DP whose ability]1]]]]]
(3d) [CP [DP whose ability]1 [IP I wonder [CP whether [IP it is impossible [IP PRO to judge        t1          ]]]]]

Here, only one CP node was crossed, but multiple IP nodes were, and the resulting sentences are wrong. Clearly, it is IP, not CP, that matters.

However, note that sometimes multiple movements occur. In this case, as long as each movement only hops across one bounding node, the sentence is fine:

(4a) [CP   —     —  [IP Mary did2 think [CP    —  that  [IP I had really stolen what1]]]]?
(4b) [CP   —     —  [IP Mary did2 think [CP what1 that  [IP I had really stolen  t1  ]]]]?
(4c) [CP What1 did2 [IP Mary  t2  think [CP   t1   that  [IP I had really stolen  t1  ]]]]?

Note that what first moves to the specifier position of the CP it's in, crossing one IP node on the way. Then it moves to the specifier position of the greater CP, leaving a trace in the original CP and again crossing an IP node.

IPs are not the only bounding nodes in English. DP exhibits similar properties.

(5a) [CP   —   —   [IP you do2 disagree [PP with [DP the criticism [PP of [DP John's discovery of what1]]]]]]
(5b) [CP   —   —   [IP you  t2 disagree [PP with [DP the criticism [PP of [DP John's discovery of  t1  ]]]]]]
(5c) *[CP What1 do2 [IP you  t2 disagree [PP with [DP the criticism [PP of [DP John's discovery of  t1  ]]]]]]

It is thus said that IPs and DPs are the bounding nodes of English.
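The Subjacency Principle can be sketched as a simple check. This is an informal model, assuming each movement is recorded as a sequence of hops and each hop as the list of maximal projections it crosses; IP and DP are taken as the bounding nodes, as the text concludes.

```python
# A sketch of the Subjacency Principle, assuming each movement is recorded
# as a sequence of hops, and each hop as the list of maximal projections
# it crosses. IP and DP are the bounding nodes of English, as in the text.

BOUNDING_NODES = {"IP", "DP"}

def obeys_subjacency(hops):
    """No single hop may cross more than one bounding node."""
    return all(
        sum(node in BOUNDING_NODES for node in hop) <= 1
        for hop in hops
    )

# (4): 'what' stops off in the lower spec-CP, so each hop crosses only
# one IP -- the movement is fine.
print(obeys_subjacency([["IP"], ["IP"]]))      # True

# (2): a single long hop crossing two IPs violates Subjacency.
print(obeys_subjacency([["IP", "CP", "IP"]]))  # False

# (5): crossing two DPs is just as bad.
print(obeys_subjacency([["DP", "DP"]]))        # False
```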


Barriers

Yet Subjacency is not sufficient to explain everything. Consider these examples:

(6a) [CP   —   —   [IP [CP   —   that John has found what1] is2 strange]]
(6b) [CP   —   —   [IP [CP what1 that John has found  t1  ] is2 strange]]
(6c) [CP What1 is2 [IP [CP   t1  that John has found  t1  ]  t2 strange]]
(7a) [CP   —   —   [IP I do2 think [CP   —   [IP John has found what1]]
(7b) [CP   —   —   [IP I do2 think [CP what1 [IP John has found  t1  ]]
(7c) [CP What1 do2 [IP I  t2 think [CP  t1   [IP John has found  t1  ]]

(6) and (7) are essentially the same story: an element moved across an IP node, left a trace and moved across another IP node. Under Subjacency, both are correct. However, a native speaker can intuitively tell that (6) is categorically wrong, while (7) is correct. There is but one fundamental difference between the two: in (6), what is extracted from a clause in subject position, while in (7), what is extracted from a clause in complement position.



Chomsky has thus suggested the Barriers theory, which explains why it is easier to move an element out of a complement than a subject:

Bounding Principle under Barriers
  • A constituent is L-marked if it is subcategorised by a lexical head.
  • A constituent is a Blocking Category if it is not L-marked.
  • A constituent is a barrier if it is a Blocking Category other than IP.
  • A CP inherits barrierhood from the IP it dominates: it is a barrier unless it is L-marked or its specifier position is available as an escape hatch for the moved element.
  • A movement may not move across a barrier.

We can look at our previous examples to verify this principle. In (6), the CP in subject position is not subcategorised by any lexical head, so it is not L-marked and is a blocking category, and hence a barrier, in any situation, preventing movement. In (7), the IPs are of course not barriers, and the CP is subcategorised by the verb think; after what moves to its specifier position, there are no barriers in sight. The movement is deemed grammatical.
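The boxed definitions can be sketched as follows. This is a simplified model: the L-marking status of each constituent is supplied by hand, and computing L-marking from a full tree is omitted.

```python
# A sketch of the Barriers definitions: a constituent is a blocking
# category if it is not L-marked, and a barrier if it is a blocking
# category other than IP. L-marking status is supplied by hand here.

def is_blocking_category(node):
    return not node["l_marked"]

def is_barrier(node):
    return is_blocking_category(node) and node["category"] != "IP"

# (6): a CP in subject position is not subcategorised by any lexical
# head, so it is not L-marked -- it is a barrier, and extraction fails.
subject_cp = {"category": "CP", "l_marked": False}
print(is_barrier(subject_cp))  # True

# (7): a CP complement of 'think' is L-marked by the verb -- no barrier.
complement_cp = {"category": "CP", "l_marked": True}
print(is_barrier(complement_cp))  # False

# An IP is never a barrier in its own right, even when not L-marked.
ip = {"category": "IP", "l_marked": False}
print(is_barrier(ip))  # False
```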

Relativised Minimality




Move α


After having seen so many rules regarding movement, it's about time we rid ourselves of another outdated concept: transformation rules. Rather than giving concrete rules on how constituents are moved, we now prefer to assume at first that we can move anything anywhere, and then use a set of principles to restrict movement. Since we've already done the boot thing, let's try something new. Let's use... Donald Trump.

You're fired.

Obviously, movement doesn't just happen whenever it conforms to bounding theory. There needs to be 'motivation' for movement. Case theory gives this motivation...

Case Theory


After having visited the complex approaches to bounding theory, we now dive into the weird and wonderful world that is Case Theory. GB theory has taken the concept of case from traditional grammar, and utilised it to explain many grammatical phenomena in syntax.

However, a distinction has to be made between morphological case, which is marked by inflection, and abstract case. Morphological case is absent in English except on certain pronouns (I/me, who/whom, etc.); other nominal expressions do not inflect for case. Consider:

(1a) The dog bit him.
(1b) The dog bit the boy.

In (1a), him is marked as accusative, but the boy in (1b) is not. Only him has morphological case, but both have abstract Case.

Inherent and structural case


Firstly, we need to make sure we know what a Case is. Recall from our basic linguistics book that case is a grammatical property of a noun (now a DP) that usually correlates with its role in the sentence. Consider the French examples below:

(2a) Je le tue.
(I him kill)
(2b) Je lui obéis.
(I him obey)
(2c) Je parle de lui.
(I talk about him.)

The Case of the pronoun in (2a) and (2b) depends on the verb: tuer assigns the accusative and obéir the dative. Thus we can tell that it is the verb that determines Case here. In (2c), de determines the disjunctive Case of lui. Thus prepositions can also assign Case.

In Case Theory, we also draw a distinction between inherent Case and structural Case. The former is determined by idiosyncratic properties of the verb: in (2b), for example, obéir assigns the dative Case to lui. The latter depends on structural position alone: all verbs, for example, assign accusative Case to their objects. Inherent Case is assigned at D-structure; structural Case is assigned at S-structure.

Inherent Case

Inherent Case is assigned by α to a DP iff α θ-marks the DP.

Assigning case


Marking by inflection


Since inherent Case is dependent on θ-marking, little has to be said of its assignment. However, we still need to explore the assignment of structural case.

Let's look at nominative Case. It does not depend on the idiosyncrasies of individual verbs, so it is a structural Case. However, it is not always assigned to subjects:

(3a) I wanted him to finish his homework.
(3b) It would be weird of me to do my homework.

In (3), him and me are clearly subjects, and yet they are accusative! What is going on? Note that both of the embedded clauses are non-finite, which suggests that it is the finite inflection that assigns nominative Case. In (3a) and (3b), wanted and of assigned accusative Case instead.

Here is an example of nominative case assignment:


Marking by complementiser


Complementisers are also capable of assigning Case. Consider:

(4a) I wish for him to be able to find his parents.

Here, for is a complementiser for the IP him to be able to find his parents. As we have seen above, the non-finite clause does not assign nominative case to him; instead, this is handled by the complementiser for.

m-command and government


So far, we have seen the following:

  1. Verbs and prepositions can assign Case to their complements (2)
  2. Verbs can assign Case to the subjects of their IP complements (3a)
  3. Complementisers can assign Case to the subject of their IP complements (4a)
  4. The finite inflection can assign Case to the subject

How can we explain these? Pro tem, we will ignore the second case and focus on the rest. The notion of government is central to GB theory, and it beautifully captures our first, third and fourth situations. Firstly, however, we need to take care of the concept of m-command.


α m-commands β iff:

  • α does not dominate β,
  • β does not dominate α, and
  • the first maximal projection dominating α also dominates β.

Next, we have to define what a governor, or an element that governs, is.


All lexical heads and the finite inflection are governors.

As for the formal definition of government, this harks back to bounding theory, which we saw last chapter. This is the usual definition of government:


α governs β iff

  • α is a governor,
  • α m-commands β, and
  • no barrier intervenes between α and β.

Under the relativised minimality version of bounding, the last line would be rewritten as 'there is no closer governor to β than α'.
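The definitions boxed above can be sketched on a toy tree. This is an informal model: barriers are ignored for simplicity, the tree is hypothetical, and it uses a verb and its DP complement as the example configuration.

```python
# A toy model of m-command and government, assuming the boxed definitions
# above and ignoring barriers for simplicity. The tree is hypothetical.

class Node:
    def __init__(self, label, children=(), governor=False, maximal=False):
        self.label = label
        self.children = list(children)
        self.governor = governor   # a lexical head or the finite inflection
        self.maximal = maximal     # an XP-level (maximal) projection

    def dominates(self, other):
        return any(c is other or c.dominates(other) for c in self.children)

def parent_of(node, root):
    for child in root.children:
        if child is node:
            return root
        found = parent_of(node, child)
        if found is not None:
            return found
    return None

def m_commands(a, b, root):
    """Neither node dominates the other, and the first maximal projection
    dominating a also dominates b."""
    if a.dominates(b) or b.dominates(a):
        return False
    p = parent_of(a, root)
    while p is not None and not p.maximal:
        p = parent_of(p, root)
    return p is not None and p.dominates(b)

def governs(a, b, root):
    # Barriers are ignored in this sketch: a governor that m-commands
    # its target governs it.
    return a.governor and m_commands(a, b, root)

# The V 'killed' governs (and so can Case-mark) its DP complement 'me'.
me = Node("DP-me", maximal=True)
killed = Node("V-killed", governor=True)
vp = Node("VP", [killed, me], maximal=True)

print(governs(killed, me, vp))  # True
print(governs(me, killed, vp))  # False: 'me' is not a governor
```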

Government in Case Theory


Now that we've laid out our framework, let's test-drive it on the three cases captured by government, to see how they explain case assignment.

Case 1: V and P to complements



  1. killed is a V, a lexical head.
  2. killed m-commands me:
    1. killed and me are sisters, so neither dominates the other.
    2. The first maximal projection of killed is the VP, which dominates me.
  3. There are no barriers between the V and DP (obviously, since there's nothing between them)

Thus killed assigns Case to me.

Case 2: Finite inflection to subject



  1. Ø is an I, a finite inflection.
  2. Ø m-commands I:
    1. I is the specifier and Ø is the head. Neither dominates the other.
    2. The first maximal projection of Ø is the IP, which dominates I.
  3. There are no barriers between the I and the DP (again, obviously, since there's nothing between them)

Thus Ø assigns Case to I.

Case 3: C to specifier of IP



  1. for is a C (a prepositional complementiser), a lexical head.
  2. for m-commands me:
    1. me is dominated by IP, which is a sister of for. Neither dominates the other.
    2. The first maximal projection of for is the CP, which dominates me.
  3. There are no barriers between the C and the IP (IP is not a barrier)

Thus for assigns Case to me.

Exceptional Case-Marking (ECM)


Most of the time, verbs, as we know them, cannot have IPs as complements; they must have complementisers to help the IP, well, become a complement. We have learnt this back in our introductory book on linguistics. Yet (3a) appears to defy this. Let's copy (3a) here to study it more closely:

(5a) I wanted him to finish his homework.

Could it be that there's an omitted complementiser, as in I think (that) he has done his homework? Let's try that:

(5b) *I wanted for him to finish his homework.

That (5b) is ungrammatical tells us a lot about this sentence, as we now know that want can take an IP complement. These verbs are exceptional verbs. With this in mind, we can analyse the case marking in this sentence (the tree leaves out unnecessary data):


  1. wanted is a V, a lexical head.
  2. wanted m-commands him:
    1. him is dominated by IP, which is a sister of wanted. Neither dominates the other.
    2. The first maximal projection of wanted is the VP, which dominates him.
  3. There are no barriers between the V and the DP (IP is not a barrier)

Thus wanted assigns Case to him. This is Exceptional Case-Marking.

Note that this structure is actually not very common. In French, for example, the equivalent of want, vouloir, takes a CP complement:

(6) Je veux que tu finisses tes devoirs.
(I want that you finish your homeworks)

Case Filter


An important concept in Case Theory is the Case Filter. It posits that all DPs must have abstract case.

Case Filter

Every phonetically realised DP must have abstract Case.

This explains why nouns and adjectives never take DP complements:

(1a) *the help John
(1b) *happy the incident

Does this remind you of anything? Yes, we briefly mentioned this puzzle near the beginning of the book!
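The Case Filter can be sketched as a simple check. This is an informal model, assuming each DP is recorded with whether it is phonetically realised and whether some governor assigns it Case.

```python
# A sketch of the Case Filter, assuming each DP records whether it is
# phonetically realised (overt) and whether some governor assigns it Case.

def passes_case_filter(dps):
    """Every phonetically realised DP must have abstract Case."""
    return all(dp["has_case"] for dp in dps if dp["overt"])

# 'The dog bit him': both DPs are overt and Case-marked (nominative from
# the finite inflection, accusative from the verb).
ok = [
    {"dp": "the dog", "overt": True, "has_case": True},
    {"dp": "him", "overt": True, "has_case": True},
]
print(passes_case_filter(ok))     # True

# '*the help John': N cannot assign Case, so the overt DP 'John' is
# Caseless and the filter is violated.
bad = [{"dp": "John", "overt": True, "has_case": False}]
print(passes_case_filter(bad))    # False

# A trace is not phonetically realised, so it is exempt.
trace = [{"dp": "t", "overt": False, "has_case": False}]
print(passes_case_filter(trace))  # True
```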

The case for the Case Filter




Principle of Adjacency


Case Filter and A-Movements


A-movements are motivated by the Case Filter. Let's use an example.

(2a) D-structure:  — was killed he
(2b) S-structure: he was killed —

To satisfy the Extended Projection Principle, it seems that we could simply fill the subject position of (2a) with a pleonastic it, but this is not possible:

(2c) *It was killed he

Why is this? The Case Filter comes to the rescue. he cannot be assigned Case in (2a) because passive verbs cannot assign accusative Case, as we will see later. Thus (2c) is not possible. Instead, the DP in object position moves to the subject position at S-structure, where the inflection assigns nominative Case to it.

Burzio's generalisation


Burzio's generalisation captures the properties of unaccusative, raising and passive verbs at once:

Burzio's generalisation
  • A verb without an external argument cannot assign accusative Case.
  • A verb that fails to assign accusative Case does not have an external argument.

Burzio's generalisation relates the θ-marking and Case-assigning properties of verbs. Verbs that cannot assign accusative Case lack external arguments, and vice versa. This necessitates movement:

Verb | Internal argument | External argument | Nominative Case | Accusative Case
Transitive (e.g. take) | ✓ | ✓ | ✓ | ✓
Intransitive (e.g. jump) | — | ✓ | ✓ | ✓ (cognate objects)
Unaccusative (e.g. come), ergative (e.g. melt), passive (e.g. be eaten), raising (e.g. seem) | ✓ | — | ✓ (via movement) | —

You may be wondering how intransitive verbs assign accusative Case. The answer lies in cognate objects. Intransitive verbs can carry cognate objects, but not unaccusative, ergative, passive or raising verbs:

(3a) He lives a great life.
(3b) *He came a great coming.
(3c) *He melted a great melt.
(3d) *He was killed a great kill.
(3e) *He appeared a great appearance.
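Burzio's generalisation amounts to a biconditional, which can be sketched as follows; the verb entries are illustrative, following the table and the cognate-object tests above.

```python
# A sketch of Burzio's generalisation as a biconditional: a verb has an
# external argument iff it assigns accusative Case. The entries below
# are illustrative, following the discussion in the text.

def consistent_with_burzio(verb):
    return verb["external_argument"] == verb["assigns_accusative"]

verbs = {
    "take":     {"external_argument": True,  "assigns_accusative": True},
    "jump":     {"external_argument": True,  "assigns_accusative": True},   # cognate objects
    "come":     {"external_argument": False, "assigns_accusative": False},  # unaccusative
    "melt":     {"external_argument": False, "assigns_accusative": False},  # ergative
    "be eaten": {"external_argument": False, "assigns_accusative": False},  # passive
    "seem":     {"external_argument": False, "assigns_accusative": False},  # raising
}

print(all(consistent_with_burzio(v) for v in verbs.values()))  # True

# A hypothetical verb that assigned accusative Case without an external
# argument would contradict the generalisation.
print(consistent_with_burzio(
    {"external_argument": False, "assigns_accusative": True}
))  # False
```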

We can now also explain raising verbs:

(4a) It seemed that he had finished his homework.
(4b) *It seemed that he to have finished his homework.
(4c) He seemed to have finished his homework.

(5a) It is thought that he had finished his homework.
(5b) *It is thought that he to have finished his homework.
(5c) He is thought to have finished his homework.

In (4a) and (5a), the finite inflection assigned nominative Case to he, so all is well; a pleonastic subject was used because no external θ-role is assigned. In (4b) and (5b), to cannot assign nominative Case to he, which violates the Case Filter. The result is (4c) and (5c), in which the subject of the lower IP has moved to the matrix subject position.

Case Filter and ECM


Control Theory


Many chapters ago, we mentioned the Extended Projection Principle, which requires all clauses to contain subjects. However, this leads to a problem: Are most non-finite clauses exempt?

(1a) I want [you to study hard].
(1b) I want [IP to study hard].
(1c) [IP Calling me an idiot] is not a means of constructive discussion.

In (1a), to study hard has the subject you. In (1b) and (1c), however, to study hard and calling me an idiot lack overt subjects. Are they exempt from the EPP?

The answer is 'no' because in GB Theory, linguists believe that PRO exists. It is a null subject, i.e. it is not phonetically realised.

Evidence for PRO


Different subjects


We know that the subcategorisation frame and theta grid of reserve look something like this:

reserve [__ DP (PPfor)]<agent, patient, receiver>

Refer to the following:

(2a) It was nice of you [IP to reserve a seat for me].
(2b) He tried [IP to reserve a seat for me].

In (2a), it is clear that the agent is you, while in (2b), it is clear that the agent is he. This hints that there is a non-phonetically-realised subject acting as the subject of the IP.



Refer to the following:

(3a) They always compliment one another.
(3b) *I always compliment one another.
(3c) It is silly to compliment one another.

As we can see from (3a) and (3b), the reciprocal pronoun one another only appears in the complement position of a verb when the subject is plural. To account for the grammaticality of (3c), a null subject is needed.

Here is another example if you are not thoroughly convinced:

(3a) He always compliments himself.
(3b) *They always compliment himself.
(3c) It is silly to compliment himself.

Predicative APs


A predicative adjective modifies the subject, but is the complement of the verb:

(3a) I am growing old.
(3b) The joke is growing old.
(3c) [IP To grow old] is to gain experience.

In (3a), old modifies I. In (3b), old modifies the joke. How about (3c)? How is it possible that we have a predicative adjective, and yet there is no phonetically realised subject? This is another piece of evidence supporting the existence of PRO.

PRO theorem


PRO can only appear in subject positions of non-finite clauses. You cannot use it anywhere else:

(4a) *I think PRO has not done much good.
(4b) *He cannot relate to PRO.
(4c) *Don't give him PRO.

Is there anything that the subject of a finite clause, the VP complement and the PP complement have in common...? Think about this before you read the theorem below.

PRO Theorem

PRO may only appear in ungoverned positions.

Yes, the explanation is that simple!


Wait... There's something wrong with this. Doesn't the higher IP govern PRO? Okay, let's modify our analysis a bit:

The significance of the PRO theorem will be even clearer in the next chapter.

Types of control


So far, we've postulated the existence of PRO and certain properties thereof, but we haven't really got started on how the reference of PRO is determined. This is the role of control theory.

Subject control and object control


Consider the following sets of sentences:

(5a) I want [IP you to do lots of homework].
(5b) I convinced you [IP PRO to do lots of homework].
(5c) I told you [IP PRO to do lots of homework].

(6a) I promise you [IP PRO to do lots of homework].
(6b) I am willing [IP PRO to do lots of homework].
(6c) I decided [IP PRO to do lots of homework].

In (5), PRO refers to the object you, while in (6), PRO refers to the subject I. Object control occurs in (5), and subject control in (6):

(5a) I want [IP you to do lots of homework].
(5b) I convinced youi [IP PROi to do lots of homework].
(5c) I told youi [IP PROi to do lots of homework].

(6a) Ii promise you [IP PROi to do lots of homework].
(6b) Ii am willing [IP PROi to do lots of homework].
(6c) Ii decided [IP PROi to do lots of homework].

Arbitrary control


Refer to the following example:

(7a) [IP PRO to be or PRO not to be] is the question.

It is hard to tell to whom PRO refers. It could, in fact, be anyone. This is arbitrary control.

Obligatory control and optional control


Sometimes, it is not entirely clear whether a verb has subject or arbitrary control. Refer to the following examples:

(8a) I asked him how [IP PRO to study linguistics].
(8b) He thinks it's proper [IP PRO to call a teacher by his first name].

In (8), PRO can be either I or anyone in general. This is called optional control. We can demonstrate this by rewriting the sentences after Huang (1989) (it will be obvious in the next chapter why we use reflexives):

(9a) I asked him how [IP PRO to behave myself/oneself].
(9b) I think it's improper [IP PRO to make a fool out of myself/oneself].

This is not true of other sentences, as we can see by rewriting (5) and (6):

(10a) I want [IP you to behave yourself/*oneself].
(10b) I convinced you [IP PRO to behave yourself/*oneself].
(10c) I told you [IP PRO to behave yourself/*oneself].

(11a) I promise you [IP PRO to behave myself/*oneself].
(11b) I am willing [IP PRO to behave myself/*oneself].
(11c) I decided [IP PRO to behave myself/*oneself].

This is called obligatory control.



Is there a pattern here? Williams (1980) suggests that obligatory control occurs when the controller c-commands PRO.

This is demonstrated by the following examples:

(12a) I want to be happy.
(12b) I told him to be quiet.
(12c) To smile nicely is good.
(12d) To present him is my honour.


More about control


Argument control


Control under different sentence structures


PRO traces


We wrap up the chapter by looking at the idea that PRO can move away and leave traces. Consider the following example:

(13) All he wants is PROi to be loved ti.

Here, PRO has moved from object position to the subject position of the non-finite clause via A-movement.

The pro-drop Parameter

Government and Binding Theory
Control Theory Printable version Binding Theory

Apart from PRO, which we saw in the last chapter, there is another type of null subject. For example, in written Finnish, first- and second-person subject pronouns are generally omitted (although they are retained in speech):


Personal pronouns
Finnish English
minä I
sinä you (singular)
hän he or she
me we
te you (plural)
he they
Te you (formal)

The dropping of minä, sinä, me and te is not the same phenomenon as PRO:

  • The dropping is optional.
  • If the resulting null subject were PRO, the PRO theorem would be violated.
  • The resulting null subject is not subject to control.

This led to the introduction of another null subject known as little pro. pro-drop does not apply to all languages, though. For example, English and French do not allow it:

(2a) *Is the best professor in the field of theoretical linguistics.
(2b) *Est le meilleur professeur dans le domaine de la linguistique théorique.

Properties of pro-drop languages


Rizzi's other properties


Rizzi listed several properties common to pro-drop languages:

  • No pleonastic subjects
  • Subject-VP inversion is possible
  • wh-elements can be extracted more easily in certain situations

Let's examine these properties one by one. Chinese, which is pro-drop, does not have pleonastic subjects, as the following Mandarin example shows:

(3a) Xia yu-le.
(Fall rain-asp.)
(It has rained.)
[Note: It is controversial in linguistics whether 'xia yu' is a word or a phrase, as it has the properties of both. This is irrelevant to the present discussion.]
(3b) *Has rained.

Although Italian is primarily SVO, it allows subject-VP inversion, as shown in Rizzi's own example below, while French generally doesn't:

(4a) Ha telefonato Gianni.
(has telephoned Gianni)
(Gianni has telephoned)
(4b) *A téléphoné Jean.
(has telephoned Jean)
[Note: French allows subject-verb inversions in certain cases like relative clauses, but this is again irrelevant.]

Licensing pro


Huang's modification


Binding Theory

Government and Binding Theory
The pro-drop Parameter Printable version Subjacency

So far, we have not seriously discussed the relationships of pronouns to their antecedents, although we have touched on the subject briefly, as in this example:

(1a) Ii introduced myselfi.
(1b) *Myselfi is so proud of me.

This phenomenon is actually part of binding theory, which explores such relationships in detail, and explains the ungrammaticality of sentences like (1b).



In our discussion of the VP-Shell Hypothesis, we noticed that (1a) is grammatical because I c-commands myself, and (1b) is ungrammatical because me does not c-command myself.


The relationship between the two is known as binding: I binds myself. However, c-command is not the sole criterion that determines binding:

(2a) I hate homework.
(2b) I hate myself.

In (2a), I c-commands homework, but obviously does not bind it. In (2b), I c-commands myself and binds it. We can explain this using the concept of indexation:

(2a) Ii hate homeworkj.
(2b) Ii hate myselfi.


α binds β if:

  • α c-commands β
  • α and β are co-indexed
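The definition above can be checked mechanically. The following is a minimal sketch (my own toy encoding, not a standard GB formalism): nodes of a binary-branching tree are identified by their path of child positions from the root, and c-command reduces to "the node's mother dominates the other node".

```python
# Toy encoding (invented for illustration): for "I introduced myself",
#   [IP [DP I] [VP [V introduced] [DP myself]]]
# the root IP is (), "I" is (0,), the VP is (1,), and "myself" is (1, 1).

def dominates(a, b):
    """Node a strictly dominates node b (b lies inside a's subtree)."""
    return len(a) < len(b) and b[:len(a)] == a

def c_commands(a, b):
    """Assuming binary branching: a c-commands b iff neither dominates
    the other and a's mother (the first branching node above a) dominates b."""
    if a == b or dominates(a, b) or dominates(b, a):
        return False
    return a != () and dominates(a[:-1], b)

def binds(a, b, indices):
    """a binds b iff a c-commands b and the two are co-indexed."""
    return (c_commands(a, b)
            and a in indices and b in indices
            and indices[a] == indices[b])

# "Ii introduced myselfi": the subject c-commands and is co-indexed
# with the object, so it binds it; the reverse fails.
I, myself = (0,), (1, 1)
print(binds(I, myself, {I: "i", myself: "i"}))   # True
print(binds(myself, I, {I: "i", myself: "i"}))   # False
```

Note that in a sentence like My mother introduced me, the possessor (path (0, 0)) does not c-command the object, since its mother is the subject DP rather than a node dominating the whole clause.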

Governing categories


Binding does not appear to be a sufficient explanation for reflexives like myself, however. Refer to this:

(3a) *Ii think he hates myselfi.


Although I c-commands myself and the two are co-indexed, (3a) is not grammatical. This suggests a locality constraint on reflexives: a reflexive bound by an antecedent too far away, outside some local domain, is ungrammatical. But "local domain" is vague. What exactly is a local domain?

Clause-Mate Condition


This observation led to the Clause-Mate Condition, which, however, does not stand up to scrutiny.

Clause-Mate Condition

A reflexive must be bound by its antecedent in the same clause.

Exceptional case-marking (ECM) sentences, for example, are counterexamples:

(3b) [IP I want [IP myself to be happy]].

In (3b), myself is the subject of the lower-level IP, while I is the subject of the higher-level IP, yet the sentence is grammatical.



We can further examine using these examples:

(4a) [IP I want [IP myself to be happy]].
(4b) *[IP I think [IP myself is happy]].
(4c) *[IP I want [IP someone to love myself]].


In (4c), myself is governed by love and in (4b), myself is governed by the finite inflection of is. However, in (4a), myself is governed by want. This suggests that the local domain of a reflexive is the smallest clause containing its governor. (We do not go into the details of government here; refer back to the chapter Case Theory.)



Yet another example complicates the situation:

(5a) *He disapproved of the way by which [IP the government treated himself].
(5b) *He disapproved of [DP the government's treatment of himself].

According to our previous assumption, (5b) should be grammatical, since he and himself belong to the same clause, which contains himself's governor of. Yet it is not. This suggests that DPs also limit binding. We have seen in our discussion of DPs that DP and IP structure share significant similarities; in fact, the possessor in a DP can be regarded as the subject of the DP, so (5a) and (5b) are ungrammatical for similar reasons.


By incorporating the notion of subject in our definition of the local domain, we come to what is called the complete functional complex (CFC).

Complete Functional Complex

The complete functional complex is the smallest domain that contains

  • A DP
  • The governor of the DP
  • A subject

Even the concept of CFCs does not fully capture all the possibilities of binding, however. Further modifications have been made, and the resulting domain is most commonly known as the governing category. We will not go into the intricacies of the governing category here, and will hereafter refer to the local domain as the governing category without regard to these intricacies.

Exploring the four cases


Typology of DPs


DPs can be divided into four types according to two features: [anaphoric] and [pronominal].

[anaphoric] [pronominal] Symbol Name of empty category Corresponding overt noun type
- - t Ā-trace R-expression
- + pro little pro pronoun
+ - t A-trace anaphor
+ + PRO big PRO /

In the sections below, we will look at each type and its corresponding principle in Binding Theory.

Anaphors and A-traces


In GB theory, anaphor refers not to pronouns in general, but to reflexives and reciprocals. In English, these include myself, yourself, each other, one another and so on. Anaphors and A-traces have the features [+anaphoric] and [-pronominal] and are subject to Principle A of Binding Theory:

Principle A of Binding Theory

Anaphors must be bound in their governing categories.

Here are some examples:

(6a) [GC Ii introduced myselfi].
(6b) *Ii think [GC he introduced myselfi].
(6c) [GC Hei seemed ti to be unhappy].
(6d) *Hei seemed [GC ti was unhappy].

We will not dwell on (6a) and (6b), as our explanations above suffice. In (6c), the non-finite to cannot govern the trace, so the governing category is the whole sentence, with the subject hei and the governor seemed, and the trace is bound within it. (6d) is ungrammatical because the trace is not bound within the IP ti was unhappy, in which the finite inflection of was governs the trace.

Pronouns and pro


In GB theory, pronouns exclude reflexives and include words like it and him. They behave in the same way as pro, since pro appears where such pronouns are left out. Both are [-anaphoric] and [+pronominal], and subject to Principle B of Binding Theory:

Principle B of Binding Theory

Pronouns must be free in their governing categories.

(7a) [GC I introduced him].
(7b) *[GC Hei introduced himi].
(7c) Zhe-ge youxii hen haowan, [GC wo keyi jieshao proi gei ni].
(this-classif. game very fun, I can introduce pro to you)
(This game is fun; I can introduce it to you.)
(7d) *[GC Tai jieshao-le proi] hou, women jiu kaishi chifan-le.
(he introduce-asp. pro after, we then started eat-asp.)
(After he introduced himself, we started eating.)

In (7a), both I and him are free. In (7b), him is bound by he, which is impermissible. In the Mandarin sentence (7c), pro refers to this game and is free in its governing category. In (7d), pro is bound by ta within its governing category, so the sentence is ungrammatical.

Note, however, that (7e) is grammatical:

(7e) Myi mother introduced mei.

This is because my does not c-command me:


R-expressions and Ā-traces


An R-expression has no antecedent: it refers directly to something in the real world, like Wikibooks, Barack Obama, those colourless green ideas and the sandcastle I just built. Both R-expressions and Ā-traces are [-anaphoric] and [-pronominal], and therefore subject to Principle C of Binding Theory:

Principle C of Binding Theory

R-expressions and Ā-traces must be free everywhere.

(8a) Ii dare say there is absolutely not a living soul on the earth [GC who does not hate Derrickj].
(8b) *Ii dare say there is absolutely not a living soul on the earth [GC who does not hate Derricki].
(8c) Whati doj you tj think he has done ti?

Although Derrick is free in its governing category in (8b), it is not free everywhere: I still c-commands and is co-indexed with Derrick, so Derrick is bound by I and the sentence is thus ungrammatical.

As PRO is [+anaphoric] and [+pronominal], it faces two conflicting requirements regarding its governing category: it must be both bound and free in its governing category at the same time. This is impossible... unless it has no governing category in the first place. This explains the PRO theorem, which states that PRO must appear in ungoverned positions.

Logical Form

Government and Binding Theory
Binding Theory Printable version

In this book, we have discussed syntax in great detail without touching on semantics. However, there is an interface between syntax and semantics, and we cannot simply discard it: without semantics, syntax would be of no use. Thus far, we have discussed two levels of representation, the D-structure and the S-structure. It may be tempting to think that one of these is the semantic interface, but we can see that this is not the case. Scope ambiguity, for example, often occurs in English. Refer to the following example:

(1) Everybody heard something.

In (1), there is no movement between the D- and S-structures, so the argument applies to both. Let H be the predicate '____ heard ____', T the predicate '____ is a thing', and P the predicate '____ is a human'. There are two possible interpretations:

(2a) ∃x(Tx & ∀y(Py → Hyx)) (there is some one thing that every human heard)
(2b) ∀x(Px → ∃y(Ty & Hxy)) (every human heard some thing or other)
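To see concretely that these are genuinely different readings, here is a toy model-theoretic check (a Python sketch of my own; the individuals and the heard relation are invented) in which the two quantifier scopes yield different truth values:

```python
# Toy model (invented for illustration): P = humans, T = things,
# H = the "heard" relation as a set of (hearer, thing) pairs.
P = {"al", "bo"}
T = {"bell", "drum"}
H = {("al", "bell"), ("bo", "drum")}   # each person heard a different thing

# Reading (2a): some one thing was heard by every human.
wide_exists = any(all((p, t) in H for p in P) for t in T)

# Reading (2b): every human heard some thing or other.
wide_forall = all(any((p, t) in H for t in T) for p in P)

print(wide_exists, wide_forall)   # False True: the readings come apart
```

Since no single thing was heard by both people, reading (2a) is false in this model while reading (2b) is true, so the two formulas cannot be notational variants of one meaning.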

If neither D-structure nor S-structure is the syntax-semantics interface, then what?



Linguists have proposed a third syntactic level of representation that serves as the interface between syntax and semantics: the Logical Form (LF). Quantifiers are moved to the front at LF. Refer to the following example (let i be a constant denoting whoever I is, and E the predicate '____ ate ____'):

(3a) S-structure: I ate everything.
(3b) LF: Everything I ate.
(3c) ∀x(Tx → Eix)
(3d) For every x, where x is a thing, I ate x.

The S-structure (3a) is transformed into the LF (3b), which in turn is interpreted by our conceptual-intentional system as the semantic representation (3c).

Moreover, the S-structure is also transformed into another form, the Phonetic Form (PF), which is interpreted by our sensorimotor system so that we can pronounce the sentence. This leads to the T-model of transformation below:




ECP and the Logical Form