The human brain is an amazing piece of work. Every time you utter or hear a sound, dozens of subconscious processes reduce it to one of the limited set of distinct sounds used in your language. The problem is that these distinct sounds differ from language to language. When you encounter a new language and hear a sound you're not used to, you automatically try to fit it into one of your existing categories of sounds. This can cause interesting problems.

Let's illustrate this with a (slightly hypothetical) analogy. There is one group of people from the Land of Men, and another from the Land of Women. In the Land of Men there are only a few colours: red, blue, brown, yellow, pink, green, and a few more. In the Land of Women, however, there are many more: chartreuse, magenta, terracotta, viridian, lavender rose, etc. Whole books could be written about the colours in the Land of Women, and indeed, some have.

When the men visit the Land of Women, they have no end of trouble. You see, the road signs there are colour-coded. The women have no problem with this: their stop signs are rust-coloured and their yield signs are painted in auburn. The men, however, look at both of these colours and see brown. So as far as they can tell, all stop signs are brown in the Land of Women, yet sometimes the women stop at them and sometimes they drive right through. Obviously the women must be terrible drivers. Likewise, the women notice that the men have an annoying habit of always stopping at yield signs.

Similarly, speakers of different languages compartmentalize the sounds they hear in words into different categories. For instance, in English the words 'toe' and 'so' are distinguished by their initial consonants: 'toe' begins with the sound /t/ while 'so' begins with /s/. However, many speakers of the language Tok Pisin do not differentiate between these sounds, and they may be interchanged without changing the meaning of words (e.g. [tupu] or [supu] for the word tupu, meaning 'soup'). Thus knowing how languages classify sounds is at least as important as knowing what sounds they use in the first place.

We can speak of a language's phonology as being how it carves up the acoustic space into meaningful units. This is an area of study practiced by phonologists.



The basic unit of study in phonology is the phoneme, which may be defined as a set of phones that function as one unit in a language and provide contrast between different words. In other words, a phoneme is a category that speakers of a language put certain sounds into. For instance, returning to the Tok Pisin example above, the sounds [s] and [t] would both belong to the phoneme /t/. (In the IPA, phonemes are conventionally enclosed in forward slashes //.)

As another example, try pronouncing the English words keys and schools carefully, paying close attention to the variety of [k] in each. You should find that in the first there is a noticeable puff of air (aspiration), while in the second it is absent. These words may be written more precisely phonetically as [kʰiz] and [skulz]. However, since aspiration never changes the meaning of a word in English, both of these sounds belong to the phoneme /k/, and so the phonemic representations of these words are /kiz/ and /skulz/.

It should be evident why it is appropriate to refer to the phoneme as a level of abstraction away from the phone. We have removed a layer of information which, while interesting in itself, plays no role in many aspects of a language.

The phonemic inventory of a language is the collection of phonemes in a language. We looked at English's in the last chapter.



Two phones are called allophones if they belong to the same phoneme. For instance, in Tok Pisin [t] and [s] are allophones of /t/, and in English [k] and [kʰ] are allophones of /k/.
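The relationship between phones and phonemes can be pictured as a many-to-one lookup. The following is a minimal sketch, assuming a toy table that covers only the English [k] and [kʰ] discussed above; the names ALLOPHONE_TO_PHONEME and phonemicize are invented for this illustration, and phones are given as list items so that multi-character symbols such as kʰ stay intact.

```python
# Toy allophone table (an assumption for illustration): it covers only
# the two English phones discussed in the text.
ALLOPHONE_TO_PHONEME = {"kʰ": "k", "k": "k"}

def phonemicize(phones):
    """Map a list of phones to a phonemic transcription in slashes.

    Phones missing from the table are assumed to stand for themselves.
    """
    return "/" + "".join(ALLOPHONE_TO_PHONEME.get(p, p) for p in phones) + "/"

print(phonemicize(["kʰ", "i", "z"]))           # keys:    /kiz/
print(phonemicize(["s", "k", "u", "l", "z"]))  # schools: /skulz/
```

Going the other way, from phoneme to phone, is not a simple lookup, because the choice of allophone usually depends on the surrounding sounds.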

Allophones are often conditioned by their environment, meaning that one can figure out which allophone is used based on context. For example, in most varieties of American English, the phoneme /t/ is realized as a tap [ɾ] between vowels in normal speech when not preceding a stressed vowel, as in the word "butter". In a case like this we can say that the plosive [t] and tap [ɾ] allophones of the phoneme /t/ are in complementary distribution, as every environment selects for either one or the other, and the allophones themselves may be referred to as complementary allophones. Similarly [k] and [kʰ] are in complementary distribution, as [k] mainly occurs in the sequence /sk/, while [kʰ] occurs elsewhere.
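One way to make complementary distribution concrete is to collect the environments in which each phone occurs and check that the two sets never overlap. The sketch below assumes a tiny hand-made word list in simplified transcription; the function names and the choice of "immediately preceding and following segment" as the environment are assumptions made for the illustration, not a standard procedure.

```python
def environments(words, phone):
    """Collect (previous, next) contexts of `phone`; '#' marks a word edge."""
    found = set()
    for word in words:
        padded = ["#"] + list(word) + ["#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                found.add((padded[i - 1], padded[i + 1]))
    return found

def in_complementary_distribution(words, phone_a, phone_b):
    """True if the two phones never share an environment in `words`."""
    return environments(words, phone_a).isdisjoint(environments(words, phone_b))

# Simplified toy data; each word is a list of phones.
WORDS = [
    ["kʰ", "i", "z"],            # keys
    ["s", "k", "u", "l", "z"],   # schools
    ["m", "u"],                  # moo
    ["j", "u"],                  # you
]

print(in_complementary_distribution(WORDS, "k", "kʰ"))  # True
print(in_complementary_distribution(WORDS, "m", "j"))   # False
```

With so few words, almost any two phones will look complementary, so a result of True is only suggestive. A result of False means the phones share an environment, which is compatible with either contrast or free variation; the check alone cannot tell these apart.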

By contrast, allophones may sometimes occur in the same environment, in which case they are in free variation. For example, the word-final /t/ of the English word cat may be realized either with an audible release, or with the tongue held in the gesture without being released. These phones, notated as [t] and [t̚] in the IPA, are free variants, as either is allowed to occur in the same position. Similarly [s] and [t] are free variants for some speakers of Tok Pisin.

Minimal pairs


An important question which may have occurred to you already is: how can we tell what is a phoneme? One of the most robust tools for examining phonemes is the minimal pair. A minimal pair is a pair of words which differ only in one segment. For example, the English words do /du/, too /tu/, you /ju/, moo /mu/ all form minimal pairs with each other. In a minimal pair one can be sure that the difference between the words is phonemic in nature, because the segments in question are surrounded by the same environment and thus cannot be allophones of each other. In other words, they are in contrastive distribution.
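The definition of a minimal pair is mechanical enough to sketch in code. The example below assumes a toy lexicon built from the words just mentioned (plus keys, which should match nothing), with each word stored as a tuple of phonemes; LEXICON and is_minimal_pair are names invented for this illustration.

```python
from itertools import combinations

def is_minimal_pair(a, b):
    """True if two segment sequences have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

# Toy lexicon (an assumption for illustration): word -> phonemic segments.
LEXICON = {
    "do": ("d", "u"),
    "too": ("t", "u"),
    "you": ("j", "u"),
    "moo": ("m", "u"),
    "keys": ("k", "i", "z"),
}

pairs = [
    (w1, w2)
    for (w1, s1), (w2, s2) in combinations(LEXICON.items(), 2)
    if is_minimal_pair(s1, s2)
]
print(pairs)
```

Every pairing of do, too, you and moo is reported, while keys is not paired with anything. Note that the sketch only compares segments position by position, so it would miss pairs that differ by the presence or absence of a segment.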

This is not a foolproof tool. In some cases it may, by chance, be impossible to find a minimal pair for two phonemes even though they clearly contrast. In such cases it is often possible to find near-minimal pairs, where the words are so similar that it is unlikely that any environment is conditioning an allophone.

Finally, this also requires some common sense, since phonemes may be in complementary distribution without being likely allophones. For instance, the English phonemes /h/ and /ŋ/ (both occurring in the word hung /hʌŋ/) can never occur in the same environment, as /h/ is always syllable-initial and /ŋ/ always syllable-final. However, few would suggest that these are allophones of a single phoneme: English speakers never confuse them, they are auditorily quite different, and substituting one for the other in a word would render it unintelligible. Unfortunately there is no hard-and-fast consensus on precisely how to be sure whether sounds are allophones or not, and in many languages there is vigorous debate.

Phonological Rules




Phonotactics are the rules that govern how phonemes can be arranged. Look at the following lists of made-up words:

  • Pfilg
  • Dchbin
  • Riaubg
  • Streelling
  • Mard
  • Droib

The first three are 'unpronounceable' because they violate English's phonotactic constraints: 'pf' and 'dchb' aren't allowed at the start of a syllable, while 'bg' isn't allowed at the end. The last three are nonsense words, but they do not violate phonotactics, so they have an 'English-like' feel. Lewis Carroll was particularly skilled in the art of creating such words. Some of his creations were immortalised in his poem Jabberwocky. Here are a couple of stanzas from his famed work:

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"

Note that different languages have different phonotactics. The Czech Republic has cities like Brno and Plzeň, while the Mandarin for Amsterdam is Amusitedan. Czech phonotactics allow for really complicated consonant clusters, while Mandarin allows for none.
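Phonotactic constraints can be thought of as a filter on candidate words. The following is a rough sketch, assuming spelling as a stand-in for phonemes, whole words rather than syllables, and deliberately tiny hand-picked sets of permitted word-initial and word-final consonant clusters; the sets are illustrative only and nowhere near a full description of English.

```python
import re

# Illustrative subsets only (assumptions), written in ordinary spelling.
ALLOWED_INITIAL = {"", "d", "dr", "m", "p", "r", "s", "st", "str", "t", "tr"}
ALLOWED_FINAL = {"", "b", "d", "g", "n", "ng", "rd", "t"}

def edge_clusters(word):
    """Return the consonant-letter runs at the start and end of a word."""
    word = word.lower()
    initial = re.match(r"[^aeiou]*", word).group()
    final = re.search(r"[^aeiou]*$", word).group()
    return initial, final

def looks_english(word):
    """True if both edge clusters are in the toy permitted sets."""
    initial, final = edge_clusters(word)
    return initial in ALLOWED_INITIAL and final in ALLOWED_FINAL

for w in ["Pfilg", "Dchbin", "Riaubg", "Streelling", "Mard", "Droib"]:
    print(w, looks_english(w))
```

Under these assumptions the first three made-up words are rejected and the last three accepted. A more faithful check would work on phonemic transcriptions, split words into syllables, and use a different permitted set for each language, which is where Czech and Mandarin would differ most sharply.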

Coarticulation Effects




Morphophonology (or morphophonemics) looks at how morphology (the structure of words) interacts with phonology. In morphophonology one may talk about underlying or morpho-phonemic representations of words, which form a level of abstraction above the phonemic level. To see how this follows from the definition of morphophonology, it is necessary to look at an example. Compare the Biloxi words:

  • de 'he goes' - da 'don't go'
  • ande 'he is' - anda 'be!'
  • ide 'it falls' - ide 'fall!'
  • da 'he gathers' - da 'gather!'

Some also use this approach to deal with cases of neutralization and underspecification. Compare the Turkish words:

  • et 'meat'
    • eti 'his meat'
  • et 'to do'
    • edi 'he does'

Similar patterns in other Turkish words show that while final stops are always devoiced, some will always voice when followed by a vowel added by suffixing, while others always stay voiceless. Phonemically both ets must be represented as /et/, because phonemes are defined as the smallest units that can make words contrast (that is, be distinguishable): if we said the word for 'to do' was phonemically /ed/, then the two words would have to contrast. Still, we would like to say that on a more abstract level the word for 'to do' ends in a different segment, one which does not surface (is not realized) in some positions. The level of abstraction above the phoneme is known as an underlying or morpho-phonemic representation, and as is conventional we will indicate it here with pipes ||.[1] Underlyingly, these Turkish words may be represented as |et|, |eti|, |ed|, and |edi|, and in the same way other Turkish words with this type of voicing alternation underlyingly end in a voiced stop, which surfaces as a voiceless phoneme when word-final.

The parallelism between the morpho-phonemic layer and the phonemic layer should be clear. Just as phonemes surface as phones conditioned by their environment, underlying segments surface as phonemes. The important difference is that the surfacing of morpho-phonemic segments as phonemes occurs after morphological processes (e.g. adding endings to words) take place. In a sense, morphophonology is morphologically informed, while plain phonology isn't.
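The ordering described here, morphology first and then the mapping from underlying segments to phonemes, can be sketched for the Turkish example. This is a minimal illustration, assuming a toy devoicing table for three stops and treating each letter as one segment; the names FINAL_DEVOICING and surface are invented for the sketch.

```python
# Toy table (an assumption): voiced stops and their voiceless counterparts.
FINAL_DEVOICING = {"b": "p", "d": "t", "g": "k"}

def surface(stem, suffix=""):
    """Attach the suffix to the underlying stem, then devoice a word-final stop."""
    word = stem + suffix              # morphological step comes first
    if word and word[-1] in FINAL_DEVOICING:
        word = word[:-1] + FINAL_DEVOICING[word[-1]]  # phonological step
    return word

# Underlying |et| 'meat' and |ed| 'to do', bare and with the suffix -i.
for stem in ["et", "ed"]:
    print(f"|{stem}| -> /{surface(stem)}/, |{stem}i| -> /{surface(stem, 'i')}/")
```

Both bare stems come out as /et/, while the suffixed forms /eti/ and /edi/ stay distinct, because the suffix vowel shields the stop from the word-final position before devoicing applies.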



In some theoretical frameworks of speech (such as phonetics and phonology for applied linguistics and language teaching or speech therapy), it is convenient to break up a language's sounds into categorical sounds—that is, sound types called 'phonemes'. The construct of the phoneme, however, is largely a phonological concern in that it is supposed to model and refer to a transcendental entity that superstructurally and/or psychologically sits over the phonetic realizations and common variations of a sound in a language.

For example, if the English phoneme /l/ is posited to subsist, it might be said to do so because the /l/ of 'light' creates a clear contrast with a phonetically similar sounding word, such as 'right' or 'write' (both of which have a distinct /r/ at the beginning instead of a distinct /l/). Thus, 'light' and 'write' are a 'minimal pair' illustrating that, in English at least, phonemic /l/ and phonemic /r/ are distinct sound categories, and that such a distinction holds for realized speech.

Such a model has the profound weakness of circular logic: phonemes are used to delimit the semantic realm of language (lexical or higher-level meaning), but semantic means (minimal pairs of words, such as 'light' vs. 'right' or 'pay' vs. 'bay') are then used to define the phonological realm. Moreover, if phonemes and minimal pairs were such a precise tool, why would they result in such large variations in the sound inventories proposed for languages (such as counts of anywhere from 38 to 50 phonemes for English)? Also, most words (regardless of homophones like 'right' and 'write', or minimal pairs like 'right' and 'light') differentiate meaning on the basis of much more information than a contrast between two sounds.

The phoneme is really a structuralist and/or psycholinguistic category belonging to phonology that is supposed to subsist ideally over common variations (called 'allophones') but be realized in such ways as the so-called 'clear' [l] at the beginning of a word like 'like' but also as the so-called 'dark' [l] at the end of a word like 'feel'.

Such concerns lie largely outside the realm of phonetics, because structuralist and/or psycholinguistic categories are about cognitive and mentalist aspects of language processing and acquisition. In other words, the phoneme may (or may not) be a reality of phonology; it is in no way an actual physical part of realized speech in the vocal tract. Realized speech is highly co-articulated, displays movement, and spreads aspects of sounds over entire syllables and words. It is convenient to think of speech as a succession of segments (which may or may not coincide closely with phonemes, the ideal segments) in order to capture it for discussion in written discourse, but actual phonetic analysis of speech confounds such a model. It should be pointed out, however, that if we wish to set down a representation of dynamic, complex speech in static writing, constructs like phonemes are very convenient fictions for indicating what we are trying to set down (alternative units for capturing language in written form include the syllable and the word).

Workbook section


Exercise 1: Kalaallisut


Kalaallisut, or Greenlandic, is an Eskimo-Aleut language spoken by most of the population of Greenland, and has more speakers than all other Eskimo-Aleut languages combined. While Kalaallisut is currently written using five vowel letters, it is analyzed as having only three underlying vowel phonemes. From the following words, deduce Kalaallisut's phonemic vowel inventory and what conditions the allophonic vowels:

  • assaat - forearm
  • assoqquppaa - goes windward of it
  • assoruuppoq - pulls himself together
  • ilisimannippoq - has knowledge of something
  • isuma - mind
  • kikkut - which, whom, whose (pl.)
  • mulequt - leaf
  • nukarlersaat - the youngest of them
  • nuliariipput - they are married
  • orsuut - blubber
  • paamaarpoq - is slow
  • paaq - soot
  • qinnilinnik piiaat - screwdriver
  • sakiak - rib
  • terlippoq - is safe
  • uagut - we
  • utoqqaq - old
  • uffarvik - bathtub
  • ullortuvoq - the day is long
  • versi - verse

(Words taken from a Greenlandic-English dictionary.)


  1. Other conventions, such as double pipes || || or double slashes // // may also be seen.