Cantonese sounds


Cantonese has far fewer syllables than English, and they are easily described in terms of initials and finals. A syllable usually begins with a single consonant sound, called the initial; some syllables have no initial at all. The rest of the syllable is called the final. A final has a single vowel or a diphthong (two vowels that glide from one to the other) and an optional final consonant (p, t, k, m, n, or ng).
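The initial/final division can be sketched in a few lines of Python. This is only an illustration, not part of any standard tool: the function name `split_syllable` and the `WHOLE_FINALS` set are assumptions made for this sketch, the input is assumed to be a single Yale syllable written without a tone mark or tone number, and the list of initials is taken from the table in this lesson.

```python
# Initials from the table in this lesson. Two-letter initials come first so
# that "ng", "gw", "kw" and "ch" are tried before "n", "g" and "k".
INITIALS = ("ng", "gw", "kw", "ch",
            "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "s", "y", "w")

# Assumption of this sketch: these syllables are treated as a final on their
# own, even though they start with letters that could be read as an initial.
WHOLE_FINALS = {"m", "ng", "yu", "yun", "yut"}


def split_syllable(syllable):
    """Split a toneless Yale syllable into an (initial, final) pair."""
    s = syllable.lower()
    if s in WHOLE_FINALS:
        return "", s
    for initial in INITIALS:
        # Require at least one letter after the initial to form the final.
        if s.startswith(initial) and len(s) > len(initial):
            return initial, s[len(initial):]
    return "", s  # no initial consonant, e.g. "oi"


for example in ("gwong", "heung", "ngo", "oi", "ng"):
    print(example, split_syllable(example))
```

The sketch does not check that the final is one of the finals listed in this lesson; it only separates the leading consonant from whatever follows.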

The pronunciation guide below is based on American English except where otherwise noted. Not all sounds can be described with English words, and some are approximations at best. Be sure to listen to actual speakers to ensure that your pronunciation is correct. Use the Syllabary to hear recordings of a native speaker.


Yale Pronunciation
b b as in "ball"
p p as in "pat"
m m as in "mom"
f f as in "foot"
d d as in "dog"
t t as in "top"
n n as in "not"
l l as in "lap"
g g as in "good"
k k as in "kite"
ng ng as in "singer"
h h as in "hot"
j Blend of the ds in "beds" and the j in "jam"
ch Blend of the ts in "cats" and the ch in "church"
s s as in "sun"
gw gu as in "penguin"
kw qu as in "quart"
y y as in "yard"
w w as in "want"


Yale Pronunciation
aa a as in "spa"
aai igh as in "sigh"
aau ow as in "how"
aam am as in "Vietnam"
aan on as in "con"
aang ong as in "tongs"
aap op as in "top"
aat ot as in "pot"
aak ock as in "sock"
ai i as in "kite"
au ol as in "color"
am ome as in "some"
an un as in "sun"
ang ung as in "lung"
ap up as in "cup"
at ut as in "cut"
ak uck as in "luck"
e e as in "bet"
ei ay as in "say"
em em as in "temple"
eng ang as in "angry"
ek eck as in "peck"
i ee as in "tee"
iu ew as in "few"
im eem as in "seem"
in een as in "seen"
ing ing as in "sing"
ip eep as in "sleep"
it eet as in "meet"
ik ick as in "sick"
o or as in "or" (British English)
oi oy as in "boy"
ou o as in "no"
on on as in "con" (British English)
ong ong as in "song"
ot ot as in "hot" (British English)
ok ock as in "sock"
u oo as in "too"
ui ooey as in "gooey"
un oon as in "soon"
ung combination of ou and ng
ut oot as in "boot"
uk ook as in "took"
eu er as in "her" (British English, with rounded lips)
eung combination of eu and ng
euk ork as in "work" (British English)
eui eui as in "deuil" (French)
eun ine as in "engine"
eut ut as in "put"
yu u as in "tu" (French)
yun un as in "union"
yut Ut as in "Utah"
m mm as in "hmm"
ng ng as in "sing"

Tips on finals

  • The final consonants p, t, and k are unreleased. This means that they are virtually silent and you hear no "puff of air" at the end of the syllable. To give a concrete example, say the word "cup" and do not open your lips at the end of the word. Note how there is no "puff of air" at the end. The k sound will also shift from being what is termed a velar stop to a glottal stop when it is used to add liveliness to final modal particles. The final consonant k will sometimes disappear in rapid speech as in the expression m4 sai2 haa[k]3 hei3.
  • The aa sound is a low back vowel that is slightly longer than, and different in quality from, the a sound. Be sure to note the difference, since confusing the two will change the meaning of words.
  • The vowel quality in ing, it, and ik is not the same as in in, im, or i. It is the same difference as between the English words "sin" and "seen" (or, in grammar school terms, a "short" vowel versus a "long" vowel). While this difference is not as important as the one above, since it does not contrast word meanings, you will have a much more obvious "foreign accent" if you do not master these two sounds.
  • The yu sound does not exist in English but it is not hard to produce. Start by saying a long i as in "see" and--without changing anything else!--round your lips. It's a common sound in French so "think French" if you have to.
  • The eu sound does not exist in English either but like the yu it's just a case of rounding the lips. Start by saying the e sound in "bet" and--without changing anything else!--round your lips.
  • The eui sound is simply a fusion of eu and i ("eu-ee") into a single syllable.
  • The o sound does not exist in American English, but it does in British English. It is the back rounded vowel that you hear when British people say "more" or "scorn". If you listen carefully to a British speaker, you'll notice they do not pronounce the "r" in these words. It is the quality of the "o" vowel that makes these words sound distinctive to American speakers' ears.
  • The final eung has a faint, British r sound riding on the vowel, so the number two would be pronounced somewhat like "leurng". Be careful -- pronouncing this sound with an American r is a common mistake that sounds extremely foreign.
  • Only the finals m and ng can be used as standalone nasal syllables.
  • Vowels preceding nasal final consonants are not nasalized, yet vowels following nasal consonants are. You can hold your nose while pronouncing some syllables such as sin1 or ngo5 to test your pronunciation. It's not critical to get this right, but doing so will reduce any apparent foreign accent.



All the Chinese languages are tonal. This is usually one of the biggest challenges for English speakers to overcome, since English does not use tones to distinguish meanings. English does, however, use stress to distinguish meanings. Try it for yourself: what does 'pro.test mean versus pro.'test?

There is a common misconception that tones require a musical ear or that you must use a particular pitch. In fact, tones are always relative. For instance, if a tone is "high falling", the key is to start on a pitch in the upper range of your normal voice and let it drop toward the lower range, so that you end on a pitch lower than where you started. As with pronunciation in any language, success lies in a lot of listening and practicing. You can and will eventually master the tones if you conscientiously work on them.

Tone categories and numbers

Yale    Description                  Alternative Notation   Start to End Pitch
ā, à    high level or high falling   1                      55 or 53
á       mid rising                   2                      35
a       mid level                    3                      33
àh      low falling                  4                      21
áh      low rising                   5                      23
ah      low level                    6                      22

Each Start to End Pitch value above is a pair of digits from one to five, with one being the lowest pitch and five the highest; the first digit is where the tone starts and the second is where it ends. Cantonese school children typically recite the tones in the order listed above: 1, 2, 3, 4, 5, 6, with tone 1 being high falling. For historical reasons, high level is not recited. For some speakers, the drop in the low falling tone is considerable and a "creakiness" can be heard in the pronunciation.
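The pitch numbers can be read mechanically, as the following minimal Python sketch shows. The dictionary simply restates the tone table from this lesson; the names `TONES` and `contour_shape` are made up for this illustration.

```python
# Tone number -> (description, possible start-to-end pitch values),
# copied from the tone table in this lesson.
TONES = {
    1: ("high level or high falling", ("55", "53")),
    2: ("mid rising", ("35",)),
    3: ("mid level", ("33",)),
    4: ("low falling", ("21",)),
    5: ("low rising", ("23",)),
    6: ("low level", ("22",)),
}


def contour_shape(pitch):
    """Classify a two-digit pitch value such as "35" by its direction."""
    start, end = int(pitch[0]), int(pitch[1])
    if end > start:
        return "rising"
    if end < start:
        return "falling"
    return "level"


for number, (description, pitches) in TONES.items():
    shapes = ", ".join(f"{p} ({contour_shape(p)})" for p in pitches)
    print(number, description, shapes)
```

Because the digits are relative, the sketch only compares the start and end values; it says nothing about absolute pitch.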

Tips on tones

  • Syllables that end with p, t, or k are usually considered to have only "checked" level tones. Because these syllables are so short, a rise or fall in pitch usually cannot be detected. Sometimes the pitch actually changes depending on the tones of the words before it, as in the word for "dish", dip6 versus dip2. Both can be correct, and you will notice many other low falling or low level words shift to a mid rising tone.
  • A great way to practice tones is to listen to sound files of vocabulary words and guess the tones without looking at the romanization. The better you can recognize the tones, the better you can reproduce them in your own speech.
  • As you learn more words, you will automatically be able to guess some close tones (e.g., maai5 versus maai6) on many syllables in context, so don't be too concerned if you're having trouble at first. Patience is the key!

Yale Romanization


  1. Do not put spaces between syllables that make up a single word. If the break between syllables becomes unclear, you can use an apostrophe, e.g. 二奶 (yih'nāai).
  2. The tone mark always goes on the first vowel. In the case of the words 唔 (m̀h) and 五 (ńgh), which have no vowel, the tone mark falls on the m and the n.
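As a rough illustration of these conventions, the following Python sketch converts a syllable written with a tone number (for example "heung1") into the marked form. It is a sketch under stated assumptions, not a full converter: the function name `mark_syllable` is invented here, the input is assumed to be one lowercase Yale syllable followed by a digit from 1 to 6, tone 1 is always written with the macron (high level), the "h" of the low tones is assumed to go directly after the vowel letters, and for the vowelless syllables m and ng the mark is placed on the first letter.

```python
import re
import unicodedata

# Combining marks per tone number, following the tone table in this lesson.
# Tone 1 uses the macron (high level); tones 3 and 6 carry no mark.
MARKS = {1: "\u0304", 2: "\u0301", 3: "", 4: "\u0300", 5: "\u0301", 6: ""}
LOW_TONES = {4, 5, 6}  # low tones add an "h"


def mark_syllable(numbered):
    """Convert e.g. 'heung1' to 'hēung' and 'yi6' to 'yih'."""
    match = re.fullmatch(r"([a-z]+)([1-6])", numbered.lower())
    if match is None:
        raise ValueError(f"expected letters plus a tone digit 1-6: {numbered!r}")
    letters, tone = match.group(1), int(match.group(2))

    vowels = re.search(r"[aeiou]+", letters)
    if vowels:
        # Mark the first vowel; the low-tone "h" follows the vowel letters.
        mark_at, h_at = vowels.start(), vowels.end()
    else:
        # Syllabic nasal "m" or "ng": mark the first letter, "h" at the end.
        mark_at, h_at = 0, len(letters)

    h = "h" if tone in LOW_TONES else ""
    marked = (letters[:mark_at] + letters[mark_at] + MARKS[tone]
              + letters[mark_at + 1:h_at] + h + letters[h_at:])
    # Compose letter + combining mark into a single character where one exists.
    return unicodedata.normalize("NFC", marked)


# Syllables of one word are joined without spaces (rule 1).
print("".join(mark_syllable(s) for s in ("gwong2", "dung1", "wa2")))
```

The sketch does not capitalize proper nouns or insert apostrophes between ambiguous syllables; both would still need to be done by hand.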


廣東話 (Gwóngdūngwá)
Cantonese
香港 (Hēunggóng)
Hong Kong
中國 (Jūnggwok)
China

  Listen to an audio sample of a Cantonese sentence.

Transcript of the above sample: 我屋企有兩個大人,同埋一個細路。

Translation: In my home there are two adults and a kid.