Pronunciation Goals, Part I

Minimal Pairs in Vietnamese

Feb 12, 2021

Anyone who’s taught or learnt a language should be familiar with the concept of minimal pairs. They’re words that are identical in all aspects of pronunciation bar one, and are commonly used for learning to differentiate, first in terms of listening and then speaking, the nuances of pronunciation that exist in a target language which may not be present in the native language of the learner.

An illustrative example for English might be the words beat and bit. While the difference between the long e and short i phonemes seems clear enough to a native English speaker, the two sounds are almost indistinguishable to most native Chinese speakers that I’ve previously taught (and 100% of Vietnamese speakers from my wide-ranging sample of 1 tested so far).
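For anyone who thinks better in code, here's a rough sketch of that definition: two words form a minimal pair if, once broken into sounds, they differ in exactly one position. The function name is_minimal_pair and the hand-written transcriptions are my own assumptions for illustration, not a proper phonological analysis.

```python
def is_minimal_pair(a, b):
    """Return True if two sound sequences differ in exactly one position."""
    if len(a) != len(b):
        return False
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences == 1


# Approximate, hand-written transcriptions (an assumption for illustration).
beat = ["b", "iː", "t"]
bit = ["b", "ɪ", "t"]
bid = ["b", "ɪ", "d"]

print(is_minimal_pair(beat, bit))  # only the vowel differs
print(is_minimal_pair(beat, bid))  # the vowel and the final consonant differ
```

The check works on lists of sounds rather than on spelling, because spelling and pronunciation don't line up neatly (beat has four letters but only three sounds).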

Another example, this time useful for English speakers learning Mandarin, might be the difference between sha and xia (or 沙 and 虾), which mean sand and shrimp respectively. Certainly this was a distinction I struggled to pick up when I first started studying Chinese, albeit one that’s quite easy to learn to differentiate.

The sha/xia example above is a useful tool for isolating the retroflex sh- (ʂ) sound which is significantly different from the English sh- (ʃ). (渣 - 家 and 插 - 掐 would be equivalent minimal pairs for the other two retroflex sounds, zh- and ch- [ignoring r].)

Minimal pairs, with the exception of the example above, aren’t a technique I really utilised when learning Mandarin; for all the fear that tones evoke, the other aspects of Mandarin pronunciation really aren’t that daunting once you actually encounter them. Vietnamese is, well, different.

Vietnamese has far more alien sounds, ready to overwhelm and confuse the native English speaker, and I’m hoping that regular minimal pair drills will help me finally get my tongue, and equally importantly my ears, in tune with the subtle nuances that have so far caused me so much pain. (Maybe soon I’ll be able to tell the difference between evening and hungry without needing to resort to context.)

As such, I’ve decided to make a preliminary list of all the sounds that I think I may need to focus on. This list isn’t exhaustive; I will certainly miss things that I’m not yet aware of. I’ll also probably include things that turn out to be easily distinguishable as soon as I start practising.

Once I’ve come up with my list of minimal pairs, I should be able to enlist the help of my wife to practise them (if she’s feeling cooperative). I don’t doubt that there’s a whole host of video and audio resources available on the web that I could use if I were learning independently; it’s just that I’m lucky enough to have a live-in language consultant on hand.

Now, I want to talk a little bit about each pair of sounds here, hopefully providing some helpful insights to anyone else who’s just starting out, but I don’t really have enough space in one post for all the pairs on my list. As such, I’ve broken my list into two parts: consonants and vowels.

In this part I’m going to be listing minimal pairs for consonants, both initial and final, and hopefully I can add a second post looking at vowels at a later date. I’m going to start with the initial consonants, because there are more of them and I feel like they’re more interesting to talk about.

Initial Consonants:

t and đ: technically, the first of these is a voiceless alveolar stop while the second is a voiced alveolar implosive. In practice, they both sound like the letter d (in English) to me, although I don’t think either is identical to it.

The implosive part of the đ means you more or less gulp while saying it and the voiced part means your vocal cords vibrate when you articulate it (not that I have any idea how knowing that is supposed to help you actually do it). My general impression is that đ sounds more forceful and markedly “alien”, while t sounds softer. My pairs here are tôi - đôi and tới - đới.

kh and h: these are a voiceless velar fricative and voiceless glottal fricative, respectively. The kh sound is like a throatier h, such as at the end of loch. Perhaps surprisingly, the kh sound isn’t the one that I seem to struggle with, as according to my wife it’s the softer h sound that I usually end up getting wrong. My pair here is không and hông.

*After writing this I also discovered that it’s the t above that I struggle to pronounce clearly, and not the more alien sounding đ.

tr and ch: I can’t find much of a consensus on this one; while a large part of the internet seems to argue that these are archaic remnants of two different phonemes that have already converged, my wife, backed up by others on the internet, is very much of the opinion that this difference is extant, although, as a caveat, she adds that in regular speech she can’t necessarily be bothered to differentiate between the two.

*This may be one of those things where native speakers aren’t always the best positioned to understand the subtle nuances of their own languages, and it may well turn out they’re actually the same sound for all intents and purposes. This happens with many native Mandarin speakers who insist the third tone is pronounced with a marked dip then rise in regular speech when in practice it’s only pronounced that way when speaking deliberately slowly. (I’m sure there are equally frustrating things that native English speakers insist are true but aren’t, but as a native English speaker I’m in no position to say.)

The more I read about the two the less I understand, so I’m just going to try going by ear alone and not worrying too much if I can’t get it for now. My pairs are tranh - chanh and trẻ - chẻ.

g and c\k: this one’s actually quite easy to pick up but I still feel I need some practice. The g sound is a voiced velar fricative, which contrasts with the stop written c or k in Vietnamese (an unaspirated k that, to my ear, lands close to the g sound in English). What that means, more or less, is that the Vietnamese g is pronounced much like we would pronounce a g in English but without completely closing off the flow of air through the mouth, leading to a throaty, almost choking sound that is rather uncomfortable and extremely unnatural for a native English speaker to pronounce. For this one, an example could be gây and cây.

ng and nh: as above, this one isn’t actually all that hard after a bit of practice. While ng doesn’t occur as an initial in English, it does appear quite commonly as a final, and so doesn’t take much effort, conceptually, to convert into an initial (just a little bit of muscle memory). I don’t think nh exists in English as either an initial or a final, but you can find it (or at least an approximation of it) in the transition between syllables in a word like Spaniard. One possible pair might be ngà and nhà.

Final Consonants:

c\k, t and p: I’m not fully sure how to go about explaining these. They’re technically voiceless stops which either have no audible release or only a short nasal release; essentially, you swallow the sound. Imagine the p at the end of the English word tap, for example. The p sound requires you to block airflow through your mouth by closing your lips, after which you release it to form a lightly audible p sound.

Many speakers of Mandarin, whose only final consonants are nasal in nature, struggle with this when first learning English and overemphasise the final release, turning tap into something closer to tap-puh. Well, Vietnamese goes the other way; the final release is completely inaudible. You close off the air flow in the same way you would in English, but then you’re done, that’s it, end of sound.

The upshot of this is that while pronouncing these sounds isn’t all that difficult, discerning them when someone else is speaking can be a little tricky; the bulk of the sound we’re used to hearing in our own language is completely cut out. My pairs here are sắc, sắp, and sắt.

ng and nh: I’ve yet to fully get my head around these so I can’t go into too much detail, but the nh ending appears to be identical to the ng ending, which is very much the same as the English ng ending in a word such as sing. As far as I can tell there are two different rules to be aware of.

One rule is that certain vowels, or combinations of vowels, can only be followed by one of the two endings (nh or ng), meaning no minimal pairs exist and there isn’t anything to work on. The other is that when the ending follows an a, the sound of that vowel changes based on the ending used (while the ending itself is identical in both cases).

As a result, the key difference between trang and tranh (my minimal pair for this example) is not the ending itself, but the way in which the vowel a is pronounced.

ng and m: as above, I really don’t know what’s going on with these two, but to my ear, when following certain sounds (such as the vowel ô), these two endings can sound very similar indeed. I need to look into these finals a bit more, and practise them, before I can really say what’s going on, or whether I’m just imagining similarities that don’t really exist. My final pair is hôm and hông.
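If you'd rather have the drills randomised than read off a list, the pairs from this post can be turned into a small shuffled deck. This is only a sketch of how I might do it: the names MINIMAL_SETS and build_drill are made up for illustration, and it only picks which word your helper should say; it doesn't play or check any audio.

```python
import random

# The pairs (and one triple) listed in this post, grouped by contrast.
MINIMAL_SETS = {
    "initial t / đ": [("tôi", "đôi"), ("tới", "đới")],
    "initial kh / h": [("không", "hông")],
    "initial tr / ch": [("tranh", "chanh"), ("trẻ", "chẻ")],
    "initial g / c": [("gây", "cây")],
    "initial ng / nh": [("ngà", "nhà")],
    "final c / p / t": [("sắc", "sắp", "sắt")],
    "final ng / nh": [("trang", "tranh")],
    "final ng / m": [("hôm", "hông")],
}


def build_drill(sets, rounds_per_set=3, seed=None):
    """Return a shuffled list of (contrast, options, target) drill items.

    For each group of words, one word is picked at random as the target
    that the speaker should say; the listener has to guess which it was.
    """
    rng = random.Random(seed)
    drill = []
    for contrast, groups in sets.items():
        for words in groups:
            for _ in range(rounds_per_set):
                drill.append((contrast, words, rng.choice(words)))
    rng.shuffle(drill)
    return drill


if __name__ == "__main__":
    for contrast, options, target in build_drill(MINIMAL_SETS, rounds_per_set=1):
        print(f"[{contrast}] say one of {' / '.join(options)} -> {target}")
```

Passing a seed makes the deck repeatable, which is handy if you want to redo the same session later and compare scores.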



