You will need the necessary fonts to see Khmer.
There are some important rules for reading Khmer. First and foremost, let us go over the phonology. Everything here describes the standard Khmer language.
Khmer has eighteen consonants, including aspirated, unaspirated and voiced consonants. Aspiration is a puff of air that escapes when you pronounce a consonant, like the 't' in take. In English, unaspirated voiceless consonants only exist when they follow an 's'. Compare the 't' in stop with the 't' in take, or the 'p' in spill with the 'p' in pit. In Khmer, consonants are never aspirated at the end of a syllable. Only ɓ and ɗ are voiced in Khmer, which means you have to vibrate your vocal cords. However, ɓ and ɗ are devoiced at the end of syllables and when they are the first consonant in a consonant cluster. The charts are from Wikipedia, which was surprisingly helpful.
Only p, t, c, k, ʔ, m, n, ɲ, ŋ, l, h, j, ʋ can occur at the final position, while /h/ and /ʋ/ become [ç] and [w] respectively.
The most difficult aspect of Khmer has to be its vowels. There are thirty distinct vowel phonemes, and some are difficult to grasp since many Western languages have nowhere near that many. Short vowels cannot occur in open syllables; they must be closed off by a consonant (a very important reading aid!).
Consonant clusters are abundant in Khmer. Many of the clusters are not permitted in English, but luckily many are. Aspiration is slight when a consonant occurs in a cluster, so the initial consonant of the word is often written with an aspirated consonant. Why is beyond me; it's probably for historical reasons.
There are eighty-five possible combinations of two consonants and three possible combinations of three consonants.
Syllable structure in Khmer is rather rich. 'C' stands for consonants and 'V' for vowels. All native Khmer words are either one syllable, or two syllables via affixation. Stress is always on the last syllable. Because of this, Khmer shows a sort of schwa-like effect when a two-syllable word has 'r' as the second consonant of a cluster in the first syllable: the first syllable is reduced to a schwa.
Notice that consonant clusters are not permitted at the end of words; this is very important. Words are often written with two final consonants, but you should only pronounce the last one unless a vowel follows. Three consonants in a row are often split in colloquial speech.
The Khmer Script
Consonants arranged in their traditional order.
ក ខ គ ឃ ង
ច ឆ ជ ឈ ញ
ដ ឋ ឌ ឍ ណ
ត ថ ទ ធ ន
ប ផ ព ភ ម
យ រ ល វ
ស ហ ឡ អ
kɑɑ, kʰɑɑ, kɔɔ, kʰɔɔ, ŋɔɔ
cɑɑ, cʰɑɑ, cɔɔ, cʰɔɔ, ɲɔɔ
ɗɑɑ, tʰɑɑ, ɗɔɔ, tʰɔɔ, nɑɑ
tɑɑ, tʰɑɑ, tɔɔ, tʰɔɔ, nɔɔ
ɓɑɑ, pʰɑɑ, pɔɔ, pʰɔɔ, mɔɔ
yɔɔ, rɔɔ, lɔɔ, vɔɔ
sɑɑ, hɑɑ, lɑɑ, ʔɑɑ
First Register (weak consonants)
These consonants do not change the values of the vowel diacritics attached to them. They all have an implied 'a' sound.
ក ខ ច ឆ ដ ឋ ណ ត ថ ប ផ ស ហ ឡ អ
Second Register (strong consonants)
These consonants change the values of the vowel diacritics attached to them. They all have an implied 'o' sound.
គ ឃ ង ជ ឈ ញ ឌ ឍ ទ ធ ន ព ភ ម យ រ ល វ
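If you like to think in code, the two lists above can be turned into a small lookup. This is only a rough sketch: the names FIRST_REGISTER, SECOND_REGISTER and register_of are made up for this example, and the letters are copied straight from the lists.

```python
# Letters copied from the two register lists above.
FIRST_REGISTER = set("ក ខ ច ឆ ដ ឋ ណ ត ថ ប ផ ស ហ ឡ អ".split())
SECOND_REGISTER = set("គ ឃ ង ជ ឈ ញ ឌ ឍ ទ ធ ន ព ភ ម យ រ ល វ".split())


def register_of(consonant: str) -> int:
    """Return 1 for an 'a' consonant and 2 for an 'o' consonant."""
    if consonant in FIRST_REGISTER:
        return 1
    if consonant in SECOND_REGISTER:
        return 2
    raise ValueError(f"not in either register list: {consonant!r}")


# Print the register of a few sample letters.
for letter in ["ក", "គ", "ស", "ម"]:
    print(letter, register_of(letter))
```

Anything that is not in either list raises an error, which is handy for catching a vowel sign passed in by mistake.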
Foot consonants are subscript forms used in consonant clusters.
្ក ្ខ ្គ ្ឃ ្ង
្ច ្ឆ ្ជ ្ឈ ្ញ
្ដ ្ឋ ្ឌ ្ឍ ្ណ
្ត ្ថ ្ទ ្ធ ្ន
្ប ្ផ ្ព ្ភ ្ម
្យ ្រ ្ល ្វ
្ស ្ហ ្ឡ ្អ
Here's a video from YouTube on their pronunciation.
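A side note for anyone typing or processing Khmer on a computer: in Unicode the foot consonants are not separate letters. Each one is stored as the sign U+17D2 (KHMER SIGN COENG) followed by the ordinary consonant, and the font does the stacking. The helper names foot and cluster below are made up for this sketch.

```python
import unicodedata

COENG = "\u17d2"  # the sign that tells the font to draw the next consonant as a foot


def foot(consonant: str) -> str:
    """Return the subscript (foot) form of a consonant."""
    return COENG + consonant


def cluster(first: str, second: str) -> str:
    """Write a two-consonant cluster: full first letter, foot second letter."""
    return first + foot(second)


print(unicodedata.name(COENG))
# U+178F is ត and U+179A is រ, so this prints the onset of the word for fish.
print(cluster("\u178f", "\u179a"))
```

So a foot consonant always counts as two code points, which matters if you ever index into a Khmer string.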
These are the vowel diacritics arranged in their traditional order. They cannot occur alone. Depending on whether they follow the 'a' or 'o' consonants, their values will be changed. They are paired with the consonant អ because some browsers will not display them correctly alone.
អា អិ អី អឹ អឺ អុ អូ អួ អើ អឿ អៀ
អេ អែ អៃ អោ អៅ អុំ អំ អាំ អះ
អុះ អេះ អោះ អិះ
The last vowel, អិះ, is rarely used and has the same pronunciation as អុះ in the second register.
Here is their first register pronunciation.
Here is their second register pronunciation.
As noted in the video above, three of the vowels don't change forms. They're highlighted in red.
These vowels are full-fledged vowels. They can occur on their own, although they aren't used much and they have no sound values that can't be represented by the vowel diacritics. You should familiarize yourself with them, but they won't be covered here. Here's a list.
Some of these diacritics are extremely important as they change the value of the inherent vowels. Some are optional since pronunciation can be deduced from context, but their use can help disambiguate reading.
http://img10.imageshack.us/img10/6630/k ... ritics.png
Khmer has perhaps the most complex orthographic rules for a phonetic system, but once the rules are grasped, there is a consistent correspondence between written and spoken form. The rules of standard Khmer will be applied to whatever follows.
Words without vowel diacritics
There are words where only consonants occur, since the consonant carries an implied vowel. They are pretty straightforward.
កក : kɑɑk = frozen
ដង : ɗɑɑŋ = handle
សម : sɑɑm = fork
បង : ɓɑɑŋ = older sibling
រក : rɔɔk = to look for
មក : mɔɔk = to come
យក : jɔɔk = to take
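The words above can be read mechanically from the consonant chart: take the full name of the first letter, then add the second letter without its vowel. Here is a minimal sketch of that idea. It assumes two-letter words only, and the names NAME_OF and read_bare_word are made up. Note that the chart spells j as y, so យក would come out as yɔɔk here.

```python
# Letters and names copied from the consonant chart in "The Khmer Script".
LETTERS = (
    "ក ខ គ ឃ ង ច ឆ ជ ឈ ញ ដ ឋ ឌ ឍ ណ ត ថ ទ ធ ន ប ផ ព ភ ម យ រ ល វ ស ហ ឡ អ"
).split()
NAMES = (
    "kɑɑ kʰɑɑ kɔɔ kʰɔɔ ŋɔɔ cɑɑ cʰɑɑ cɔɔ cʰɔɔ ɲɔɔ ɗɑɑ tʰɑɑ ɗɔɔ tʰɔɔ nɑɑ "
    "tɑɑ tʰɑɑ tɔɔ tʰɔɔ nɔɔ ɓɑɑ pʰɑɑ pɔɔ pʰɔɔ mɔɔ yɔɔ rɔɔ lɔɔ vɔɔ "
    "sɑɑ hɑɑ lɑɑ ʔɑɑ"
).split()
NAME_OF = dict(zip(LETTERS, NAMES))

# Finals this sketch accepts (y and v are the chart's spellings of j and ʋ).
FINALS = set("p t c k ʔ m n ɲ ŋ l h y v".split())
DEVOICED = {"ɓ": "p", "ɗ": "t"}


def read_bare_word(word: str) -> str:
    """Read a word written as exactly two consonants and nothing else."""
    if len(word) != 2:
        raise ValueError("this sketch only handles two-letter words")
    first, last = word
    onset = NAME_OF[last][:-2]          # drop the two-letter inherent vowel
    final = onset.replace("ʰ", "")      # no aspiration at the end of a syllable
    final = DEVOICED.get(final, final)  # ɓ and ɗ are devoiced at the end
    if final not in FINALS:
        raise ValueError(f"final {final!r} is not covered by this sketch")
    return NAME_OF[first] + final


for word in ["កក", "ដង", "សម", "រក"]:
    print(word, read_bare_word(word))
```

Letters like ស or រ at the end of a word are rejected on purpose, since they are not in the list of possible finals and need rules this sketch does not know.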
Words with vowel diacritics are just as straightforward: the sound of the vowel depends on the consonant's series.
តា : taː = grandfather
ទា : tiə = duck
បី : ɓəj = three
ពីរ : piː = two
ហូប : hoop = eat
រូប : ruup = image
You can figure out the rest.
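The six examples above boil down to a two-column lookup, one value per series. The sketch below only contains the three diacritics that appear in those examples, with the values exactly as transcribed there; the names VOWEL_VALUES and vowel_value are made up.

```python
# (first-series value, second-series value), taken only from the six
# example words above. All other diacritics are left out on purpose.
VOWEL_VALUES = {
    "\u17b6": ("aː", "iə"),  # ា as in តា and ទា
    "\u17b8": ("əj", "iː"),  # ី as in បី and ពីរ
    "\u17bc": ("oo", "uu"),  # ូ as in ហូប and រូប
}


def vowel_value(diacritic: str, series: int) -> str:
    """Return the sound of a vowel diacritic after a consonant of the given series."""
    if series not in (1, 2):
        raise ValueError("series must be 1 or 2")
    return VOWEL_VALUES[diacritic][series - 1]


# Same diacritic, two different sounds depending on the series.
print(vowel_value("\u17b6", 1), vowel_value("\u17b6", 2))
```

Filling in the rest of the table from the two pronunciation charts works the same way.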
The only irregular one is អាំ, which has two different sounds when paired with the second series, depending on the final consonant. It hardly ever occurs in a closed syllable, but when it does, it's only with ŋ. Note that any vowel with ំ is nasalized, so there is an unwritten 'm' at the end of the syllable when it's open.
Where the ending is ŋ
ទាំង : tɛəŋ = involving
When the ending is something other than ŋ
រាំ : rɔəm = to dance
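Written as a decision rule, the two cases above look like this. It is a sketch for the second series only; the function name is made up, and the onset is passed in already romanized.

```python
from typing import Optional


def read_aam_second_series(onset: str, final: Optional[str] = None) -> str:
    """Read a second-series consonant carrying ាំ, with an optional written final."""
    if final is None:
        # Open syllable: the unwritten m closes it.
        return onset + "ɔəm"
    if final == "ŋ":
        return onset + "ɛəŋ"
    # The text above only describes ŋ as a possible written final.
    raise ValueError(f"no rule given for final {final!r}")


print(read_aam_second_series("r"))       # the pattern of រាំ
print(read_aam_second_series("t", "ŋ"))  # the pattern of ទាំង
```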
We need to distinguish diacritics from vowel diacritics. Vowel diacritics change the vowel itself; diacritics change the quality of the vowel or serve some other function that doesn't involve changing the vowel. Some diacritics have lost their distinction as diacritics and are treated as vowel diacritics; they are not included here.
The most important one you'll ever encounter is the 'bandak', which means to take away. It shortens the inherent vowel of a consonant and is written over the second consonant. The sound values depend on whether it occurs with an 'a' or an 'o' consonant, and on which consonant ends the syllable. The Samyaok Sannha does the same thing, but its usage is mostly restricted to Sanskrit and Pali words, and even then it's not mandatory. The 'bandak' can also occur when ា is attached to the consonant.
The first series is pretty simple: it shortens the vowels regularly regardless of the final consonant.
ចង់ : cɑŋ = want
បង់ : ɓɑŋ = wish
សក់ : sɑk = hair
ចាក់ : cak = pierce
ចាប់ : cap = catch
Second series with m or p as the last consonant
ឈប់ : cʰup = stop
គ្រប់ : krup = enough
Second series with anything other than m or p as the last consonant.
គាត់ : koət = he/she (polite)
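The bandak examples above can be summed up as a small function. This is a sketch that only encodes the cases shown in those examples and returns None for anything else, rather than guessing. The name bandak_vowel and its parameters are made up: series is 1 or 2, has_aa says whether ា is written, and final is the romanized final consonant.

```python
from typing import Optional


def bandak_vowel(series: int, has_aa: bool, final: str) -> Optional[str]:
    """Return the shortened vowel under a bandak, or None if no example covers it."""
    if series == 1:
        # First series: regular shortening, whatever the final consonant is.
        return "a" if has_aa else "ɑ"
    if series == 2:
        if not has_aa and final in ("m", "p"):
            return "u"   # as in ឈប់ and គ្រប់
        if has_aa and final not in ("m", "p"):
            return "oə"  # as in គាត់
    return None  # not covered by the examples above


print(bandak_vowel(1, False, "ŋ"))  # the vowel of ចង់
print(bandak_vowel(2, False, "p"))  # the vowel of ឈប់
```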
One of the hardest aspects of reading Khmer is the concept of vowel governance. Consonants are divided into two classes, submissive and non-submissive. Submissive consonants lose their influence over the vowel diacritics if they are paired with non-submissive consonants. It's not quite that easy, unfortunately: sound values also depend on the etymology of the word and its length.
Submissive consonants consist of all sonorants, which include m, n, y, r, l, v. Sometimes h is submissive. Discouraged? Don't be, the rules are fairly consistent depending on the length of the word. Let's start.
ង ញ ណ ន យ រ ល វ
One syllable word with a consonant cluster initial.
In a one syllable word with a consonant cluster initial, the second consonant determines the value of the vowel. If the second consonant is submissive, it loses its dominance and the value of the vowel depends on the first consonant. An 'h' is never submissive in such words.
ត្រី : trəj = fish
ស្រាប់ : srap = for
ប្លែក : plaek = different
ម្ហូប : mhoop = food
ខ្វិន : kʰvən = crippled
ល្បង : lɓɑɑŋ = trial
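For one-syllable words the rule above is easy to write down: look at the second consonant of the cluster, and fall back to the first one only if the second is submissive. The sketch below does exactly that. The name governing_consonant is made up, and ម is added to the submissive set as an assumption, because the prose names m even though the list of letters leaves it out.

```python
COENG = "\u17d2"  # the Unicode sign that turns the next consonant into a foot

# Submissive letters as listed above. The prose also names m, so ម is
# added here as an assumption.
SUBMISSIVE = set("ង ញ ណ ន យ រ ល វ".split()) | {"ម"}


def governing_consonant(word: str) -> str:
    """Return the consonant whose series decides the vowel of a one-syllable word."""
    if len(word) < 3 or word[1] != COENG:
        raise ValueError("expected a word that starts with a two-consonant cluster")
    first, second = word[0], word[2]
    # A submissive second consonant hands control back to the first one.
    return first if second in SUBMISSIVE else second


for word in ["ត្រី", "ម្ហូប", "ខ្វិន", "ល្បង"]:
    print(word, governing_consonant(word))
```

Once you know the governing consonant, its series tells you which column of the vowel chart to read.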
Two syllable words
In words with two syllables, a submissive consonant loses its influence if it occurs in the second syllable following a non-submissive consonant. An 'h' is submissive if it's the second consonant to any non-submissive consonant or a 'v'. You will run into some irregularities.
វិហារ : vɨhiə = temple
ទាហាន : tiəhiən = soldier
កន្លែង : kɑnlaeŋ = place
របស់ : rɔɔbɑh = thing
អាវុធ : ʔɑɑvut = weapon
Polysyllables are less consistent. A submissive consonant is influenced by the closest non-submissive consonant that it follows. If the first consonant is a ប or especially an អ, it often has no influence over the submissive consonants.
Predictability is high, but some words should just be learned. Sanskrit and Pali words have their own sets of rules, which I will get into later.