Pronunciator
The Moby Pronunciator II contains 177,267 words with corresponding pronunciations. The Project Gutenberg distribution also contains a copy of cmudict v0.3. Each entry in the file follows the format word[/part-of-speech] pronunciation. The part-of-speech field disambiguates 770 words whose pronunciation depends on their part of speech. For example, for the words spelled close, the verb is pronounced /ˈkloʊz/, whereas the adjective is pronounced /ˈkloʊs/. The parts of speech are assigned the following codes:
| Part-of-speech | Code |
|---|---|
| Noun | n |
| Verb | v |
| Adjective | aj |
| Adverb | av |
| Interjection | interj |
The pronunciation follows the word and its optional part-of-speech field. It uses several special symbols:
| Symbol | Meaning |
|---|---|
| / | Used to separate phonemes |
| _ | Used to separate words |
| ' | Primary stress on the following syllable |
| , | Secondary stress on the following syllable |
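As a minimal sketch of reading one entry, the Python function below splits a line into a word, an optional part-of-speech code, and the raw pronunciation string. It assumes that a single space separates the word field from the pronunciation and that the code, when present, is attached to the word with a slash. The names `Entry`, `parse_entry`, and `POS_NAMES`, as well as the sample lines, are hypothetical illustrations built from the tables above, not lines copied from the distribution.

```python
from typing import NamedTuple, Optional

# Part-of-speech codes from the table above, mapped to readable names.
POS_NAMES = {
    "n": "noun",
    "v": "verb",
    "aj": "adjective",
    "av": "adverb",
    "interj": "interjection",
}


class Entry(NamedTuple):
    word: str
    pos: Optional[str]   # raw code such as "v", or None when absent
    pronunciation: str   # still in the file's ASCII notation


def parse_entry(line: str) -> Entry:
    """Split one entry line into word, part-of-speech code, and pronunciation."""
    # Assumption: one space separates the word field from the pronunciation.
    # A line without a space raises ValueError here.
    head, pronunciation = line.rstrip("\n").split(" ", 1)
    # Assumption: an optional part-of-speech code follows the word after "/".
    word, slash, code = head.partition("/")
    return Entry(word, code if slash else None, pronunciation)


# Made-up sample lines for illustration only.
verb = parse_entry("close/v 'kl/oU/z")
plain = parse_entry("cat 'k/&/t")
print(verb.word, POS_NAMES.get(verb.pos), verb.pronunciation)
print(plain.word, plain.pos, plain.pronunciation)
```

The function keeps the raw code rather than the readable name, so an unexpected code in the data is preserved instead of being silently dropped.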
Apart from the special symbols listed above, the remaining symbols represent IPA characters, according to the following table:
| Symbol | IPA |
|---|---|
| & | æ |
| - | ə |
| @ | ʌ, ə |
| @r | ɜr, ər |
| A | ɑː |
| aI | aɪ |
| Ar | ɑr |
| AU | aʊ |
| b | b |
| d | d |
| D | ð |
| dZ | dʒ |
| E | ɛ |
| eI | eɪ |
| f | f |
| g | ɡ |
| h | h |
| hw | hw |
| i | iː |
| I | ɪ |
| j | j |
| k | k |
| l | l |
| m | m |
| n | n |
| N | ŋ |
| O | ɔː |
| Oi | ɔɪ |
| oU | oʊ |
| p | p |
| r | r |
| s | s |
| S | ʃ |
| t | t |
| T | θ |
| tS | tʃ |
| u | uː |
| U | ʊ |
| v | v |
| w | w |
| z | z |
| Z | ʒ |
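The two symbol tables can be combined into a small converter. The Python sketch below is one possible approach, not the project's own tooling: it drops the `/` separators, turns `_` into a space, maps the stress marks to IPA stress marks, and matches two-character symbols before one-character symbols. Where the table lists two IPA values (`@` and `@r`), the sketch arbitrarily picks the unstressed one (ə, ər), which is an assumption that will be wrong for some words. The names `MOBY_TO_IPA` and `moby_to_ipa` and the sample strings are hypothetical.

```python
# Symbol table transcribed from the tables above.
MOBY_TO_IPA = {
    "'": "ˈ", ",": "ˌ", "_": " ",
    "&": "æ", "-": "ə",
    "@": "ə",    # table lists ʌ or ə; ə chosen here as an assumption
    "@r": "ər",  # table lists ɜr or ər; ər chosen here as an assumption
    "A": "ɑː", "aI": "aɪ", "Ar": "ɑr", "AU": "aʊ",
    "b": "b", "d": "d", "D": "ð", "dZ": "dʒ",
    "E": "ɛ", "eI": "eɪ", "f": "f", "g": "ɡ",
    "h": "h", "hw": "hw", "i": "iː", "I": "ɪ",
    "j": "j", "k": "k", "l": "l", "m": "m",
    "n": "n", "N": "ŋ", "O": "ɔː", "Oi": "ɔɪ",
    "oU": "oʊ", "p": "p", "r": "r", "s": "s",
    "S": "ʃ", "t": "t", "T": "θ", "tS": "tʃ",
    "u": "uː", "U": "ʊ", "v": "v", "w": "w",
    "z": "z", "Z": "ʒ",
}


def moby_to_ipa(pron: str) -> str:
    """Convert a pronunciation in the file's ASCII notation to an IPA string."""
    out = []
    i = 0
    while i < len(pron):
        if pron[i] == "/":  # phoneme separator carries no sound
            i += 1
            continue
        # Try the longest symbol first so "oU" is not read as "o" + "U".
        for length in (2, 1):
            chunk = pron[i:i + length]
            if chunk in MOBY_TO_IPA:
                out.append(MOBY_TO_IPA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"unknown symbol {pron[i]!r} at position {i}")
    return "".join(out)


# Made-up sample strings built from the tables above.
print(moby_to_ipa("'kl/oU/z"))  # ˈkloʊz
print(moby_to_ipa("'kl/oU/s"))  # ˈkloʊs
```

Longest-match reading is a design choice: adjacent symbols that are not separated by `/` and happen to spell a two-character symbol (for example `t` followed by `S`) would be read as the digraph.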