In designing the phonology of your artificial language (henceforth AL), you have several choices:
1. Choose phonemes that you are most familiar with. 2. Choose phonemes that appeal to you aesthetically. 3. Choose phonemes to maximize your phonemic inventory. 4. Choose phonemes that most people in the world already know or can learn to pronounce easily. 5. Choose phonemes based on morphological, syntactic, semantic or other requirements.
Number 1 appears to be the choice in all self-claimed universal ALs. ALs that fall into this category are Esperanto, Glossa, Loglan/Lojban, Ido, Intal and many others. In all of these, the designers started with the sounds they were familiar with, eliminated one or two that they thought were not very common, and sometimes added one or two others to flesh things out.
Number 2 is the choice of people who design ALs just for the fun of it, or for personal use. People in this category are not likely to find this essay very interesting. :-(
Number 3 was the choice of the characters who designed the super language in Robert A. Heinlein's short story, Gulf. Aside from Heinlein's story, and to the best of my knowledge, no AL has ever been designed with a maximal phonemic inventory.
As for number 4, to my knowledge, no serious AL has ever been designed to make pronunciation as easy as possible for as many people as possible (contradictory claims notwithstanding). I believe that this is the case simply because most AL designers have very little knowledge of the phonologies of natural languages other than their own.
Number 5 covers any requirement that is not covered by numbers 1-4. In effect, it is a none-of-the-above choice. I'll have more to say on this one later.
For obvious reasons, there's not much more that I can say about numbers
1 and 2, since they are based on what is essentially personal
preference. However, a lot more can be said about the remaining
choices, which I will attempt to do later. First, though, I think it
would be a good idea to look at what's available to us; that is, how
large is the set of possible phonemes that we can choose from to create
the phonemic inventory of an AL?
[For those of you who are not using a browser that can display a PNG file, follow this link for a verbal description of the chart.]
The information in the chart was obtained from several sources. However, the most important source was "The World's Major Languages" edited by Bernard Comrie. (Incidentally, if I had known in advance how much time and effort would go into the making of this chart, I never would have started it. From hindsight, though, I think it was worth it. I hope you agree.)
Before wading into the thick of things, first let me explain what's wrong with the chart. When I first started putting it together, I had planned to include all of the phonemes for the listed languages. As I proceeded, however, I realized that to do so would require a VERY BIG CHART, much bigger than I could fit on a single page. So, as any other normal, sane, intelligent and good-looking person would do, I compromised. Basically, what I did was eliminate phonemic feature distinctions that were made by only a small minority of languages. Aspiration went first, then nasalization, then labialization, etc. The whole gory excision is explained in the footnote at the bottom of the chart. Even so, I was not entirely consistent, as any astute observer will surely notice (for example, /palatalized-n/ and /palatalized-l/ survived). To add insult to injury, I combined some phonemes that were articulated in slightly different positions and/or which sounded so similar that most people would not be able to tell them apart (boxes labelled "X or Y" fall into this category). Some of the vowel allocations may raise a few eyebrows (and perhaps a few hackles), but then who ever agrees about vowels? Finally, you'll note that most of the IPA symbols used do not reflect the latest version of the IPA standard. The reason is that my brain-damaged IPA font was missing a few of the newer symbols but has all of the older ones, and I decided to be consistent. Besides, almost all of my sources used the older symbols, and I suspect that most linguists continue to use the older symbols out of habit. Fortunately, only a few have been changed, and there is no chance of confusion.
Next, let me explain some things that are not so embarrassing. Chinese (Man) means Mandarin Chinese, Portuguese (Eu) means European Portuguese (as opposed to Brazilian or African dialects of Portuguese), Spanish (Cas) means Castilian Spanish, and Vietnamese (S) means South Vietnamese. For the last three consonants, the plus sign "+" indicates co-articulation.
In spite of the fact that professional phonologists may shudder at the result, I do feel that the chart is adequate for use in designing an artificial language.
The next step is to decide how to use the chart. This really depends
on whether you are trying to design an AL that is maximally concise,
maximally pronounceable or maximally "something-else".
Obviously, the objective here is to have as many phonemes as possible. Since I would never want to actually learn a language like this, I'm not going to create what I would consider an optimal maximum phonemic inventory - I'll leave that up to someone who is more biased in favor of this whole approach. Instead, I'll just mention a few things that may be of interest to such a designer.
There are two general ways of increasing the number of vowels: nasalization and tones. Nasalization is either on or off, and can only double the number of vowels. Tones, however, can multiply the number of vowels by the number of tones. Thus, if you have four ones, then you'll have four times as many vowels. If you do choose to use tones, keep them short. For example, if you decide to base your tonal system on Mandarin Chinese (which has four tones), make sure to shorten the low tone, since it almost doubles the length of the syllable. You can increase the number of tones to six (as in Vietnamese) or seven (as in Cantonese), but your language will become quite difficult to master.
As for consonants, you'll probably want to start with most of the unmarked phonemes in the chart. Next, you might try making phonemic distinctions using aspiration and velarization (also called emphasis or glottalization) as I mentioned above. I do not think that palatalization will be productive if you also have the phoneme /j/. (Although /palatalized-n/ is not the same as /nj/, they are too close for comfort. At any rate, I would not bother with palatalizing any consonants. I would make free use of /j/ instead.) Labialization and /w/ can be handled similarly. Some languages implode voiced stops but do not, to my knowledge, use implosion to make phonemic distinctions - give it a try! Ditto for retroflexing. You might even want to create some new co-articulated consonants. Gemination doesn't really gain you anything (this also applies to vowels), since you're doubling the length of the phoneme. Finally, you may want to consider the various clicks and pops used in some south African languages, although I did not list them in the chart. Whatever you end up with, please don't try to teach it to me. :-)
With polysyllabic morphemes, you can increase your mileage by making
distinctions based on stress. For example, the meaning of the English
word "present" depends on whether the stress is on the first syllable
(=a gift) or on the second syllable (=to introduce). Orthographically,
stress can be indicated with a diacritic (i.e., accent mark),
capitalization, or by some other means. For example, "SIMba" could mean
"cheese", while "simBA" could mean "river".
Most AL designers choose phonemes that they are familiar with.
Unfortunately, they do so not realizing how difficult some of their
choices may be for people of different linguistic backgrounds. So, the
goal here is to select a phonemic inventory that is easy to pronounce or
easy to learn for everyone. The tough part is deciding what the word
"easy" means. In the following paragraphs, I will discuss a few
approaches to the design of a phoneme inventory that is "easy" for as
many of the world's inhabitants as possible.
Letters written between slashes represent phonemes (minimal units of
sound). However, it is possible for a phoneme to have more than one
pronunciation. For example, the phoneme /s/ in the English words
cats and dogs is pronounced like "s" in cats
but like "z" in dogs. This sound difference is indicated by
enclosing the actual sound between square brackets. Thus, for example,
the /s/ in cats is pronounced [s], while the /s/ in
dogs is pronounced [z]. Phonemes with alternate pronunciations
are called allophones.
Letters written between slashes represent phonemes (minimal units of sound). However, it is possible for a phoneme to have more than one pronunciation. For example, the phoneme /s/ in the English words cats and dogs is pronounced like "s" in cats but like "z" in dogs. This sound difference is indicated by enclosing the actual sound between square brackets. Thus, for example, the /s/ in cats is pronounced [s], while the /s/ in dogs is pronounced [z]. Phonemes with alternate pronunciations are called allophones.
1. Easy for all - choose only those phonemes that appear in all 25 languages. Using this approach we can start with the following phonemes: /t/, /k/, /s/, /m/ and /u/, which appear in all 25 languages. Next, we can combine /i/ and /I/ into the single phoneme /i/, since they are very close. Similarly, /A/ and /a/ can be combined into /a/. The net result is 7 phonemes: /t/, /k/, /s/, /m/, /a/, /i/ and /u/. Thus, the number of morphemes with the form CVC will be 4x3x4 = 48. In a language with this phonemic inventory, words are likely to be quite long. However, as long as you don't use consonant clusters, your language will be extremely easy to pronounce no matter what a learner's native language is.
A possible extension to this approach would be to allow more than one pronunciation for a single phoneme, even if they are not similar. For example, you could use /r/ to represent the sounds [l], [r], [R], plus all the other rhotics. In other words, make them legal allophonic variants for /r/. Also, how about using /h/ to represent both [f] and [h]? Japanese does something very similar - /h/ represents the allophones [unvoiced bilabial fricative] and [h]. Finally, you can reduce the number of languages in the list from 25 to a smaller number, by eliminating those languages that you consider unimportant. For example, by eliminating Yoruba, you get /n/. By eliminating Japanese, you get /l/. And so forth. This process can also be used in the remaining "easy" approaches discussed below, and I will say no more about it.
2. Share the pain - start with "super easy" and add phonemes in a way that causes the learning burden to be shared equally. In other words, make it a requirement that no speakers will have to learn more than N new phonemes, where N is very small (1, 2 or 3). I played around with this for a while and came to the conclusion that you can add at least N and perhaps N+1 new consonants to the "super easy" inventory. If you look at all possibilities, this could be a time-consuming process, more suited to a computer. As far as I'm concerned, the gain is not worth the pain.
3. Majority wins - choose phonemes that appear in X% of the 25 languages. In other words, all we are doing here is drawing a line: people on one side of the line will already know all the phonemes, while people on the other side will have to learn one or more new ones. Note that if X = 100%, then this is equivalent to the "super easy" case. Here's a list of what you get as the percentage gets smaller and smaller (please note that I'm using /sh/ for English "sh" in "ship", and /ch/ for English "ch" in "chip"):
Percentage Consonants Vowels ================================================================== 100% t k s m u 96% t k s l j m n i u a 92% t d k s l j m n i u o a 88% p b t d k s l j m n i u e o a 84% p b t d k f s l j m n no change 80% no change no change 76% p b t d k g f s l j m n no change 72% p b t d k g f s sh l j m n r no change 68% p b t d k g f s z sh h ch l j m n r no change 64% p b t d k g f v s z sh h ch l j m n r no change
And so forth. Keep in mind that /r/ is an alveolar flap or trill - it
is not like the English or Chinese retroflexes, which are quite
different. However, if you let /r/ represent any of the rhotics [r],
[R], etc., then it will move to the 96% line. In the same vein, if you
let /h/ represent [h] or [x], then it will move to the 88% line.
As another example, if the semantics of your AL requires a three-way distinction and you don't want to add a syllable or create a difficult-to-pronounce consonant cluster, you could use the semivowels /j/ and /w/. Thus, you could make three-way distinctions such as /ka/, /kja/ and /kwa/.
And so on. Obviously, in situations like the above, you will have
decided that some requirements are more important than others. Thus, in
both of the above examples, ease of pronunciation was less important
than other requirements.