Phonotactic constraints

Tom eats nuts    ten
Tom nuts eats    tne
nuts eats Tom    net
nuts Tom eats    nte
eats Tom nuts    etn
eats nuts Tom    ent

Just as some combinations of words are possible sentences, while others are not, some combinations of sounds are possible words, others are not. Three words or three sounds can combine in six different ways as shown here. Possible combinations are shown in green, impossible ones in pink. Here we are only concerned with the column on the right. Native speakers of English know not only that there does not happen to be a word tne, but also that there could not be such a word in English, since plosive+nasal clusters do not occur at the beginning of any word in this language. This would violate the phonotactic constraints of English.
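The six orderings, and the plosive+nasal ban that rules out tne, can be sketched in a few lines of Python. This is only an illustration: the function names and the small sound inventories are assumptions made for the example, and only this one constraint is checked, so forms that are impossible for other reasons still pass.

```python
from itertools import permutations

# Assumed toy inventories (one character per sound), for illustration only.
PLOSIVES = set("ptkbdg")
NASALS = set("mnŋ")

def orderings(sounds):
    """Return every ordering of the given sounds as a string."""
    return ["".join(p) for p in permutations(sounds)]

def starts_with_plosive_nasal(form):
    """True if the form begins with a plosive+nasal cluster."""
    return len(form) >= 2 and form[0] in PLOSIVES and form[1] in NASALS

forms = orderings("ten")
for form in forms:
    verdict = "ruled out" if starts_with_plosive_nasal(form) else "passes this check"
    print(form, "-", verdict)
```

Of the six forms, only tne is flagged by this particular check.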

Phonotactic constraints define which sound sequences are possible and which are not in a given language. These constraints are based on an examination of which sequences occur and which do not occur in that language. This scheme works as long as patterns either never occur at all or, if they do occur, are very common. The problem is that some patterns occur, but are very rare, and it is not immediately clear whether we should take them to be allowed or disallowed by our phonotactic constraints. As an example, think of ʃw in English: this sound sequence occurs in compound words (like dishwasher dɪ́ʃwɔʃə), some names (like Gershwin gə́ːʃwɪn, Schwarzenegger ʃwóːtsənɛgə, or Schweppes ʃwɛ́ps) and a single one-morpheme common noun, schwa ʃwɑ́ː, which is a linguistic technical term. This is significant because neither morphologically complex words nor names are very relevant in identifying phonotactic constraints, so there remains a single “real” representative of this cluster, the realness of which is jeopardized by its being unknown to very many speakers of English.

If we construct phonotactic constraints that allow ʃw, we will be faced with the question of why there are so few instances of this cluster in English. The common practice is to not allow ʃw, and list it as an exception. The reason for doing so is our desire to formulate phonotactic constraints in terms of natural classes: we will see that these constraints involve not random sets of sounds, but groups that belong together because of their inherent properties (like place or manner of articulation or voicing).

Because phonotactic constraints are formulated in broad terms, they cannot predict exactly which sound strings are available in a language and which are not. There will be items sticking out in both directions. The following table shows this.

                 exists     does not exist
possible         brick      blick
not possible     schwa      bnick


Two of the cells contain the obvious cases: words that could and do exist, like brick, and words that could not and do not exist, like bnick. The latter type is referred to as a systematic gap: this item is ruled out by the system, which disallows any plosive+nasal cluster at the beginning of a word. The other two cells are both deviant. Blick is a commonly cited word that could exist, since words can begin with bl and they can end in ɪk, but this one just happens not to exist. This is an accidental gap. (In fact, blick does exist in dictionaries, but it is unknown to most speakers, for whom, then, it does not exist.) And we have already discussed the case of schwa, which is ruled out by phonotactic constraints, and yet is part of the vocabulary of a small subset of native speakers of English. (Again, for those speakers who do not know this word, it does not exist.)
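The four-way distinction between existing words, accidental gaps, systematic gaps, and exceptions can be sketched as a small classifier. This is a minimal illustration under stated assumptions: the well-formedness check models only two word-initial bans (plosive+nasal and ʃw), the mini-lexicon is hypothetical, and the transcriptions omit stress marks.

```python
# Assumed toy inventories (one character per sound), for illustration only.
PLOSIVES = set("ptkbdg")
NASALS = set("mnŋ")

def is_possible(form):
    """Toy well-formedness check covering only two word-initial bans."""
    plosive_nasal = len(form) >= 2 and form[0] in PLOSIVES and form[1] in NASALS
    return not (plosive_nasal or form.startswith("ʃw"))

def classify(form, lexicon):
    """Cross 'allowed by the constraints' with 'present in the lexicon'."""
    possible, attested = is_possible(form), form in lexicon
    if possible:
        return "existing word" if attested else "accidental gap"
    return "exception" if attested else "systematic gap"

lexicon = {"brɪk", "ʃwɑː"}  # hypothetical mini-lexicon: brick and schwa
for form in ["brɪk", "blɪk", "bnɪk", "ʃwɑː"]:
    print(form, "->", classify(form, lexicon))
```

For a speaker who does not know the word schwa, the lexicon would lack ʃwɑː and the form would come out as a systematic gap instead.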

constraints on phonotactic constraints

We have already touched upon two limitations on phonotactic constraints. One is that, at least in the case of English, phonotactic constraints work only within morphs. We have seen that ʃw is very rare within a morpheme, but there is no reason why this cluster should not occur across a morpheme boundary, where the first morph ends in ʃ and the following one begins with w (as in brushwood, dishwasher, freshwater, meshwork, etc). It is not only compound words that show this “freedom of combination”: even nonsyllabic suffixes create clusters that do not occur within any single morph, eg loves ləvz, smoothed smʉwðd, washed wɔʃt end in such clusters. (The status of ʃt is like that of ʃw: it is practically nonexistent within a morph.)

We have also seen that phonotactic constraints typically refer not to single segments, but to larger groups that belong together by some phonetic property, ie natural classes. Thus if we find that pw is not a possible initial cluster in English, it does not come as a surprise that bw and fw are also not possible initial clusters, since p, b, and f are all labial consonants. We will see this tendency in the constraints detailed below.

Another important property of phonotactic constraints is that they usually affect neighbouring segments. Segments that are not adjacent rarely affect each other. When they do (as in vowel harmony, for example), the analysis always involves some special machinery. We can also observe that interaction between two neighbouring consonants or two neighbouring vowels is much more common than that between a vowel and the following consonant. Constraints between a consonant and the following vowel are very unusual.

When gauging the possibility of adjacent sounds, we must also take word boundaries into consideration. Some consonants do not occur at the beginning of a word, others do not occur at the end of a word. That is, we will not only have constraints that ban the adjacency of two consonants, but also others that ban a consonant immediately before or after a word-boundary symbol.
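One way to picture constraints that mention a word boundary is to pad each word with a boundary symbol and then treat the boundary like any other neighbour. The sketch below does this for two bans discussed elsewhere in this text, the absence of word-initial ŋ and of word-final h. The names and the list of banned pairs are assumptions chosen for the example, not a full grammar.

```python
BOUNDARY = "#"

# Banned adjacent pairs; "#" stands for the word boundary.
# Only two bans are listed here, purely for illustration.
BANNED_PAIRS = {("#", "ŋ"), ("h", "#")}

def violations(segments):
    """Return the banned adjacent pairs found in a word, boundaries included."""
    padded = [BOUNDARY, *segments, BOUNDARY]
    return [pair for pair in zip(padded, padded[1:]) if pair in BANNED_PAIRS]

print(violations(["s", "ɪ", "ŋ"]))   # sing: ŋ is word final, which is fine
print(violations(["ŋ", "a", "t"]))   # hypothetical form with initial ŋ
print(violations(["a", "h"]))        # hypothetical form with final h
```

The same mechanism covers ordinary two-consonant bans: a pair such as ("t", "l") could be added to the set without changing the function.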

the beginning of the word

Any vowel may occur word initially, although ʉ and ʉw are rare in this position compared to other vowels (oodles, ooze, Uber, umlaut, Uzbek are a few examples).

Of the consonants, ŋ does not occur at the beginning of words at all. This fact fits in well with the restricted distribution of the velar nasal: there exist accents of English in which ŋ does not contrast with n at all, but even in accents where it does, it only occurs word finally (sing sɪŋ), before velar plosives (sink sɪŋk, finger fɪŋgə), and rarely before unstressed ə (gingham gɪŋəm). Word-initial ʒ is rare and ð only occurs in function words (eg this, thus, though).

Word-initial consonant clusters follow three patterns in English. Many of these clusters begin with s (marginally ʃ) with a plosive or a sonorant after them. It is not at all obvious if the plosive after s is “voiceless” or “voiced”. The more widespread (and probably mistaken) tradition is to take these to be “voiceless”, and this is what we will do here. Accordingly, we have the following clusters: sp (spot), st (stop), stʃ (stew stʃʉw in BrE, but stʉw in AmE), sk (Scot), sm (smock), sn (snot), sl (slot), sw (swat), ʃr (shred). Other than ʃr, clusters beginning with ʃ are rare: we have already discussed ʃw, and the others feature mostly in Yiddish loans (ʃm schmooze, ʃn schnapps, ʃp spiel, ʃl schlep). We can put these aside as “impossible”, but existing. Some speakers still have sj word initially (eg in suit sjʉwt), but most omit the j here (sʉwt).

Another group of two-member word-initial consonant clusters is composed of an obstruent (here symbolized by T) followed by a sonorant (here symbolized by R). The set of obstruents occurring in this type of cluster is limited to plosives (excluding the two affricates, tʃ dʒ), ie p t k b d g, and voiceless fricatives (excluding the two sibilants, s ʃ), ie f θ. The set of sonorants involved in these clusters excludes the nasals as well as j and h (j will be discussed below), so we are talking about l r w. The following chart shows all the possibilities, with an example for those that are allowed by phonotactic constraints.


There is a significant tendency that we can detect in these patterns: those clusters where the two members have the same place of articulation do not occur. We have concluded earlier that despite the minor phonetic differences in their place of articulation, the bilabial plosives (p b), the labiodental fricatives (f v), and the labiovelar glide (w) are all broadly labial. Our earlier conclusion is now corroborated: p b f w all have the “same” place of articulation, that is, they are homorganic. In the same way, the dental θ and the alveolar t d l are also homorganic, which explains why tl, dl, and θl do not occur. Since r does occur after each of these consonants, it seems that postalveolar/palatal consonants are not homorganic with dentals and alveolars, and, of course, they are not homorganic with labials and velars either. It is surprising then that gw appears to be missing: kw is a well-attested cluster and elsewhere we do not see any difference between the combinatorial possibilities of voiceless and voiced plosives. (In fact gw does occur, though rarely: guacamole, Guam, Gwen, etc. Similarly, we can find words with pw (pueblo) or bw (bwana), but we exclude these clusters, just as we have excluded ʃw.)
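The ban on homorganic TR clusters can be sketched by pairing every obstruent with every sonorant and discarding the pairs that share a broad place of articulation. The place labels below are assumptions that follow the grouping in the text (w counted as labial, θ grouped with the alveolars, r on its own), and the function name is made up for the example.

```python
# Broad place classes as grouped in the text; the labels are assumptions.
PLACE = {
    "p": "labial", "b": "labial", "f": "labial", "w": "labial",
    "t": "dental/alveolar", "d": "dental/alveolar",
    "θ": "dental/alveolar", "l": "dental/alveolar",
    "k": "velar", "g": "velar",
    "r": "postalveolar",
}
OBSTRUENTS = ["p", "t", "k", "b", "d", "g", "f", "θ"]
SONORANTS = ["l", "r", "w"]

def predicted_onsets():
    """Split all T+R combinations into allowed and excluded (homorganic) ones."""
    allowed, excluded = [], []
    for t in OBSTRUENTS:
        for r in SONORANTS:
            (excluded if PLACE[t] == PLACE[r] else allowed).append(t + r)
    return allowed, excluded

allowed, excluded = predicted_onsets()
print("allowed: ", " ".join(allowed))
print("excluded:", " ".join(excluded))
```

Note that this simple rule predicts gw to be allowed, which matches the observation that its rarity is unexpected.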

Although j is a sonorant, just like l r w, we have excluded clusters with it above, because j behaves differently from the other sonorants. Its occurrence after other consonants is much less restricted. Cj clusters are subject to the ban on homorganicity, but the first member of these clusters may not only be an obstruent, but also a sonorant. There is some variation in which Cj clusters are possible and which are not. British English is currently undergoing a change in this respect, so we give two possibilities in the following chart introducing these clusters. The variant in the “old” column is becoming less common, the one in the “new” column is becoming more common in British English. We can see that j is stable after labials and velars and that it is stably absent after palatals. Alveolar/dental+j clusters are currently being simplified in British English, either losing the j (Thule, suit, Zeus, Luke, new) or merging it with the preceding plosive into a palatal affricate (tube, dune).

Three-member consonant clusters also occur word initially. The first in these is always s, while the second and third member of these clusters are also possible as two-member clusters. That is, skr (screw) or smj (smew) are possible word-initial clusters, since kr and mj are also possible, but spw or stl are not, just as pw and tl are not possible at the beginning of a word either. The inference does not work in the other direction: tw (twig) and, for some speakers, nj (new) are possible two-member clusters, but their three-member versions with s, stw and snj are not for all.
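The relation between two-member and three-member clusters is a necessary condition, not a sufficient one. A minimal sketch, assuming a small sample of two-member clusters taken from the examples above (the set is not exhaustive and the function name is made up):

```python
# A sample of two-member clusters mentioned in the text; not exhaustive.
TWO_MEMBER = {"kr", "mj", "tw", "nj"}

def passes_necessary_condition(cluster, two_member=TWO_MEMBER):
    """True if the cluster is s followed by a possible two-member cluster.

    Passing this check does not guarantee that the cluster occurs.
    """
    return len(cluster) == 3 and cluster[0] == "s" and cluster[1:] in two_member

for cluster in ["skr", "smj", "spw", "stl", "stw"]:
    print(cluster, passes_necessary_condition(cluster))
```

Here spw and stl fail because pw and tl are not in the set, while stw passes the check even though it does not occur, which is exactly the one-way nature of the inference.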

the end of the word

We have already seen that the occurrence of vowels at the end of words is restricted. It is a defining property of checked vowels — short vowels apart from unstressed ə — that they do not occur word finally. We only find unstressed ə, long monophthongs (ie R vowels), and diphthongs (ie free vowels) here.

In nonrhotic accents, like CUBE, r is also absent word finally, as is h in most varieties of English. Whether j and w occur in this position depends on our analysis of diphthongs. If they are taken to be vowel clusters, then these two glides are also excluded from word-final position. An increasing number of CUBE speakers also lack word final l, but they will then have w instead: tell tɛl > tɛw. Nasal consonants and obstruents all occur word finally, though ʒ is not common: ram, ran, rang, wrap, rat, ratch, rack, crab, bad, badge, back, caff, math, bass, ash, have, with, has, beige.

The most common word-final two-consonant cluster in English consists of a nasal (N) followed by an obstruent (T). While word-initial TR and Cj clusters could not be homorganic, word-final NT clusters are obligatorily homorganic. The following chart contains the available combinations.

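                              labial      dental      alveolar     palatal           velar
voiceless plosive/affricate   hemp mp     -           bent nt      bench ntʃ         bank ŋk
voiced plosive/affricate      *mb         -           bend nd      sponge ndʒ        *ŋg
voiceless fricative           nymph mf    month nθ    pence ns     (avalanche nʃ)    -
voiced fricative              *mv         *nð         bronze nz    (mélange nʒ)      -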

There is no obstruent for the empty cells: English has no dental plosives or velar fricatives. The starred cells show that mb ŋg mv nð are impossible word finally. (Since we are used to seeing the spelled form of words, it may be surprising that jamb and jam are homonyms: dʒam.) The clusters nʃ and nʒ are also not common: we only have recent loans to exemplify them.

Nonhomorganic NT clusters do occur word finally, but they almost never belong to the same morpheme: trimmed trɪm#d, songs sɔŋ#z. (There are some apparently monomorphemic counterexamples, like James dʒɛjmz or Thames tɛmz.)
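The behaviour of morpheme-internal word-final NT clusters can be summarized as "homorganic, minus a short list of exceptions". The sketch below encodes this. The broad place labels are assumptions: in particular, n is treated as homorganic with dental, alveolar, and palatal obstruents alike, since it is the nasal that appears before all of them.

```python
# Broad place classes; the labels are assumptions made for this sketch.
NASAL_PLACE = {"m": "labial", "n": "coronal", "ŋ": "velar"}
OBSTRUENT_PLACE = {
    "p": "labial", "b": "labial", "f": "labial", "v": "labial",
    "t": "coronal", "d": "coronal", "tʃ": "coronal", "dʒ": "coronal",
    "θ": "coronal", "ð": "coronal", "s": "coronal", "z": "coronal",
    "ʃ": "coronal", "ʒ": "coronal",
    "k": "velar", "g": "velar",
}
# Homorganic clusters that are nevertheless impossible word finally.
BANNED = {"mb", "ŋg", "mv", "nð"}

def final_nt_allowed(nasal, obstruent):
    """True if nasal+obstruent is a possible morpheme-internal final cluster."""
    homorganic = NASAL_PLACE[nasal] == OBSTRUENT_PLACE[obstruent]
    return homorganic and (nasal + obstruent) not in BANNED

for nasal, obstruent in [("m", "p"), ("n", "tʃ"), ("m", "b"), ("m", "d"), ("ŋ", "z")]:
    print(nasal + obstruent, final_nt_allowed(nasal, obstruent))
```

The clusters md and ŋz are rejected here because the check ignores morpheme boundaries; they are exactly the kind that arise only when a suffix is added.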

The liquid l may also occur in the first position of a word-final consonant cluster. (r may not in a nonrhotic accent, like CUBE.) As opposed to NT clusters, the second member of an lC cluster may be a nasal, not only an obstruent. As already mentioned, there are more and more speakers of British English who replace l by w in this position, thus losing lC clusters. The next chart contains the possible word-final lC clusters.

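                              labial       dental      alveolar      palatal        velar
voiceless plosive/affricate   help lp      -           belt lt       belch ltʃ      bulk lk
voiced plosive/affricate      bulb lb      -           held ld       bulge ldʒ      (Glenelg lg)
voiceless fricative           shelf lf     filth lθ    else ls       (Welsh lʃ)     -
voiced fricative              twelve lv    *lð         Charles lz    *lʒ            -
nasal                         film lm      -           kiln ln       -              *lŋ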

The cluster lʃ provides hardly any example other than Welsh, hence the parentheses around it. Glenelg (a village in Scotland) is even more marginal; we should probably claim that lg# is not grammatical.

The last type of two-member word-final clusters is composed of two obstruents. These clusters are voiceless without exception and the fricative in them is typically s, with f in one cluster. The possibilities are shown in the next chart.

lisp sp    list st      risk sk    lapse ps    quartz ts    fix ks
lift ft    script pt    fact kt

We see that either the second member of a TT cluster is the alveolar t or s, or the first member is the alveolar s, or exceptionally t in ts, which is not very common within a morpheme. (It is common across a morpheme boundary: cats kats.) It is also noteworthy that TT clusters are available in both orders: sp and ps, sk and ks, st and ts. This is not found for other clusters.

If we take diphthongs to be vowel+consonant sequences, there are many types of CCC# clusters in English: eg kind jnd, post wst, etc. If, however, we analyse these as vowels, as in this course, there aren’t very many other three-member clusters left. The following occur: prompt prɔmpt, instinct ɪnstɪŋkt, glimpse glɪmps, lynx ŋks, and there are two clusters with one example each: sculpt skəlpt, mulct lkt.

constraints on sonorants

Just as h does not occur word finally, it also does not occur before a consonant in almost all current accents of English. In fact, even being before a vowel is not “enough” for h to “survive”: it is pronounced only before a stressed vowel (Manhattan manhátən) and at the beginning of a word (horizon hərɑ́jzən).

The distribution of r is well-known: in nonrhotic accents it occurs only before vowels (and sonorant consonants: barrel bárəl or bárl̩). There is no such restriction on the distribution of l: it occurs freely before vowels and consonants, as well as word finally.

The distribution of j and w is similar to that of r. However, if diphthongs were analysed as vowel+consonant sequences, these two glides would be available in any position in a word: word finally (boy boj, now naw) and before a consonant (voice vojs, crowd krawd).

We saw that the distribution of ŋ is limited: it occurs practically only before k, g, and word finally. The other two nasals also occur before nonhomorganic consonants (damsel dámzəl, convoy kɔ́nvoj) and vowels (map map, nap nap).

constraints between vowels and consonants

We mentioned earlier that phonotactic constraints between a vowel and a consonant are not as common as those between two consonants. We will now look at what there is.

The checked vowel ʉ itself is rather rare, and it happens never to occur before , θ, or ð. Word finally we do not find or ɛð either, but these gaps are probably due to the overall rarity of ð.

We find more systematic constraints on long vowels and consonants following them. Three of the R vowels, the broad ones, occur freely before a word-final consonant (heart hɑːt, horse hoːs, hurt həːt), but the other two, the smooth ones, are not common in this position, though they do occur in a few words (weird wɪːd, scarce skɛːs). The only consonant before which these two vowels are common is r (hero hɪːrəw, vary vɛːrɪj), but r, recall, does not occur word finally.

Broad vowels also occur before consonant clusters (past pɑːst, launch loːntʃ, excerpt ɛksəːpt), but smooth vowels never occur here, except in morphologically complex words (pierced pɪːst), which we ignore in setting up phonotactic patterns.

Diphthongs generally occur freely before consonants with the exception of ŋ and r. We have seen earlier that the narrow diphthongs, ɛj and əw do not occur before r. Diphthongs occur before consonant clusters of the TR type (April ɛjprəl), and are rather common before NT and lC clusters provided that both consonants are coronal (kind kɑjnd, faint fɛjnt, wound wʉwnd, colt kəwlt, hold həwld, waste wɛjst, most məwst, oust awst, etc), but not before ft, where the first, sp, where the second, or mp, where both consonants are noncoronal. The fact that ŋ does not occur after diphthongs makes it behave like a noncoronal consonant cluster, say, ŋg.
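The condition on word-final NT, lC, and TT clusters after a diphthong amounts to requiring that every consonant in the cluster be coronal. A minimal sketch follows; the coronal set and the function name are assumptions for the example, and clusters are written one character per sound.

```python
# Assumed set of coronal consonants, one character per sound.
CORONALS = set("tdsznlθðʃʒ")

def diphthong_may_precede(cluster):
    """True if every consonant in the word-final cluster is coronal."""
    return all(consonant in CORONALS for consonant in cluster)

for cluster in ["nd", "nt", "lt", "ld", "st", "ft", "sp", "mp"]:
    print(cluster, diphthong_may_precede(cluster))
```

The first five clusters pass, as in kind, faint, colt, hold, and waste. The last three fail because at least one member is noncoronal. TR clusters such as pr in April are a separate case and are not covered by this check.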

There are two rather unexpected further sets of constraints. The diphthong aw only occurs before coronal consonants: shout ʃawt, loud lawd, couch kawtʃ, gouge gawdʒ, house haws, arouse ərawz, south sawθ, Louth lawð, brown brawn, owl awl; but not before noncoronal consonants, ie there are no words with awp, awm, or awk. The distribution of oj is even narrower: it only occurs before alveolar consonants. We have exploit əksplojt, void vojd, voice vojs, noise nojz, coin kojn, coil kojl. Neither noncoronals (p b f v m k g ŋ), nor coronals that are not alveolar but dental (θ ð) or palatal (tʃ dʒ ʃ ʒ) are possible after oj.
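These two restrictions can be stated as membership tests on place classes. In the sketch below, the class names and the function are assumptions made for the illustration. Only aw and oj are modelled, and r is left out of the sets.

```python
# Place classes as used in the paragraph above.
ALVEOLAR = {"t", "d", "s", "z", "n", "l"}
DENTAL = {"θ", "ð"}
PALATAL = {"tʃ", "dʒ", "ʃ", "ʒ"}
CORONAL = ALVEOLAR | DENTAL | PALATAL

def may_follow(diphthong, consonant):
    """True if the consonant may follow the given diphthong."""
    if diphthong == "aw":
        return consonant in CORONAL    # any coronal will do
    if diphthong == "oj":
        return consonant in ALVEOLAR   # alveolars only
    raise ValueError("only aw and oj are modelled in this sketch")

print(may_follow("aw", "tʃ"))  # couch
print(may_follow("oj", "tʃ"))  # no such word
print(may_follow("aw", "p"))   # no such word
```

Since the alveolars are a subset of the coronals, every consonant that may follow oj may also follow aw, but not the other way round.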

There are no systematic constraints between a consonant and the following vowel in English, but we do find two strong tendencies after two types of consonant clusters. Cj clusters are typically followed by ʉw (cute kjʉwt), its pre-R version oː (cure kjoː), ʉ (accurate ákjʉrət), or ə (ákjərət). Cj occurs before other vowels only in loans like pinyin pɪnjɪn. Cw, on the other hand, occurs before any vowel (qualm kwɑːm, dwell dwɛl, twig twɪg, quad kwɔd, twirl twəːl), but not ʉw or ʉ. Note that without the initial consonant, jɪ and wʉ are possible: yid jɪd, wolf wʉlf.

language change

One aspect of language change is sound change. We know, for example, that the diphthong aw was earlier ow and even before that uw, so in Old English mouse was muws, which then became mows and later maws. The short ʊ has also changed in most words: OE hunt hʊnt became hənt. We also know that word-final noncoronal voiced plosives were lost when preceded by a nasal, so OE sɪŋg became sɪŋ and dʊmb became dəm. (As the b was lost, people became uncertain of the spelling, eg they introduced a b in the written form of thumb, which in fact never ended in b.) Such sound changes also affect phonotactic constraints: for example, they resulted in nd being the only word-final nasal+voiced plosive cluster possible, since both mb and ŋg were lost at the end of words.

The distribution of h has also become radically reduced: in Old English this consonant occurred word finally and before consonants (eg night was nɪht, sigh was sɪh, rough was rʊh); the spelling still indicates this by gh. It also disappeared in word-initial consonant clusters: OE hrɪŋg is now rɪŋ (ring), hlaːf is ləwf (loaf), and hwaːl is wɛjl (whale). It seems like the only cluster beginning with h that’s left in English is hj (eg in huge hjʉwdʒ), but in fact this cluster did not exist in Old English: together with all Cj clusters, it is a new development.

Besides Cj clusters, Old English also lacked the contrast between voiceless and voiced fricatives. So the homorganic fricative pairs f–v, θ–ð, and s–z were allophonic variants, the voiced ones occurring within a word if not adjacent to a voiceless plosive, the voiceless ones elsewhere. (The palatal ʃ and ʒ did not alternate in this way.) This ancient distribution is still reflected in modern English word pairs like five–fifty, bathe–bath, graze–grass, where the word-final -e’s represented a pronounced vowel in OE (preserved in the conservative spelling), so the voiced fricatives were not word final back then. With the loss of these word-final vowels the voiced fricatives were stranded and came to contrast with the voiceless fricatives that were word final earlier too, eg in belief–believe bəlɪjf–bəlɪjv, wreath–wreathe rɪjθ–rɪjð, or close kləws–kləwz.

loanword adaptation

Loanword adaptation often involves modification of the sound shape of words to fit the phonotactics of the receiving language. For example, English cannot have short vowels word finally (except for unstressed schwa). So any such vowel of donor languages will be repaired, typically by diphthongization. French café kafé was adapted as káfɛj, Italian spaghetti spagɛ́tti as spəgɛ́tɪj, putto pútto as pʉ́təw, Polynesian tabu as təbʉ́w, etc, but short a was lengthened, eg Spanish panamá is pánəmɑː. (The geminate (long) tt’s of the two Italian words are also simplified in English, which does not have such constructions.)
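The repair of word-final short vowels can be pictured as a simple lookup on the last sound of the donor form. The sketch below models only this one step, using the correspondences that appear in the examples above. It deliberately ignores stress shift, vowel reduction, and degemination, so its output is not the full English form. The function name and the unaccented input strings are assumptions for the illustration.

```python
# Final short vowel -> English repair, read off the examples in the text.
FINAL_REPAIR = {"e": "ɛj", "i": "ɪj", "o": "əw", "u": "ʉw", "a": "ɑː"}

def repair_final_vowel(form):
    """Replace a word-final short vowel with its diphthongized or long version."""
    if not form:
        return form
    return form[:-1] + FINAL_REPAIR.get(form[-1], form[-1])

for donor in ["kafe", "putto", "tabu", "panama"]:
    print(donor, "->", repair_final_vowel(donor))
```

Forms that already end in a consonant are returned unchanged, since nothing needs to be repaired.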

We have seen above that some voiced fricatives came to occur word finally because the vowel after them was lost. But f–v and s–z also contrast word initially in English, which is not due to vowel loss, but to French and Greek loanwords, respectively. So the large majority of words beginning with v come from French (eg very, which forms a minimal pair with ferry), and those beginning with z from Greek (eg zeal, which forms a minimal pair with seal). Similarly, all English words beginning with or ending in ʒ are loanwords from French or Russian.

We may conclude that loanwords are adapted to satisfy the phonotactic patterns of the host language, but they also may force changes, introducing new phonotactic patterns.
