NÁ70::GKiss:AspirationOfStopsAfterFricativesInEnglish…

Dear Ádám, I would like to wish you a very happy birthday!

1 Introduction

This short squib investigates the aspiration of stops in current Standard Southern British English, with a special focus on stops occurring after voiceless fricatives. The paper presents the preliminary results of a pilot experiment in an area of English phonetics and phonology that seems to be relatively underinvestigated: the aspiration of stops after fricatives other than /s/. It is a commonplace in the literature that /s/ “inhibits” the aspiration of stops word-internally. But is this a peculiarity of /s/ only, or do fricatives in general display this neutralizing behaviour? Much of this paper is admittedly about phonetic substance; however, it hopes to show how a data-driven, substance-based approach can pave the way for an account of the system of the contrastive phonological form. (The author was sponsored by OTKA #104897.)

2 Aspiration of stops in English: an overview

The laryngeal contrast of the stop obstruents /p t k/ vs. /b d ɡ/ in English is usually expressed with the help of two phonetic features: aspiration and voicing. These features are not equally active in cueing the contrast in every position. Phonetic voicing (phonation), i.e., the vibration of the vocal folds, plays a more limited role; it is usually only present when the stops occur between vowels or sonorant consonants (e.g., rebel, elbow, rider, bandit, cargo, ugly, etc.). In all other environments, stops are typically unphonated. In these positions, stops with the same place of articulation are contrastive mainly due to the presence or absence of aspiration (e.g., pike–bike, try–dry, cool–ghoul, etc.) – hence the classification of English as an “aspirating” language (Jansen 2004). We will follow the traditional English phonetic and phonological literature in referring to the two contrastive obstruent classes as fortis (‘voiceless aspirated’) and lenis (‘voiceless unaspirated’). Since they are not the focus of this paper, we will disregard other potential cues of the laryngeal contrast of English stops, such as duration-related cues (e.g., “pre-fortis clipping”), differences in low spectral features (f0, F1) of surrounding vowels, or potential cues related to articulatory effort or intensity. Consequently, the terms fortis and lenis will be used here without any reference to articulatory effort or acoustic energy; we will use them solely to distinguish aspirated-voiceless from unaspirated-voiceless stops. On the dubious and problematic articulatory strength-based interpretations of these terms, see Ladefoged and Maddieson (1996) and Docherty (1992), among others.

Since the classic work by Lisker and Abramson (1964) on the typology of laryngeal contrast in word-initial stops, aspiration has been frequently defined with the help of the timing relation between the release of the stop and the onset of phonation of a following voiced sound (a vowel or one of the approximants /l r j w/), which is referred to as Voice Onset Time (VOT). English stops can be grouped into two VOT categories: (i) long-lag VOT (aspirated-voiceless, or fortis) vs. (ii) short-lag or zero VOT (unaspirated-voiceless, or lenis). The cutoff point between long-lag vs. short-lag/zero VOT is conventionally placed at 35 ms (Keating 1984). Thus, a stop is said to be aspirated when the time between its release and the phonation of the following sound is more than 35 ms.
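The categorization just described can be written down as a minimal Python sketch. The function and constant names are illustrative assumptions; only the 35 ms cutoff and the "more than 35 ms" criterion come from the text.

```python
# Conventional cutoff between short-lag/zero and long-lag VOT (Keating 1984), in ms.
VOT_CUTOFF_MS = 35.0


def vot_category(vot_ms: float, cutoff_ms: float = VOT_CUTOFF_MS) -> str:
    """Return 'long-lag' if the VOT is more than the cutoff, else 'short-lag/zero'."""
    return "long-lag" if vot_ms > cutoff_ms else "short-lag/zero"


def is_aspirated(vot_ms: float, cutoff_ms: float = VOT_CUTOFF_MS) -> bool:
    """A stop counts as aspirated (fortis) when its VOT is long-lag."""
    return vot_category(vot_ms, cutoff_ms) == "long-lag"


print(vot_category(60.0))  # a VOT well above the cutoff
print(vot_category(12.0))  # a VOT well below the cutoff
```

A VOT of exactly 35 ms falls into the short-lag category here, because the text requires the delay to be more than 35 ms.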

The VOT measurement usually includes the release noise. The release noise of a stop is acoustically different from the noise of aspiration, however. In its spectrum, a wide range of frequencies has high intensity (although the typical spectrum of the release noise varies according to the place of articulation of the given stop; Liberman, Delattre, and Cooper 1952), whereas the spectrum of aspiration resembles that of whispering or [h]. This is not surprising, as the delay of voicing is achieved by wide glottal opening, in which the “vocal folds are markedly further apart than they are in modally voiced sounds” (Ladefoged and Maddieson 1996, 70): phonation is inhibited if the vocal folds are widely abducted. [h] itself is acoustically characterizable as a whispered vowel, hence its spectrum (just like the spectrum of aspiration noise) often shows relatively weak noise mixed with formants. Aspiration may then be defined through timing relations (VOT) or through glottal aperture, but these are, in effect, two sides of the same coin. As we said above, if the aspiration noise is shorter than 35 ms, the stop is not considered to be aspirated, which is probably justified on perceptual grounds: the release noise and the aspiration noise are difficult to perceive as different acoustic cues, and so a short-lag VOT is acoustically just a longer release noise.

The well-known distribution of aspiration of fortis stops in standard British English can be summed up as (1) below (see, among many others, Nádasdy 2003; 2006):

Thus, the underlined stops in (2) are aspirated as they are word-initial (these and the examples below are from Nádasdy 2003, 20):

Note that here we treat word-initial stops followed by an unstressed vowel (políte, togéther, collápse etc.) just as aspirated as those followed by a stressed vowel. Thus, we will rely on the aspirated vs. unaspirated dichotomy here, and will not further subdivide the aspirated category into ‘strongly aspirated’ vs. ‘weakly aspirated’ (as do Nádasdy 2003 or Cruttenden 2014). Note also that according to some authors, including Nádasdy (2006, 59), there is no aspiration in word-initial fortis stops that are followed by an unstressed vowel. For a summary of how some of the well-known textbooks and dictionaries treat aspiration in English, see Gefferth (1997).

The underlined stops in (3) are also all aspirated as they are in the initial position of a stressed syllable:

However, the underlined stops in (4) are not aspirated because they are neither at the beginning of the word nor at the beginning of a stressed syllable:

Every textbook on English phonetics and phonology mentions one regular exception to the above:

Therefore, the following underlined stops are not aspirated even though they are word-initial or in a medial stressed syllable:

The stops standing after /s/ in (6) can only be voiceless and unaspirated (i.e., lenis); therefore, the contrast between fortis and lenis stops is neutralized here (see, for instance, Cruttenden 2014, 47). This means that the stop after /s/ in speak, for instance, is supposed to be phonetically the same as the b in beak. And so if English spelling were consistent, speak should be spelt as sbeak. According to Cruttenden (2014, 47, 164), “words beginning /sp-, st-, sk-/ are not contrasted with words beginning /sb-, sd-, sɡ-/, although a distinction sometimes occurs word-medially, as in disperse/disburse and discussed/disgust; […] the aspiration […] may be lost but nevertheless a distinction may remain […] based on strength of articulation alone”. However, as we noted above, the articulatory strength-based definition of the terms fortis and lenis is problematic; for example, it is unclear how exactly “articulatory strength” is manifested in the acoustic signal as a perceptual cue to signal laryngeal contrast. The issue of the phonetic qualities of post-/s/ stops in English is far from being settled. There is some experimental evidence favouring the hypothesis that these stops preserve some traces of fortisness, such as the low spectral features (f0 and F1) of the following vowels (Wingate 1982). However, most perception-based studies (especially those that did not involve the misleading effect of English orthography) have produced evidence supporting the hypothesis that these post-/s/ stops are perceived as lenis stops by native speakers. For a good summary of these issues, see González (2006).

Note that it is possible for stops to be aspirated after /s/, but when this is the case, it usually signals a word boundary (thus, aspiration can be thought of as one way to cue word boundaries for speakers). Aspiration after /s/ shows variation, though, which usually correlates with the token frequency of a given word (see, for example, Zuraw and Peperkamp 2015). All the stops following /s/ in these words are usually aspirated:

The well-known textbooks and dictionaries only mention the post-/s/ position as neutralizing, but the question arises whether this is a “specialty” of /s/ only, or whether there is perhaps a general incompatibility between aspiration and all fricatives, not just the alveolar sibilant. Finding an answer to this question is, however, not easy, as the only other morpheme-internal cluster of fricative + voiceless stop in English is /ft/, and even this cluster occurs in just a few words (search results are from cube.elte.hu):

There are no words beginning with /ft/. Most of the words have /ft/ in word-final position or followed by an unstressed syllable – in these we do not expect aspiration. The only words (marked in bold in (8)) that we can really use to test the presence vs. absence of post-/f/ aspiration are fifteen and the British English pronunciation of lieutenant (Gimson-IPA: /lefˈtenənt/), as they are the only words in which /ft/ is followed by a stressed vowel, so that the stop may potentially be aspirated. According to Wells (2000), the word caftan (alternative spelling kaftan) has the primary stress on the first syllable in standard British English (Gimson-IPA /ˈkæftæn/ or /ˈkæftɑːn/), but he gives the alternative transcription /kæfˈtæn/ for General American English (the other, supposedly mainstream, pronunciation in GA is /ˈkæftən/). Consequently, any investigation into aspiration after fricatives other than /s/ can only really use two test words: fifteen and lieutenant.

3 Pilot experiment: subject, material, method

The experiment presented in this paper was a pilot experiment; as such, its main purpose was to test the feasibility of future, larger-scale research on the aspiration of English stops after fricatives other than /s/. The experiment included only one subject, and the results must therefore be treated with caution: they cannot be used to estimate the true value of aspiration in the population of English speakers, because no inferential statistics can be calculated on the basis of a single speaker. Nonetheless, the results from this speaker can be used to describe tendencies and to indicate directions for future research in this area.

The subject of the experiment was a young male (in his early 20s), a university student, and a native speaker of what we may call contemporary standard Southern British English. He was not aware of the purpose of the study, was not paid for it, and agreed to the use of the material for phonetic research.

The pilot experiment discussed in this paper investigated the VOTs of the following sounds in the following positions (uppercase “V” stands for a stressed vowel, while lowercase “v” stands for an unstressed vowel):

The test words were placed in carrier sentences, whose lengths were kept relatively stable for the three main positions listed above. The vowels following the stops were also the same per position: /eɪ/, /aɪ/, and /iː/. The words with the word-initial target stops were all utterance-initial (9); all the other test words were in absolute-final position (10)–(11). The sentences were all declarative, with a similar intonation structure. All this was intended to ensure that sentence length, vowel quality, and sentence type did not influence the target variable of the experiment, the VOT.

The subject read the test sentences and additional filler sentences from a monitor screen in a randomized order, which was generated by SpeechRecorder (www.phonetik.uni-muenchen.de/Bas/software/speechrecorder/). Each test sentence was read five times, but the first reading was treated as a familiarization phase and was not taken into consideration. The recordings were made in a relatively noise-free room. Since only duration measurements were planned, it was not necessary to carry out the experiment in a special, sound-attenuated room. The microphone used was a Sony ECM-MS907, which was connected to a laptop through an M-Audio MobilePre USB preamplifier external sound card. The material was recorded at a 44,100 Hz sampling rate, mono. Again, since no spectral measurements were taken, the material was not resampled and/or filtered.

The experiment contained 20 test words, which were repeated 4 times, and so there were altogether 80 observation scores that could be used for analysis.
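The arithmetic behind the 80 observations can be checked with a short Python sketch: five readings per test word, of which the first is discarded as familiarization. The word list uses the test words reported in Table 1, with stress marks omitted; the function and variable names are illustrative assumptions.

```python
# Test words as reported in Table 1 (stress marks omitted for simplicity).
TEST_WORDS = [
    "paces", "basic", "table", "data", "cable", "Gable",
    "piper", "briber", "writer", "rider", "hiker", "tiger",
    "fourteen", "forty", "fifteen", "fifty",
    "sixteen", "sixty", "seventeen", "seventy",
]
READINGS_PER_WORD = 5          # each test sentence was read five times
FAMILIARIZATION_READINGS = 1   # the first reading was not analysed


def analysable_tokens(words, n_readings, n_discarded):
    """Return the (word, reading number) pairs that are kept for analysis."""
    return [
        (word, reading)
        for word in words
        for reading in range(1, n_readings + 1)
        if reading > n_discarded
    ]


tokens = analysable_tokens(TEST_WORDS, READINGS_PER_WORD, FAMILIARIZATION_READINGS)
print(len(TEST_WORDS), "words,", len(tokens), "analysable tokens")
```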

The segmentation of the test words and the VOT measurements were carried out in Praat, version 5.4.18 (Boersma and Weenink 2015). The VOT intervals were created manually with the help of the waveforms and spectrograms. The left boundary of each VOT interval was placed between the silent closure phase of the stop and the start of the release noise. The right boundary of each VOT interval was placed just before the point at which the first period of the following vowel’s periodic waveform and its formants became visible (Figure 1). A Praat script was used to automatically write out the VOT values in milliseconds into a data table. The descriptive statistical analyses of the VOT scores, including graphs (boxplots and bar charts), were carried out using R, version 3.3.1 (R Development Core Team 2008).
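As a rough illustration of the measurement step, the following Python sketch derives a VOT value from the two interval boundaries and computes the descriptive statistics reported later. It is not the Praat script or the R code used in the study; the function names and the boundary times are hypothetical, and the standard deviation is assumed to be the sample standard deviation (n - 1 denominator), which is what R's sd() computes.

```python
import statistics


def vot_ms(release_onset_s: float, voicing_onset_s: float) -> float:
    """VOT in milliseconds, from the left and right boundary times in seconds."""
    return (voicing_onset_s - release_onset_s) * 1000.0


def describe(vots_ms):
    """Mean, sample standard deviation (n - 1), and median of a list of VOTs."""
    return {
        "mean": statistics.mean(vots_ms),
        "sd": statistics.stdev(vots_ms),
        "median": statistics.median(vots_ms),
    }


# Hypothetical boundary pairs (seconds) for four readings of one test word.
# These are illustrative numbers, not measurements from the experiment.
boundaries = [(0.100, 0.150), (0.210, 0.264), (0.305, 0.363), (0.120, 0.174)]
vots = [vot_ms(left, right) for left, right in boundaries]
summary = describe(vots)
print({key: round(value, 2) for key, value in summary.items()})
```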

4 Results

Table 1 below summarizes the mean VOT values, the standard deviations, and the medians for each environment. Post-tonic /t/ was measured in two words (writer and forty), hence the environment abbreviations “Vtv” and “Vtv2”.

Environment	Test word	Mean VOT (ms)	St. dev. (ms)	Median (ms)
#pV	páces	54.25	4.03	54.0
#bV	básic	11.25	2.22	11.0
#tV	táble	75.00	14.88	72.0
#dV	dáta	11.25	2.75	11.5
#kV	cáble	55.75	6.18	53.0
#gV	Gáble	16.00	2.16	16.5
Vpv	píper	35.75	12.55	33.5
Vbv	bríber	14.75	8.18	14.5
Vtv	wríter	79.75	3.20	81.0
Vdv	ríder	11.25	5.56	11.0
Vkv	híker	55.25	6.45	54.5
Vgv	tíger	19.25	4.86	20.0
vtV	fourtéen	87.75	5.62	87.0
Vtv2	fórty	90.75	6.13	90.5
ftV	fiftéen	38.50	24.53	28.5
Vft	fífty	30.75	7.63	29.5
stV	sixtéen	17.75	6.65	17.5
stv	síxty	30.75	2.50	32.0
ntV	seventéen	85.50	14.01	84.0
vntv	séventy	85.50	14.80	85.5
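The values in Table 1 can be compared with the conventional 35 ms cutoff in a few lines of Python. The sketch below copies the means and medians from the table; the helper names are illustrative assumptions, and classifying an environment by a single summary value is a simplification that ignores token-level variation.

```python
VOT_CUTOFF_MS = 35.0  # conventional short-lag/long-lag boundary (Keating 1984)

# (environment, test word, mean VOT in ms, median VOT in ms), copied from Table 1.
TABLE_1 = [
    ("#pV", "páces", 54.25, 54.0), ("#bV", "básic", 11.25, 11.0),
    ("#tV", "táble", 75.00, 72.0), ("#dV", "dáta", 11.25, 11.5),
    ("#kV", "cáble", 55.75, 53.0), ("#gV", "Gáble", 16.00, 16.5),
    ("Vpv", "píper", 35.75, 33.5), ("Vbv", "bríber", 14.75, 14.5),
    ("Vtv", "wríter", 79.75, 81.0), ("Vdv", "ríder", 11.25, 11.0),
    ("Vkv", "híker", 55.25, 54.5), ("Vgv", "tíger", 19.25, 20.0),
    ("vtV", "fourtéen", 87.75, 87.0), ("Vtv2", "fórty", 90.75, 90.5),
    ("ftV", "fiftéen", 38.50, 28.5), ("Vft", "fífty", 30.75, 29.5),
    ("stV", "sixtéen", 17.75, 17.5), ("stv", "síxty", 30.75, 32.0),
    ("ntV", "seventéen", 85.50, 84.0), ("vntv", "séventy", 85.50, 85.5),
]


def aspirated(vot_ms, cutoff_ms=VOT_CUTOFF_MS):
    """True if the VOT value is more than the cutoff (long-lag)."""
    return vot_ms > cutoff_ms


def split_decisions(rows):
    """Environments whose mean and median fall on opposite sides of the cutoff."""
    return [env for env, _word, mean, median in rows
            if aspirated(mean) != aspirated(median)]


for env, word, mean, median in TABLE_1:
    label = "aspirated" if aspirated(mean) else "unaspirated"
    print(f"{env:5s} {word:10s} mean = {mean:6.2f} ms -> {label}")

print("mean and median disagree:", split_decisions(TABLE_1))  # ['Vpv', 'ftV']
```

The two environments in which the mean lies above the cutoff while the median lies below it are píper and fiftéen, the two cases for which the text reports variation between aspirated and unaspirated tokens.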

Figure 2 shows the range of VOT duration values for each environment (dots represent extreme values). The dashed lines separate the data into three environments: (i) word-initial before a stressed vowel (/p b t d k ɡ/), (ii) medial after a stressed vowel (/p b t d k ɡ/), and (iii) medial /t/ before or after a stressed vowel in four positions: intervocalic, after the voiceless fricatives /f/ and /s/, and after /n/. As discussed above, the conventional cutoff point between zero or short-lag VOT (no aspiration) and long-lag VOT (aspiration) is 35 ms (Keating 1984; Jansen 2004); this is shown by the horizontal blue line.

The bar charts below show the mean VOT durations for each environment (error bars represent the standard deviations from the means):

Based on the two figures, the word-initial fortis stops /p t k/ can be considered aspirated: all three have relatively long VOT scores (well above the 35-ms threshold). The lenis stops /b d ɡ/ are not aspirated, all having relatively short VOT durations.

In the medial, post-tonic position the subject articulated the fortis stops /t/ and /k/ with a VOT similar to that in word-initial position. The VOT of the labial fortis stop /p/ was shorter: values ranged between 24 ms and 52 ms (mean: 35.75 ms, standard deviation: 12.55 ms); thus, his aspiration of the second /p/ in píper showed some variation between unaspirated and aspirated tokens. Note that most descriptions of standard British English describe fortis stops in an unstressed syllable as unaspirated. The subject in this experiment was clearly aspirating the non-labial fortis stops in unstressed syllables, too. Figure 4 shows the spectrogram of one of the articulations of the test word híker, in which the medial /k/ has a long-lag VOT:

Let us turn now to medial /t/. In intervocalic position, both before and after a stressed vowel, /t/ had a long VOT. As expected, /t/ exhibited a short-lag VOT after /s/ (both before and after a stressed vowel); thus, we can characterize the stop here as unaspirated.

The other post-fricatival position showed variation, especially the ftV position (fifteen). Most scores had a relatively short-lag VOT (around or below 35 ms); this suggests that /t/ after /f/ has a realization very similar to that of post-/s/ /t/: it is unaspirated. However, in one realization of fifteen, the VOT of /t/ was rather long: 75 ms (this score is the extreme value in the boxplot above); thus, this /t/ was clearly aspirated. Figure 5 shows two realizations of fifteen: short-lag VOT/unaspirated and long-lag VOT/aspirated:

The post-nasal position (seventeen–seventy) behaved just like the intervocalic position: the /t/ had a long-lag VOT here, i.e., it was aspirated, both before the stressed and before the unstressed vowel.

One important note is in order here. The post-release realization of the voiceless portion before the onset of voicing of the following vowel showed acoustic differences between the fortis stops: in the case of /t/, the noise often indicated the presence of affrication, with an [s]-like spectrum rather than an [h]-like spectrum. This [s]-like frication noise was often relatively long, which is probably the reason for the VOT values of /t/ being relatively long as well. As Figures 2 and 3 above show, it was always /t/ that had the longest VOT in all the non-post-fricatival environments; /t/ clearly showed a positively skewed VOT. It is debatable whether this noise portion should be considered aspiration or part of a VOT at all, rather than an [s]-like frication release following stop closure, hence an affricate [ts]. In some cases even the stop closure phase seemed to be absent and only a long, [s]-like fricative was observable (Figure 6). More thorough acoustic research should uncover the spectral properties of the post-release/pre-voicing noise: if it turns out to be affrication, it should be treated differently from aspiration. The fact that the subject always articulated /t/ with a relatively long noisy portion regardless of the stressing of the following vowel suggests that we are perhaps dealing with a general affrication of /t/ here. It is noteworthy, however, that just like aspiration, this supposed general affrication is very short after /s/ (and to a certain extent, after /f/) – which indicates that aspiration and affrication behave similarly after fortis fricatives. Again, further acoustic experiments must uncover whether the release noise of /t/ vs. /p/ and /k/ has different spectral properties and length after /s/. Cruttenden (2014) mentions that “/t, d/ are especially liable to affrication and even replacement by the equivalent fricative in weakly accented situations” (Cruttenden 2014, 178).

5 Conclusion and outlook

The results of the pilot experiment presented in this paper have shown that a larger-scale experiment on post-fricatival aspiration of stops in English is a worthy endeavour to pursue. First of all, the results indicate that stops tend to be unaspirated after /f/, too, just like after /s/. The variation may be greater after /f/ than after /s/, however. A possible reason for this variation may be the very low number of words that contain (non-final) /ft/: unlike in the case of /s/+stop clusters, there is no established pattern for speakers to follow in words with /ft/, and so there is no clear categorization (aspirated vs. unaspirated) of the stop here, although the results of this pilot experiment indicate that deaspiration is perhaps the preferred choice after /f/, too.

Secondly, the question still remains why it is /s/ after which aspiration is generally not allowed, so that stops in this position must be lenis, with a short-lag VOT. If a larger-scale experiment shows that this also includes /f/, then the question may be rephrased like this: what makes fricatives and aspiration incompatible with each other? Most explanations of the distribution of aspiration have relied on syllable structure (see our own definition in (1)). This paper does not wish to evaluate those approaches in detail; we just mention one problem with the syllabic approach. The syllabic explanation claims that stops are not aspirated after /s/ because they are in fact not word- or syllable-initial: it is /s/ which is initial; in other words, /s/ and the stop are tautosyllabic: port vs. sport, en.cóu.rage vs. di.scóver (syllable boundaries are indicated by dots). This syllabification – even though it violates the sonority sequencing principle: /s/ is more sonorous than the stop – may be supported by the fact that /s/ + stop clusters do occur word-initially, too. However, the syllabification fi.ftéen is difficult to back on these grounds as there are no words beginning with /ft/. Some syllabic analyses argue that word-initial /s/+C clusters are actually heterosyllabic, with /s/ being in the coda preceded by an unrealized nucleus (and onset) in a “degenerate” syllable (for evidence on this, see Kaye 1992, Harris 1994, 54–63, Gussmann 2002, 107–117). This is of course problematic for the syllabic analysis of aspiration, as post-/s/ stops would then be syllable-initial, and should therefore be aspirated.

Non-syllabic phonological models, such as phonetically grounded approaches (especially those relying on the perception of contrast), exemplar models, and analogical models, seem to fare better at explaining why aspiration and fricatives are incompatible with each other. A perception-based approach that seems to be worth pursuing hypothesizes that the turbulent noise that acoustically characterizes fricatives, especially sibilants like [s], is similar to that which appears during aspiration, and consequently, as Silverman (2006) writes, “it is probably not so easy to reliably distinguish [st] from [stʰ] in running speech, and languages may tend to eliminate this contrast should it arise, especially within a word, rather than between words” (Silverman 2006, 176–177). The finding of the present experiment that aspirated /t/ is often affricated as [ts] and even replaced by [s] supports the idea that fricative noise, especially that of sibilants, is similar to aspiration noise.

Some of the future research questions that are worth pursuing within these “usage-based” frameworks are the following. Why is it the sibilant fricatives that seem to be most incompatible perceptually with aspiration (for some preliminary answers in a perception model, see Wright 2004)? How do token and type frequency influence the occurrence of aspiration after fricatives? Why does aspiration evolve into affrication, and why does it occur mostly in unaccented syllables? Is there aspiration after /s/ (and the other fricatives) across a word boundary? Is there variation in aspiration in this position, too? The sheer number of unresolved questions shows that this area of English phonetics and phonology is worth investigating further.

References

Boersma, Paul and David Weenink. 2015. “Praat: Doing Phonetics by Computer.” www.praat.org/.

Cruttenden, Alan. 2014. Gimson’s Pronunciation of English (8th Edition). London & New York: Routledge.

Docherty, Gerard J. 1992. The Timing of Voicing in British English Obstruents. Berlin & New York: Foris.

Gefferth, Katalin. 1997. “The Distribution of Aspiration in English.” The Odd Yearbook 4: 3–11.

González, José Antonio Monpeán. 2006. “The Phonological Status of English Oral Stops After Tautosyllabic /s/: Evidence from Speakers’ Classificatory Behaviour.” Language Design: Journal of Theoretical and Experimental Linguistics 8: 69–101.

Gussmann, Edmund. 2002. Phonology: Theory and Analysis. Cambridge: Cambridge University Press.

Harris, John. 1994. English Sound Structure. Oxford & Cambridge, MA: Blackwell.

Jansen, Wouter. 2004. “Laryngeal Contrast and Phonetic Voicing: A Laboratory Phonology Approach to English, Hungarian, and Dutch.” Doctoral dissertation, Rijksuniversiteit Groningen.

Kaye, Jonathan D. 1992. “Do You Believe in Magic? The Story of s+C Sequences.” SOAS Working Papers in Linguistics & Phonetics 2: 293–313.

Keating, Patricia A. 1984. “Phonetic and Phonological Representation of Stop Consonant Voicing.” Language 60: 286–319.

Ladefoged, Peter and Ian Maddieson. 1996. The Sounds of the World’s Languages. Oxford & Cambridge, MA: Blackwell.

Liberman, Alvin M., Pierre Delattre, and Franklin S. Cooper. 1952. “The Role of Selected Stimulus-Variables in the Perception of the Unvoiced Stop Consonants.” The American Journal of Psychology 65: 497–516.

Lisker, Leigh and Arthur Abramson. 1964. “A Cross-Language Study of Voicing in Initial Stops: Acoustical Measurements.” Word 20: 384–422.

Nádasdy, Ádám. 2003. Practice Book in English Phonetics and Phonology. Budapest: Nemzeti Tankönyvkiadó.

Nádasdy, Ádám. 2006. Background to English Pronunciation. Budapest: Nemzeti Tankönyvkiadó.

R Development Core Team. 2008. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. www.R-project.org.

Silverman, Daniel. 2006. A Critical Guide to Phonology: Of Sound, Mind, and Body (Continuum Critical Introductions to Linguistics). London & New York: Continuum.

Wells, John C. 2000. Longman Pronunciation Dictionary. Harlow: Longman/Pearson Education.

Wingate, Anne H. 1982. “A Phonetic Answer to a Phonological Problem.” UCLA Working Papers in Phonetics 54: 1–27.

Wright, Richard. 2004. “A Review of Perceptual Cues and Cue Robustness.” In Phonetically Based Phonology, edited by Bruce Hayes, Robert Kirchner, and Donca Steriade, 34–57. Cambridge: Cambridge University Press.

Zuraw, Kie and Sharon Peperkamp. 2015. “Aspiration and the Gradient Structure of English Prefixed Words.” In Proceedings of the 18th International Congress of Phonetic Sciences, edited by The Scottish Consortium for ICPhS 2015, Paper number 0382.1–5. Glasgow: University of Glasgow.

Aspiration of stops after fricatives in English:
Results from a pilot experiment
