NÁ70::Newson:Squiblets

There are a few ideas I have been playing with for several years and which often find their way into my lectures, but they are too small to develop into a paper of their own (so small that they don’t even deserve the title ‘squib’, which is why I’m calling them squiblets here), too diverse to be collected into a coherent paper and, let’s be frank, just too odd to be included in any other paper. And yet, I feel they should have some outing and so I am grateful for this opportunity to be able to get them off my chest.

Distribution and why it leads to the opposite conclusion to the one everyone else normally comes to

As we all know from first year linguistics classes, distributional arguments for constituent structure are the bread and butter of syntax. They go like this: if a string of words abc has the same distribution as another string of words xyz, then these strings form the same constituent; therefore they are constituents; therefore constituents exist! This argument assumes that distribution indicates a well-defined set of syntactic positions within a sentence and that, since something bigger than a word can have a distribution, syntactic positions extend to phrases as well as words.

Now if this were all there was to the matter we would expect a very well behaved system in which everything with the same distribution would be a constituent, because it occupies a syntactic position, and furthermore, everything which is in complementary distribution would be the same constituent, because, clearly, two things cannot occupy a single position at once. We could also predict that if two things, A and B, are in complementary distribution and two more things, B and C, are also in complementary distribution, then it must follow that A and C should be in complementary distribution. This is because A, B and C must all occupy the same structural position, given their complementary distribution patterns.

Unfortunately, it is at this point that we see that things are not so well behaved after all. There are numerous cases where this pattern of complementary distribution fails to work. I will exemplify from both English and Hungarian.

In English, inverted auxiliaries are in complementary distribution with complementisers (1a) and complementisers are in complementary distribution with fronted wh-phrases (1b). Therefore, by the logic that complementary distribution indicates that things occupy the same position, it follows that inverted auxiliaries should be in complementary distribution with fronted wh-phrases. But they are not (1c):

(1)

a.	had I known …	/	if I had known …	/	* if had I known …
b.	the man [who I met]	/	[that I met]	/	* [who that I met]
c.	who will you meet?

In Hungarian, the negative particle and pre-verbs are in complementary distribution (2a) and pre-verbs and preverbal foci are in complementary distribution (2b). Therefore, the negative particle and preverbal foci should be in complementary distribution. Again, not true (2c):

(2)

a.	elment	/	nem ment el	/	* nem elment
	away-went-3sg		not went-3sg away
	‘He/she left’		‘He/she didn’t leave’
b.	elment János	/	János ment el	/	* János elment
	away-went-3sg John		John(foc) went-3sg away
	‘John left’		‘It was John who left’
c.	János nem ment el
	‘It was John who didn’t leave’

There are numerous other cases like this, but let these examples suffice to make the point.
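
The failed prediction in (1) and (2) can be checked mechanically. What follows is a minimal sketch in Python, assuming that each observed complementary distribution is recorded as an unordered pair of labels; the function name transitivity_failures and the labels themselves are illustrative assumptions, not part of any established analysis. It lists every triple in which A and B, and B and C, are in complementary distribution while A and C are not recorded as being so.

```python
from itertools import permutations


def transitivity_failures(complementary):
    """Return triples (a, b, c) where a/b and b/c are recorded as being in
    complementary distribution but a/c is not."""
    # Complementary distribution is symmetric, so store unordered pairs.
    pairs = {frozenset(p) for p in complementary}
    items = sorted({x for p in pairs for x in p})
    failures = []
    for a, b, c in permutations(items, 3):
        # a < c reports each failing triple once rather than in both directions.
        if (a < c
                and frozenset((a, b)) in pairs
                and frozenset((b, c)) in pairs
                and frozenset((a, c)) not in pairs):
            failures.append((a, b, c))
    return failures


# Hand-entered observations from (1a, b) and (2a, b); labels are illustrative.
english = [("inverted auxiliary", "complementiser"),
           ("complementiser", "fronted wh-phrase")]
hungarian = [("negative particle", "pre-verb"),
             ("pre-verb", "preverbal focus")]

print(transitivity_failures(english))
print(transitivity_failures(hungarian))
```

For each language the function reports exactly one triple, with the complementiser (English) or the pre-verb (Hungarian) as the middle term: the two outer elements are each in complementary distribution with it, but not with each other, which is the situation shown in (1c) and (2c).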

These observations are not particularly new and neither have they caused syntacticians to throw their hands up in horror. The typical response is to admit that things are not so straightforward and that the complementary distribution between at least two of the elements involved in these otherwise paradoxical patterns is not caused by them occupying the same structural position. For example, the complementary distribution between the fronted wh-phrase and the complementiser is not assumed to be because the fronted wh-phrase occupies the complementiser position. Instead, it is accounted for by some other syntactic principle not based on the notion of structure at all. These principles are typically expressed as restrictions on linear order. So, for example, the ‘Doubly Filled COMP Filter’ (Chomsky and Lasnik 1977) states that it is ungrammatical to have an overt wh-operator and an overt complementiser in the same COMP. Translated into more current ideas concerning the complementiser system, this essentially says that we can’t have an overt wh-operator in the specifier of CP at the same time as having an overt head of CP. As there is no structural reason for this restriction, it boils down to a restriction that a wh-phrase cannot immediately precede a complementiser.
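
To make concrete what a restriction on linear order amounts to, here is a minimal sketch in Python. It assumes, purely for illustration, that each word of a clause has been hand-labelled with a category tag (WH, COMP, PRON and V are hypothetical labels), and it checks nothing but adjacency, with no reference to structure. It is not a rendering of Chomsky and Lasnik's own formulation, only of the linear restriction it boils down to.

```python
def violates_wh_comp_filter(tags):
    """True if a wh-phrase immediately precedes an overt complementiser."""
    # Compare each tag with its right-hand neighbour only.
    return any(a == "WH" and b == "COMP" for a, b in zip(tags, tags[1:]))


# The three relative clauses of (1b), tagged by hand (tags are assumptions).
who_i_met = ["WH", "PRON", "V"]               # who I met
that_i_met = ["COMP", "PRON", "V"]            # that I met
who_that_i_met = ["WH", "COMP", "PRON", "V"]  # * who that I met

for clause in (who_i_met, that_i_met, who_that_i_met):
    print(clause, violates_wh_comp_filter(clause))
```

Only the third sequence is flagged, and the check never consults anything beyond which item stands next to which.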

However, notice the strategy applied here. Structural accounts of distribution clearly don’t work by themselves. Therefore we add linear restrictions to the structural accounts to plug the gaps and we end up with a system which makes use of both structural and linear mechanisms. In other words, we have gone from a simple system to a more complex one.

But there is an alternative which no one seems to have considered. If structural accounts are not adequate without support from linear mechanisms, might it not be worth exploring the possibility that linear accounts might work all by themselves? This would at least avoid going for the complex option of adopting both mechanisms and so it should be the null assumption, to be abandoned only when shown to be inadequate. The current attitude, however, seems to be to assume the inadequacy of a purely linear approach without serious investigation into the matter. This is not a particularly healthy situation.

Therefore I surmise that distributional observations, instead of pointing towards an inevitable conclusion that constituent structure exists, actually indicate quite the opposite. The assumption of constituent structure is made despite distributional evidence rather than because of it and if we were to do things properly we really should be investigating the assumption that there is no constituent structure before conceding to the worst hypothesis: that both structural and linear principles have a role in syntactic systems.

Where phrases came from

I have to admit to having been a little more than surprised on discovering that the notion of the phrase was probably devised sometime in the 1920s. While I knew that the notion of constituent structure as the basis of the analysis of the complete organisation of a sentence was proposed by the American Structuralists, I supposed that some notion of the phrase must have predated that, even if it wasn’t built into a full structural system.

However, when one looks into books on the history of linguistics, such as R. H. Robins’ classic A Short History of Linguistics (1967), even though they identify certain ideas as being the forerunners of the notion of the phrase, it is clear that these notions have more to do with the notion of dependency. Dependency grammars are not phrase structure grammars.

Indeed, in Bloomfield’s book An introduction to the study of language (1914) the word ‘phrase’ appears twice and both times it refers to what we would call an idiom these days. He defines the subject as the word which carries this function. Clearly he had no notion of the concept at that time. Yet in 1933, with the publication of his expanded textbook Language, the chapters on syntax are based on full-blown IC analyses with phrases galore. So it was at some point between the publication of these two works that the notion first emerged.

The publication which first forwarded the phrase appears not to be very celebrated, as I can find no reference to what it was, which is strange given its extreme centrality to virtually all syntactic theories which have surfaced since its introduction. Given that I cannot find the source of the notion, and so I do not know what justifications for its existence were proffered, I am left to try to figure them out from what we know of the development of linguistics at the time. Here is what I think happened.

Bloomfield, we know, had a lot of respect for Boas, whom he once referred to as “the teacher in one or another sense of us all” (Robins 1967:207). Boas’s ‘field methods’, a lot of which involve distributional analyses, are a clear forerunner to Bloomfield’s notion of ‘discovery procedures’. Presumably, given that distributional analyses are still part of the justification of phrase structure today, it was distributional analysis that caused Bloomfield to discover the phrase in the first place. But why then and not before? Linguists had been using distributional analyses prior to the 1920s, so presumably there must have been another trigger.

The other well known influence on Bloomfield was, of course, Behaviourism. Interestingly, Bloomfield became interested in Behaviourism during the same period that he came up with the notion of a phrase. One of the driving forces behind Bloomfield was his desire to make Linguistics scientific and to this end he had aligned himself with the father of experimental psychology Wilhelm Wundt, who was attempting to do the same thing in psychology. It was under this influence that Bloomfield wrote his introduction of 1914. Unfortunately Wundt’s introspective methods became discredited and Bloomfield felt the need to distance himself. He was therefore looking for something to take its place. It is hardly surprising that he found Behaviourism attractive for this purpose, with its extreme empiricist point of view. What better way to make a study scientific than to base it totally on observable data? The problem is, of course, that so little in linguistics is actually observable. We can observe and measure sound, but things like phonemes, morphemes and words are not directly observable in the sounds themselves. Not to be put off, however, Bloomfield developed a system of grounding more abstract notions in phonetic observations through his discovery procedures. Thus while sounds and combinations of sounds are observable, through analysing the distributions of these we can define units such as the phoneme, the morpheme and the word. This leads us to a view of the organisation of the linguistic system which is still familiar today:

It is not too difficult to see how the discovery of the phrase is a simple extension of this progression. Applying distributional analyses to words and sequences of words we will not leap directly to sentences, but to something between words and sentences. And thus the phrase was born.

Now, if this recreation of historical events bears any similarity to reality, what I find interesting is that the two ingredients from which the phrase was baked, discovery procedures and the extreme empiricist stance of Behaviourism, are two aspects of Bloomfield’s work that have been most heavily criticised. Yet the result of these has been embraced to such a degree that it is now considered an unquestionable fact of language: Hornstein, Nunes and Grohmann state that the notion of units larger than words and smaller than sentences (“i.e. phrases”) is one of the six “Big Facts” by which we can judge the adequacy of a syntactic theory (2005:7). Any theory which attempts a syntactic investigation without assuming the existence of the phrase will therefore be inadequate before it starts. To me, this seems extreme, particularly given the somewhat dubious pedigree of the notion. While it is not impossible to derive truths from false premises, I think one should take such truths with a small pinch of salt until we know more about the alternatives. Certainly they should not be elevated to the status of ‘god given’ as seems to be the case in some people’s minds with the phrase.

References

Chomsky, Noam and Howard Lasnik (1977) Filters and Control, Linguistic Inquiry 8, 425–504.

Robins, Robert Henry (1967) A short history of linguistics, Bloomington and London: Indiana University Press.

Bloomfield, Leonard (1914) An introduction to the study of language, New York: Holt.

Bloomfield, Leonard (1933) Language, Chicago: University of Chicago Press.

Hornstein, Norbert, Jairo Nunes and Kleanthes K. Grohmann (2005) Understanding Minimalism, Cambridge: Cambridge University Press.