Expectations and Linguistic Meaning - Introduction
Below, I present the first section of the Introduction in full text, but without footnotes. Section 2 of the Introduction has been split and placed together with the relevant chapter. Finally, there is an Acknowledgments section.
1. Semantics is conventionalized pragmatics
This thesis focuses strongly on the pragmatic foundations of language and on how meaning on a semantic level can be fruitfully built upon pragmatic meaning systems. Syntax is similarly seen as capturing useful generalizations from the semantic level.
Linguistic meaning is based on nonlinguistic experiences. It is important to consider the connection between meaning as it appears in language and in other practices. This connection is modeled in terms of knowledge based on expectations, which are weak assumptions about the environment that function as "working hypotheses" - they are kept as long as they are useful and then discarded.
1.1. Approaches to language
The view that I present is in contrast to most traditional research in linguistics and philosophy. There, semantics and syntax are looked at in isolation from pragmatic and extralinguistic phenomena. The cognitive representation of language is also overlooked in the traditional accounts.
However, in the last 10-15 years a strong school of cognitive linguistics has emerged that bases meaning on conceptualization and stresses the relation of language to our mental representations (Lakoff 1987; Langacker 1986; Gärdenfors 1993).
The argument will be largely consistent with cognitive linguistics. However, with respect to the concept of meaning, I go one step further and base meaning on prelinguistic and even noncognitive experiences. Especially in an evolutionary setting, it will prove easier to model the connection between prelinguistic and linguistic behavior if one grounds linguistic representations in prelinguistic meaningful activities.
It then becomes important to consider the biological foundations of knowledge systems, and a useful distinction will be between the subjective and intersubjective motives for acquiring knowledge (following Trevarthen 1980:326-327). All organisms have subjective needs for knowledge in some form. There are some things in the world that are inherently meaningful to organisms, for example food and mates. The organism can gather knowledge through its sensory organs to help predict where the meaningful stuff is. The subjective motives for knowledge are basically the same for us humans. People, however, also have intersubjective cognitive capabilities that distinguish our cognition from that of lower organisms. The intersubjective motives include communicating, seeking company, engaging in reciprocal give and take, expressing confusion if others become incomprehensible, etc. (ibid.). The faculties needed for fulfilling the intersubjective motives all depend crucially on forming expectations about others' mental states.
A second focus that is not always clear in cognitive linguistics is on the distinction between systems that represent the world around them as faithfully as possible to form a true "mental image" and systems that only represent default assumptions and exceptions to these defaults. I will call the latter systems expectation-based, and maintain that cognitive systems are of this kind.
Expectation-based models emerge now and then in the cognitive sciences, and among the most well-known are the models in terms of frames (Minsky 1975) and scripts (Schank and Abelson 1977). (See Tannen 1979 for a review in a linguistic context.) However, these models concern the general organization of knowledge, while my model focuses specifically on the relations between expectations, language and meaning.
My reason for choosing expectations as fundamental for my analysis is that expectations give a new starting point for a discussion of linguistic meaning connected to many central concepts in cognitive science, such as induction, inference and affordances. I have not attempted to reduce my reasoning to rigid taxonomies of expectations. My aim is to open the field of linguistic meaning for discussion for researchers from many different disciplines, and such taxonomies tend to harmfully reduce the dimensionality of the field under discussion. However, a very rough classification can be made between (1) expectations about other people and their thinking and (2) expectations about features of things in the environment.
Language has evolved as an intersubjective tool for sharing subjective meaningful experience and extending the meaningful sphere. Language is a reaction to what speakers need to express in the situation, rather than a predefined object that we can study independently of the situation of use. In this view, the tools that language provides us with cannot be taken for granted, but we must seek their origins in the communicative situations in combination with a gradual conventionalization and decontextualization that proceeds from pragmatics to semantics and further to syntax.
To sum up: if we want to leave the traditional position of meaning as situated in language, then there are several interacting studies that must be pursued in parallel. It is for example not enough to focus on our mental representations, because as I argue in section 1.5 and Paper Four, meaning in language is based on meaningful activities that lie outside our cognitive representations. If we have no theory about how the mental representations are related to our socio-cultural practices, the task of establishing links between mental representations and language will be very hard. I will next outline one of the components that will be necessary when we want to see how language is related to nonlinguistic action.
1.2. "The obvious goes without saying"
If we consider language to depend crucially on socio-cultural practices, then we, as scientists of language, must be aware of our own place in time. The language that we use is adapted to our environment. Changing the environment, e.g. by introducing new artifacts, will change our discourse as a response to the uncertainty surrounding the new. Introducing electricity in a society will cause a lot of linguistic output concerning the innovation during a period of time, and it will be necessary to distinguish homes and companies with and without electrical installation. But as soon as electricity becomes familiar we will no longer have a need to distinguish electrified from unelectrified.
Or if it is important for you to buy a certain brand of violin strings, you will have to distinguish the good shops from the bad shops, and this distinction can provide the basis for concepts in language. One day when all the shops sell your favorite strings you will not have to make the distinction between the two kinds of shop, and the support for a possible linguistic concept will disappear - we will not be able to talk about the difference.
In both these cases, we have a real-world event - e.g. your need for strings - that generates a breakdown - when you cannot find a string in a shop - that triggers a distinction between two kinds of shop, which can possibly become the foundation for linguistic categorization. But it may as well remain untalked about and never leave the level of our practical knowledge about the world.
1.3. Expectation-based categorization
However, you don't go into any shop to buy your violin strings. Strings are likely to be sold where violins are sold, or at least in shops that have something to do with music. Shops seldom advertise everything they sell, but people anyhow have a fairly accurate knowledge about what they can buy where. This knowledge is based upon expectations that we form as a preparation for an uncertain future.
People use surface characteristics to form expectations to suit their needs. In the example, a rough categorization of the shop as a shop for shoes, music, food, etc., will, together with the expectations that we have built up, guide us when we are looking for things that are useful for us.
The kind of knowledge involved in this expectation-based categorization is not certain or true in the same sense as philosophical knowledge. It is much more like prejudice in that we use any available surface property to form a whole set of inferred properties that we need for our everyday encounters with reality.
Expectation-based categorization is of course not limited to shops. When we meet people we use easily retrievable knowledge such as sex or age to form expectations about their occupation or interests.
An illustration is the story about the father and his son who suffer a car accident where the father dies and the son is seriously wounded and taken to the hospital. A physician comes to treat the patient, stops and says: "I can't do this. This kid is my son."
Most people find this story intriguing, and many give up before finding out that the female physician is the mother of the kid. The concept "physician" carries the very strong expectation that a physician is most likely to be male.
In my view, the main part of our knowledge consists of expectations that can be taken for granted in most situations, and is probably not coded in a language of thought. Sometimes, however, failed expectations generate breakdowns that surface as expressions in language. In the papers, I model several aspects of this process.
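The contrast between representing everything and representing only defaults plus exceptions can be illustrated with a small Python sketch. It is only an illustration, not one of the models developed in the papers: the class `ExpectationStore`, its method names, and the shop labels "shop A" and "shop B" are hypothetical and chosen for this example. The sketch assumes that a category (here "music shop") carries default expectations, and that an observation is stored only when it contradicts what was expected, which corresponds to a breakdown.

```python
from dataclasses import dataclass, field


@dataclass
class ExpectationStore:
    """Holds default assumptions per category plus exceptions for individuals."""
    defaults: dict = field(default_factory=dict)    # category -> {property: value}
    exceptions: dict = field(default_factory=dict)  # individual -> {property: value}

    def expect(self, individual, category, prop):
        """Return the expected value; a recorded exception wins over the default."""
        override = self.exceptions.get(individual, {})
        if prop in override:
            return override[prop]
        return self.defaults.get(category, {}).get(prop)

    def observe(self, individual, category, prop, value):
        """Compare an observation with the expectation.

        Returns True if the expectation failed (a breakdown), in which case
        the observation is stored as an exception. Returns False if the
        expectation held, in which case nothing new is represented.
        """
        if self.expect(individual, category, prop) == value:
            return False
        self.exceptions.setdefault(individual, {})[prop] = value
        return True


# Hypothetical example data: music shops are expected to sell violin strings.
store = ExpectationStore(defaults={"music shop": {"sells violin strings": True}})
breakdown_a = store.observe("shop A", "music shop", "sells violin strings", True)
breakdown_b = store.observe("shop B", "music shop", "sells violin strings", False)
```

In this sketch nothing is stored for shop A, because it behaves as expected, while shop B is recorded as an exception. Only the second case would, on the view argued here, be a candidate for surfacing in language.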
1.4. Dialogue dynamics
Let us think of the set of all our expectations as reaching a certain knowledge level. What is below the surface, we don't have to talk about. What we talk about is in some way above the surface, and all the time connected to our prior common knowledge. We cannot talk about things that are "up in the air." Then we must first raise the level of knowledge to make the connection. There is a constant mutual work going on in dialogue to determine where to situate the knowledge level, and this work consists of forming expectations about others' knowledge level.
When the participants misjudge the level, this will lead to breakdowns in the conversation. If we include the socio-cultural practices in the model, as I have done, we can also treat the nonlinguistic action as continuous with the linguistic discourse.
We see examples of these breakdowns when people of different expertise meet and try to talk to each other. I studied an "expert" instructing a "novice" how to change a string on a violin (Paper Two and Winter 1996). The two subjects couldn't see each other, and only the novice had a violin on which to perform the actual change. A violin is a fragile thing and the risk of breaking the string is always present. This makes the subjects use more language to complement their activities, to anticipate the breakdowns that could otherwise occur - in any situation of practical activity, there is always the possibility of performing the task without language.
That would of course be the most efficient solution - the novice performs the task without questions. Let us call this the ground level of action. See figure 1. When the participants cannot continue because they have come to a crossroads on their mental path, there is a breakdown on the level of nonlinguistic action, and they have to resort to discourse. The most primitive discourse level consists of the expert giving instructions to the novice - we can call this linguistic action. This works as long as they can take for granted a lot of things, among them the spatial orientation of the violin. If they disagree on their mental images of the violin they have to make a break at the linguistic action level to coordinate their representations. This could occur, for example, if the expert imagines the novice holding the violin with the scroll pointing to the left, when this is not the actual direction.
Another reason for choosing the violin setting was that the differences in expertise also include differences in vocabulary. Specialized violin vocabulary, like bridge, nut, scroll and tailpiece, is not commonly known, and if the expert uses such a word, this will lead to another break in the conversation, up to the level of linguistic labels. When the new knowledge has been integrated, the interaction proceeds again at the lower level. (See Paper Two.)
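The movement between levels described above can be sketched as a simple stack in Python. This is a minimal illustration under stated assumptions, not the model of Paper Two: the names `Level` and `Dialogue` are hypothetical, and the ordering of the four levels (in particular, placing the coordination of representations between linguistic action and linguistic labels) is an assumption made for the example. A breakdown pushes the interaction up to a higher level, and integrating the new knowledge returns it to the level below.

```python
from enum import IntEnum


class Level(IntEnum):
    # The ordering of these levels is an assumption made for this sketch.
    NONLINGUISTIC_ACTION = 0   # ground level: the task proceeds without talk
    LINGUISTIC_ACTION = 1      # the expert gives instructions
    COORDINATION = 2           # aligning mental images, e.g. violin orientation
    LINGUISTIC_LABELS = 3      # dealing with a word such as "scroll"


class Dialogue:
    def __init__(self):
        # The interaction starts at the ground level of action.
        self.stack = [Level.NONLINGUISTIC_ACTION]

    @property
    def level(self):
        return self.stack[-1]

    def breakdown(self, target):
        """A failed expectation moves the interaction up to `target`."""
        if target <= self.level:
            raise ValueError("a breakdown must move to a higher level")
        self.stack.append(target)

    def integrate(self):
        """Once the new knowledge is shared, return to the level below."""
        if len(self.stack) > 1:
            self.stack.pop()
        return self.level


d = Dialogue()
d.breakdown(Level.LINGUISTIC_ACTION)   # the novice stops, the expert instructs
d.breakdown(Level.LINGUISTIC_LABELS)   # the expert says "scroll", the novice asks
trace = [d.level.name, d.integrate().name, d.integrate().name]
```

Here `trace` records the interaction descending from the level of linguistic labels through linguistic action back to the ground level, mirroring the observation that the interaction proceeds at the lower level once the new knowledge has been integrated.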
In another study I looked at helpline telephone conversations at a software engineering company. (Unpublished results.) In their case, the knowledge integration proceeds in several steps in the software design process, where the telephone support is the first step. In these interactions it is often the case that one specific problem is repeated several times, especially after upgrades or due to season changes (administrative software). It is then possible to delimit the problem and provide a written telefax answer, the second step in the process. The work of the support staff is in such a case reduced to recognizing the problem type. The third step consists of including the information in the telefax messages in the next edition of the manual, and the fourth and final step is to try to change the program so that this problem is avoided. The knowledge thus becomes more and more integrated and less exposed to treatment in discourse. In this process, I would like to point to the delimitation of the problem, which is contained in the second step, as being particularly important. It very much resembles a kind of concept formation. This kind of delimitation could provide the ground for word formation at the semantic level. (See also Clark 1992.)
1.5. Meaning as meaningfulness vs. meaning as signification
A fundamental assumption in this thesis is that the intersubjective knowledge in language is grounded in the kind of subjective meaningful cognitive systems that we share with lower animals. This view of meaning as meaningfulness can be contrasted with a view common to most of linguistics and philosophy. I will call it meaning as signification. They are contrasted (polemically) in the following table.
Here I will concentrate on meaning as meaningfulness. Every person explores his own sphere of meaningful experience during his life. This exploration is most often done by action - eating, drinking, moving, creating, playing, etc. - and through this action, more and more of the previously unrelated things in the world become related to his own systems of meaning.
A meaningful activity for me might be to eat or play the violin. The activity gives me satisfaction. If I talk about the meaning of a certain word, like "dog," on the other hand, I will point to a dog or try to explain with other words: "a common pet." This is meaning as signification. However, if the concept of meaning is to be of any use to us at all, it is no good that I explain the meaning of "dog" if I do not do it up to a point where I reach meaningfulness for you. If your culture has no pets, the explanation "a common pet" will not mean anything to you. All the words that we have in language reflect the underlying socio-cultural practices of our society.
One thing that makes us blind to meaning as meaningfulness is that we as researchers are always present to judge the meaning of the sentences that we analyze. As the semantic interpretation of a sentence in language is always immediately salient to us, we have great difficulty in judging what it would mean to have a system that uses language but does not interpret it. Harder (1991) notes for example that the grammaticality judgments that form the basis for transformational grammar are always accompanied by a semanticality judgment by the linguist.
In recent years, however, much of science has been built on computer simulation, where the simulating system only manipulates the symbols without any understanding of the meaning attached. If we want to create autonomous systems capable of using language for real, we also have to model a meaning system. A symbol manipulation system without a meaning system will always be dependent on language users for interpretation (Harder 1991; Hutchins 1995:363; Stewart 1996:323).
Theories that build upon meaning as signification lack a meaning system, and their value is therefore limited. They are also strongly associated with the practice of studying written language, and it is not likely that oral societies have had the metalinguistic knowledge necessary to make use of theories of meaning as signification. The translation of linguistic concepts into other linguistic concepts - the principle of the dictionary - that these theories represent is of course useful for us literate people, in the same way that a dictionary is.
1.6. Perspectivism in science
To give a broader view of how the two perspectives above are related, let us consider an example. A supertanker passing through the Sound is, as a real-world event, not ambiguous or contradictory in itself. But when we try to study it, or even think of it, we always do that given a certain perspective. We tend to consider only one aspect at a time, and it is very hard to get a grip on the event as a whole. One of us might think of the possible environmental consequences, another of the profit made by the oil company, a third of the joy of driving a car fueled by the contents of the tanker, a fourth of the power of the engines or a fifth of the braking distance of the tanker. Associated with each perspective are also differing mental models, which generate different expressions in language, and different evaluations of the event (Andersson 1994). The perspective taken tends to affect the impression of the event as a whole. The moment when I think of all the gas that can take me and my car to exciting places, it is hard to consider the environmental consequences.
Each of these perspectives can be refined to a scientific perspective, but it is unusual to find scientific perspectives that try to take several vantage points and compare the results of these different analyses, at least when it comes to analyses that cover several of the traditional disciplines.
Thus, for the study of meaning, science has taken one and only one starting point at each time, and tried to approach meaning through that perspective. Meaning as meaningfulness is particularly difficult, since it does not have any specific locus, and probably no independent cognitive representation - it is embodied and embedded in our socio-cultural activities. Furthermore, meaningful activities are not primarily an object of discourse. Eating is for example not dependent on language for its successful outcome (Paper Four).
Meaning as signification exploits the omnipresence of the linguist and builds upon the visible and audible signs of language. It has the merit of putting the searchlight on many linguistic phenomena that it would have been impossible to discover in other ways, and the assumptions inherent in the perspective have been necessary to build up such systems as predicate calculus and grammar.
However, to understand deeper aspects of language use and the relation of language to categorization and cognition in general, meaning as signification is a perspective that is not good enough. (For a general critique, the reader is referred to e.g. Lakoff 1987; Linell 1982; 1996.)
Although I try to adopt a view of language that emphasizes meaningfulness and relation to socio-cultural practices in the present thesis, there is no coherent alternative for the study of meaning that would correspond to a full-blown scientific theory. The reason for this is the lack of candidate theories in the current scientific discussion, which has forced me to approach the problem of meaning from several different positions, and much would be achieved if the reader found some coherence in the views that emerge in the different papers. As the papers are written in contrast to different theories, I have also in some papers been forced to accept methodological assumptions that are challenged in other papers in the thesis.
1.7. Stabilizing structures
Proponents of traditional linguistics take the three functional realms of syntax, semantics and pragmatics for granted and see them as a quasi-stable object of study. Adopting the reverse perspective, trying to build linguistic structures from meaning as meaningfulness and socio-cultural practices, also means taking the constitution of the different levels of pragmatics, semantics and syntax seriously, and considering what the underlying cognitive processes are. To do this is of course a giant project, and the outline I give here should be seen as very preliminary.
In the view that I have adopted in this thesis, pragmatics, semantics and (morpho-)syntax emerge as three functional realms with limited autonomy with respect to each other by a process that is perhaps best described as "conventionalization." Each level has more or less salient characteristics that can function as processing cues for the language users.
Morpho-syntactic conventions
For example, word classes signal by common morphology that there are underlying similarities between words in the same word class: verbs look similar because their meanings are similar. Syntactic similarity functions in basically the same way: similar position of certain constituents in the phrase structure tells us that these constituents can have the same underlying semantics, and the syntactic structure in those cases functions as a cognitive processing strategy.
To code information with word order has several advantages. One could say that syntactic information conveyed by word order (and intonation contours) is parasitic on the words themselves - it doesn't add to the amount of information, only restructures the words to get the information through. (See Paper Two.)
Semantic conventions
The same thinking can be applied to the realm of semantics. One example is the constraints that are proposed for determining the scope of a word. For example, infants seem to use a (possibly innate) cognitive processing strategy that corresponds to the assumption that a newly encountered word corresponds to a whole object rather than to a part or several objects. Thus, we have even weaker surface criteria: the existence of a word (a noun) signals underlying properties (the whole object) that are taken into account for language to work more efficiently than if the child were to consider all the possible meanings of the word.
Another example is the shareability constraints that we examine in Paper One. We start out from a paper by Jennifer Freyd (1983). The main theme of her paper is that knowledge, because it is shared in a language community, imposes constraints on individual cognitive representations. She argues that the structural properties of individuals' knowledge domains have evolved because "they provide for the most efficient sharing of concepts," and proposes that a dimensional structure with a small number of values on each dimension will be especially "shareable."
This kind of convention is like left- or right-side driving. Without the convention, people drive as they like, and must watch the other cars carefully to avoid collisions. With the convention, the driving is more efficient and the speed can be higher. However, the convention is not without disadvantages: it is for example not possible to take the shortest path to the destination. Such a convention can be considered more semantic than pragmatic, because it builds upon a mutual acceptance.
Pragmatic conventions
The pragmatic conventions in turn have no surface criteria of the same kind as the semantic and syntactic ones. Natural candidates for pragmatic conventions would be the conversational maxims proposed by Grice (1975). However, the line of reasoning that Grice follows comes from a philosophical perspective according to which it was conceivable that communication could exist without following the maxim of relation (relevance). From an evolutionary perspective, it is almost impossible to imagine a communication system that evolves without the relevance principle being built into the system at the most basic level. (See Paper Five and the note to that chapter.)
Instead, I would propose causal attribution as an example of a pragmatic area that could engender conventions capable of semantic strengthening. Attribution theory (Kelley & Michela 1980; Fiske & Taylor 1991) deals with attributing causes to unexpected events. Fiske & Taylor (ibid.) illustrate the process with Ralph and Joan, who are out dancing when Ralph trips over Joan's feet. The cause that Joan attributes to what happens (herself, Ralph or the circumstances) is determined by inferential processes that are largely nonlinguistic, but could be imagined to become the subject of semantic strengthening: Joan's repeated exposure to partners mistreating her feet could lead her to coin a word "tripper" to succinctly categorize Ralph and others of his ilk. (See Clark 1992.)
The conventionalizations in language all build upon expectations - generalized defeasible knowledge that can be overridden by situational exceptions. Before literacy they were most certainly not made explicit, but have nevertheless acquired a certain stability over time. The greatest stability is found in the syntactic conventions, to a lesser degree in the semantic, and even less in the pragmatic conventions.
Papers Two and Three model this kind of gradual conventionalization. In Paper Three, we give an example of these processes in the realm of linguistic modality. We trace modality back to social power structure combined with expectations of the attitudes towards the action to be performed. In the case of modals, the process of conventionalization has probably reached an endpoint - it has structural correlates at each level: the pragmatic processes of social power and expectations are lexicalized as words on a semantic level. But these words also have morphological and syntactic properties that can be used as a cue to underlying semantic (and pragmatic) similarities.
This can be seen as an endpoint of conventionalized knowledge, when the interactionally based social power becomes entrenched in the morpho-syntax. As I have argued, not all knowledge becomes lexicalized in this way. I believe this field to be very fruitful for further research.
The study of modal verbs thus concentrated on this fairly well delimited lexical group. In the violin study referred to above (Paper Two and Winter 1996), I study a more general setting, and also look at the kind of expectations embedded in the use of linguistic labels. Here the question is whether language users make use of the vocabulary that is available to them, but perhaps not to the listener, or whether they prefer multi-word expressions that they know will be understood. It turns out that the experts sometimes introduce words from the specialized violin vocabulary, but in other cases use multi-word expressions to designate the same thing. This shows a trade-off between the present interaction and possible future interactions: if only the present interaction is considered, it is often more economical to use multi-word expressions that are easy to understand, rather than a specialized word that is likely to provoke a shift to the communicative level of dealing with the meaning of the word.
In this way, the knowledge contained in the linguistic labels will sometimes be passed on to a bigger community, but sometimes it will remain isolated. The existence of a linguistic label in a certain linguistic community signals the importance of the concept that it denotes.
The knowledge lexicalized at the syntactic level is harder to analyze than the semantic and pragmatic knowledge. It is for example much more obvious that 'soon' is related to expectations than that expectations are one of the main structuring principles of modals. A correlate to this is that syntax, which is conventionalized at the highest level, will be the most "obvious" part in language. We are very unlikely to start discussing the syntactic features of language during the change of a violin string.