7.2 Complexity and economy in generative theory
Accepting such a proposal requires getting past the traditional model of the autonomous lexicon, seen as the repository for all idiosyncratic material in the language, and the unitary morpheme, naturally linked to its semantic content or lexeme. To do so, it is necessary to further examine the theoretical motivation for the notion of the "input" lexical form, modified by grammatical processes to yield varying surface forms across a range of words. The traditional motivation for proposing generative phonological effects derives from the presence of surface contrasts in such "related" words. The role of the generative process in all generative theories, including Optimality Theory, is to change a unitary, "basic" morpheme into all of its surface manifestations, on the basis of regular and predictable environmental conditioning, whether these be expressed as rules or constraints.
This need for generating multiple instantiations from a single prototype is motivated by a principle of economy, or the striving for least complexity. One major problem with this approach, however, is that no principled way to measure economy and complexity within generative theory has ever been generally accepted. The principle of economy in early generative work was focused upon generalizing the rules as far as possible, with the result that the complexity found in real linguistic data was removed from the grammar proper and encapsulated in the lexical entry. By adding complexity to the lexical entries, such as additional non-surfacing phonemes or diacritic markings, the rules themselves could be made more general, elegant and economical.
However, this economy in the formulation of rules is not identical to a general principle of economy, which might be called the economy of expression. In other words, the economy of rules is not equivalent to the economy of the grammar as a whole, which would minimize complexity throughout the system rather than in just one of its subcomponents. The lack of a formalization for the lexicon, which tended simply to consist of a list of all "unpredictable" elements, allowed its concomitant increase in complexity to pass unregarded. The lack of an evaluation metric, and the association of the principle of economy only with streamlined rules (without regard to the resulting over-complex lexical entries), have been consistently problematic features of traditional generative analyses. One example of this is the case of nìghtingale and mìghtily noted above (§ 1.4.1). Chomsky & Halle (1968) preferred to postulate underlying forms with a velar fricative /x/ rather than marking them as exceptional to one of their shortening rules. Lexical Phonology, on the other hand, preferred to emphasize what was regarded as a fundamental distinction between affixed ("derived") and non-affixed ("basic") words, again to avoid similar exception marking. That there is no formal way to judge the costs and benefits of either interpretation highlights the fact that there is no accepted evaluation metric in the field, only impressions and intuitions.
Lexical Phonology attempted to limit rule-based economy by introducing the counter-assumption that one should avoid "complicating" the lexicon. Instead, the rules were complicated by being made to apply less generally (§ 1.4.2). Indeed, there are no concrete methods in derivational theories for measuring the complexity of either rules or lexical representations. For example, one measure of lexical complexity might involve adding up the number of morphemes in the lexicon. Natural languages, however, present very large lexicons containing many different lexical items, many of which share the same meaning. In terms of their social function, large lexicons appear to be encouraged rather than restricted. New words are constantly being coined, some from entirely new morphemes, and many for concepts which already have names. The lexicon does not seem to have an upper size limit, nor does possessing a large mental lexicon appear to degrade the performance of one's language ability.
Chomsky & Halle (1968) suggest in their use of the principle of economy that the crucial element is not the number of morphemes, but the number of morphemes needed to characterize "related" words. Single morphemes are used to generate as many related forms as possible, regardless of the complexity of the rules or lexical items required. In a language ideally conforming to this pure generative model, all word-structure would be predictable, each word ultimately linked to a single identifiable morpheme. Yet all natural languages show a robust subset of "irregular" forms which specifically do not conform to this ideal, the mastery of which is essential for acceptance into the native speaking community. Sometimes, regular words can be adopted into these irregular categories, for example the dialectal verb paradigm drag ~ drug. It has also been shown, for example in the case of Maori, that grammars do not always take the most abstract, generative route available when presented with related, alternating words (Hale 1973, Kiparsky 1982d: 62-3). While the regularization of irregular forms shows that this kind of economy plays a role in natural language, it appears to be only one of many competing influences on the grammar.
Bybee (1995) criticizes the generative model, which she refers to as the "dual-processing model", for deriving regular forms by rule from base morphemes while simply listing irregular forms in the lexicon, suggesting "two highly distinct types of processing", which she claims is inconsistent with the psychological evidence. In the generative model, this concept of a "base" form arose from the intuitively appealing idea that longer words are composed of shorter units, which are enlarged via affixation. For sets like instrument ~ instrumental ~ instrumentality or walk ~ walked ~ walking the concatenative derivational explanation appears straightforward. However, this approach breaks down for sets like strategy ~ strategic ~ strategian ~ stratagem, imperi-ous ~ imperi-al, experi-ence ~ experi-ment, which present no clear unaffixed "base" form.116
The source of this apparent problem is the conflation of the "base" morpheme with the unaffixed prosodic word, which implies that the prosodic word is likewise a category of basic semantic meaning and morphological constituency. Delinking the structural, segmental morpheme from its prosodic manifestations essentially removes this problem. In other words, both the problem of overgeneralizing rules at the expense of lexical entries for the sake of "economy", and the problems of quantifying lexical complexity in order to alleviate it, are epiphenomena of the rule-based generative conception of phonology and morphology. The common element shared between the words above is a morpheme or string of segments (e.g., /strateg/), which should be regarded as a root-level constituent regardless of the fact that the morpheme itself never appears in an unsuffixed prosodic word. The subcategorizations for this morpheme in each prosodic word are, as in all words, idiosyncratic and recorded in the lexicon, i.e., in the "lexical part" of the constraint hierarchy.
7.2.1 Bybee’s network model of morphology
In the OT analysis of English stress and vowel alternation presented previously, the prosodic differences seen between "base" and "derived" words have been linked to a contrast in morphological structure, governed by the subcategorization behavior of affixes (§ 4.2.2, § 6.1, § 7.1.1). This provides a conception of the lexical or "input" item as arranged into morphemic units which may be combined with each other and map to morphological and prosodic constituents. Yet the very concept of discrete, recombinable morphemes has by no means been accepted by all theoretical frameworks. Bybee (1995: 1) points out that linguists have argued every conceivable point of view along the spectrum from Chomsky & Halle's (1968) completely concatenative rule-based system, wherein all words are reduced to sets of unitary morphemes, to Vennemann's (1974) completely lexical system, in which all words, regular or not, are recorded "whole" in the lexicon. Bybee (1995) argues in favor of her own network-style system (Bybee 1985) in contrast to the "dual-processing" generative model of Pinker (1991). It will be argued here that while these approaches emphasize supposedly competing viewpoints, each represents a facet of a complex, interconnected process that necessarily contains elements of both perspectives.
Bybee claims that the lexicon does not contain so-called "base" morphemes strung together by rules, but rather has access to all surface forms, whose internal interrelations are available to the entire system. Words are thus listed as whole units, with connections characterized by lexical strength linking the phonemes common to different forms of the "same" word. This concept of lexical strength is tied to notions of frequency and the observations that "regular" patterns are common throughout the lexicon, while irregular words tend to be idiosyncratic in form but show high individual word frequencies. High frequency phoneme strings, such as affixes, will have high lexical strengths, and can thus extend their links to further words in cases of innovation or regularization. Particular paradigms, e.g., that of the verb walk, are indicated in Bybee’s network representations by links between the phonemes seen in the stem walk in all its various forms. The links between the stem phonemes will also highlight the contrast between themselves and the links between affixal phonemes in forms like walks, walked, walking, walker:
(7.6)   w a l k s
        / / / /
       w a l k
        \ \ \ \
        w a l k i n g
        / / / /
       w a l k e d
This method of linking also allows Bybee to model non-concatenative relations, such as that seen in sing ~ sang ~ sung ~ song. The phonemes /s,n,g/ can be linked together in the paradigm, signifying the stem, while the vowels will connect to similar alternating vowels in ring ~ rang ~ rung and other sets. Thus, minority patterns will also be represented by lexical links, although these links will not be as strong as those connecting majority patterns. Bybee represents the strength of lexical links via the thickness of the lines used in diagrams like (7.6).
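Bybee's linked storage can be given a rough computational gloss. The following is a minimal, hypothetical sketch, not Bybee's own formalism: the function names are invented here, and shared orthographic prefixes stand in for shared phoneme links, with "lexical strength" approximated simply by how many pairs of stored forms share a given stretch of material.

```python
from collections import Counter
from itertools import combinations

def shared_prefix(a: str, b: str) -> str:
    """Longest common initial segment string of two stored word forms."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]

def build_links(lexicon: list[str]) -> Counter:
    """Accumulate link strength for every initial substring shared by a pair.

    The counts are a crude proxy for Bybee's 'lexical strength': material
    shared by many stored forms (a stem like 'walk') accumulates stronger
    links than material shared by few (diagrammatically, a thicker line).
    """
    links = Counter()
    for a, b in combinations(lexicon, 2):
        stem = shared_prefix(a, b)
        for n in range(1, len(stem) + 1):
            links[stem[:n]] += 1
    return links

links = build_links(["walk", "walks", "walked", "walking", "walker"])
# All C(5,2) = 10 pairs share the stem 'walk', so it is the strongest
# link; 'walke' is shared only by the pair 'walked'/'walker'.
assert links["walk"] == 10
assert links["walke"] == 1
```

A real implementation would of course operate on phoneme strings and allow non-initial links (as needed for sing ~ sang), but the sketch shows how stem-hood can emerge from counts over whole stored forms rather than from a listed morpheme.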
The dialectal tendency for structurally similar words to "regularize" to one of these minority patterns, such as in bring ~ brang ~ brung or drag ~ drug, can also be accounted for in this model. These patterns of regularity across the lexicon are referred to as type frequency. In many languages, there are competing groups of sub-regular patterns such as these, which sometimes increase in number and in any case remain robust. Bybee claims that the generative model is not well-equipped to account for these kinds of relationships.
There is also another sort of frequency, known as token frequency, which refers to the frequency of individual words in a language. Bybee notes that "irregular" forms tend to have a high token frequency, making up for the small number of lexical links between such forms with the strong lexical presence of the forms that do occur. High token frequency can maintain very irregular morphological relationships, such as the suppletion seen in go ~ went. Bybee suggests that the high token frequency of the past tense form went (as opposed to the present form wend) allowed it to break the weak links of its old paradigm and instead act as part of the paradigm of go, with which it has no phonemic correspondences. Both type and token frequency contribute to the lexical strengths of the connections between the forms in the lexicon and influence the construction of new entries.
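The type/token distinction is easily made concrete. The toy corpus and pattern tags below are invented for illustration only; the point is that the two counts are computed over different things — patterns over distinct words versus occurrences of individual words.

```python
from collections import Counter

# A tiny invented 'corpus' of past-tense tokens, each tagged with the
# morphological pattern it follows.
corpus = [
    ("walked", "-ed"), ("talked", "-ed"), ("jumped", "-ed"), ("walked", "-ed"),
    ("sang", "ablaut"), ("rang", "ablaut"),
    ("went", "suppletive"), ("went", "suppletive"), ("went", "suppletive"),
]

# Token frequency: how often each individual word occurs in running text.
token_freq = Counter(word for word, _ in corpus)

# Type frequency: how many DISTINCT words follow each pattern.
type_freq = Counter()
for pattern in {p for _, p in corpus}:
    type_freq[pattern] = len({w for w, p in corpus if p == pattern})

assert type_freq["-ed"] == 3        # the regular pattern has the most types
assert type_freq["suppletive"] == 1 # suppletion is confined to one word...
assert token_freq["went"] == 3      # ...but that word occurs very often
```

On Bybee's account, high type frequency is what makes a pattern extendable (brang, drug), while high token frequency is what lets an isolated irregular like went survive.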
7.2.2 Form over content
There is some striking evidence to illustrate that the generative principle of economy of stems is, as Bybee maintains, simply one competing tendency in the lexicon. There is a set of high-frequency English words which are evidently identified and incorporated into the grammar on the basis of their surface melodic (phonemic) form, rather than according to the obvious semantic and morphological links assumed to form the basis of paradigmatic organization in both generative theory and Bybee's system. For these words, the shape of their surface realizations requires that their corresponding morphological constituents subcategorize differently from the morphemes found in other words of their semantically related paradigms, although it would be more "economical" in generative terms to produce all the words in question from the same morpheme.
For example, most trisyllabic prefixed verbs whose lexical stems happen to end in phonetic [-eɪt] or [-aɪz] take on the stress pattern of verbs in /-ate/ and /-ize/ proper (e.g., cónfiscàte, mésmerìze), regardless of the fact that many of these words are neither morphologically nor historically members of those categories:
(7.7) sùpervìse    sùpervísion (not *sùpervisátion, etc.)
      círcumcìse   cìrcumcísion
      récognìse    rècognítion
      télevìse     tèlevísion
      ímprovìse    (impròvisátion)
The shape of the nouns semantically connected to these verbs indicates that these words are based on morphologically (and historically) prefixed monosyllabic roots which simply happen to have the morphological structure /Ciz/ and /Cat/, i.e., sùpervìse is morphologically /super-vise/, círcumcìse is /circum-cise/, ímprovìse117 is /im-pro-vise/. Most of these forms show a morphological paradigm parallel to prefixed stems like explóde ~ explósion. However, for the verbal forms, the presence of a rime [-aɪz] at the end of the word has been enough to suggest to the grammar that these forms are suffixed in /-ize/, despite the fact that this interpretation complicates these paradigms (in generative terms) and introduces stem allomorphy. For example, a stem /circumc/ as abstracted from círcumcise forms no other words, and if regular would have an abstract noun *cìrcumcizátion. Historically a prefixed form of the root /cis/, also seen in excíse, círcumcise would be stressed like cìrcumvént if it were not incorrectly interpreted as suffixed in /-ize/.
Further data suggests that surface similarity is not the only criterion for such an interpretation. Simply the presence of a segment sequence /-at-/ at the end of an analyzable morpheme is apparently enough to warrant the classification of these words as suffixed in /-ate/:
(7.8) supérlative
      rélative
      tránslator, tránslate
These words are, historically, prefixed forms of a Latinate stem /lat/ (compare emōtive, divīsive, ìndecīsive, sùpernórmal). In these cases, the grammar has, at some point, added morphological complexity to the lexicon by introducing a novel stem which is irreconcilable with the other members of its paradigm (we expect relāte ~ *relātive like emōte ~ emōtive).118
That this interpretation is ahistorical suggests that a morphologically regular paradigm has in each case been changed (in generative terms) to a morphologically irregular one simply because the words in question were melodically interpretable as being part of a productive pattern of suffixation. This type of reinterpretation is not at all predicted by the generative system of constructing the lexicon according to the economy of stems principle, and although Bybee's system can better represent the links between the phonetic realizations, it is awkward to denote the paradigmatic links in these cases, because each sequence /iz/ in forms like sùpervise would then be multiply linked (to both stems and suffixes) in the network. Only a logical separation between structural morphemes and underlying semantic constituents, and the means of gracefully accounting for such complexity through superordinate selection and subcategorization constraints, can model these interesting forms cleanly and consistently.
7.3 Morphology and the lexicon
In Bybee's system there are no morphemes per se; nevertheless, "even though words entered in the lexicon are not broken up into their constituent morphemes, their morphological structure emerges from the connections they make with other words in the lexicon" (Bybee 1995: 5-6), and Bybee speaks of "parallel sets of phonological and semantic connections." Bybee's model is not incompatible with that offered thus far in the OT analysis presented here, but differs in that it seems to require an overt listing of each (prosodically defined) word in the lexicon. This is the result of a theoretical conflation similar to that seen in generative theory, where the prosodic and morphological words were conflated. Bybee is instead conflating the semantic denotation of a given word with its morphological correspondent, through the medium of the prosodic word as the minimal lexical unit.
It is possible to propose, within the context of OT, a third, semantic hierarchy, which maps to the prosodic and morphological constituent hierarchies, and which would be subject to constraints governing the lexical selection of morphemes, i.e., the correspondence between morphemes and semantic structures.119 Bybee's approach lists each (prosodically defined) word as a single entry with both morphological and prosodic components, linked to other such words according to prosodic, morphophonemic and semantic criteria. While this system is capable of modeling many of the morphological patterns and regularities seen in natural language, it fails to capture all the relationships expressible by configuring three separate hierarchies, prosodic, morphological and semantic, using Optimality Theory. By formally delinking the independent concepts contained in these hierarchies from representations that typically conflate them, a more comprehensive, expressive system can be arrived at.
A similar conflation of structures is found at the basis of the criticisms of the traditional lexicon offered by Golston (1995, 1996). Golston claims that OT and its generative ancestor theories make a fundamental claim about prosodification, referred to as the "Theorem of Impossibility" (Golston 1995: 1):
(7.9) Every underlying form is an impossible surface form and vice versa.
What Golston means is that since "underlying" or "input" forms are prosodified during evaluation (or derivation), they must not be prosodified in the lexicon and are thus in themselves impossible surface forms. Golston (1996: 1) then takes this a step further to claim that this means "speakers cannot store things that they can say and vice versa." Golston's assumption here expresses a misconception about underlying representations arising from the "input/output" model inherited from generative theory. When regarding the underlying form as the closest possible copy of the surface form (following the usual conceptions of Lexical Phonology and reflected in OT by the concept of Faithfulness), it is easy to regard the "stored" underlying representation as simply a (pronounceable) variant of the surface word. The problem of conceptualizing an underlying form lacking prosody lies behind theories, such as those of Vennemann (1974) or Bybee (1985, 1995), which demand that the lexicon store fully prosodified words.
Rather than being viewed as the stored form of a word, the traditional "input" form is understood here as a set of morphemic constituents lexically selected by their corresponding semantic constituents, which comprise the "true" input form. Prosodification of these morphemes involves the selection of various prosodic constituents from a set of prosodic structures, which interact with the morphological constituents and allow for the expression of the features in the morphemic constituents into acoustic output, or speech. As spoken natural language is the apparently unitary product of this triple linking, it is difficult to conceptualize or represent the separate elements of this process outside the realm of abstract theorizing. When Golston claims that the existence of a completely independent, unprosodified morphological constituent requires that "speakers cannot store things they say", he is crucially assuming that this "storage" is equivalent in full to the morphological component, and that the form the stored item should take must be "pronounceable". Under the approach set out here, every word is understood as having associated with it a set of prosodic, morphological and semantic constituents, representing a decomposition of the associated surface form into three types of structures. None of the individual constituents in the three hierarchies are independently utterable; requiring them to be so is similar to asking someone to utter a mora, a foot or a stem. It is only in the context of all three hierarchies that an utterance is possible.120 The constraint hierarchy, and thus the grammar, is the formal description of the interaction of these three hierarchies, yielding the attested language (and thus its surface forms).
The approach to morphological selection taken here continues this line of reasoning: the lexicon, which may be conceptualized (following Bybee) as a network relating semantic, morphological and prosodic constituents, is argued to contain all information necessary to describe the correspondences between sound and meaning in the language. However, while Bybee’s approach attempted the same thing by overtly listing every word in the language as an independent quasi-prosodic/semantic/morphological unit, this is not necessary here, although the same desired effect, the recording of all relevant relationships in the lexicon, results. Instead, every component of the language, i.e., the different constituent members of the three hierarchies, appear as arguments to constraints in the hierarchy. Correspondences between constituents play a similar role as links in Bybee’s networks, but the ranking and violability of constraints allows for constituents to be understood as discrete while surfacing in a variety of ways (e.g., coalescence, templatic morphemes, alternations). Instead of simply representing links between segmentally similar prosodified surface words, the constraints in question represent not only the connections between semantic constituents and their selected morphemes, but also the subcategorization relationships seen between morphemes, and the relationship between morphemes and prosodic constituents.
In Bybee’s system, the strength of the links between words corresponded to issues of frequency. Here, frequency also plays a similar role, but this is represented in terms of constraint ranking rather than simply indicating which relationships are more frequent. Very common prosodic patterns, for example, will be governed by a series of prosodic constraints. The crucial ranking of such constraints above others, which would yield a different result under another ranking, is determined by the patterns most frequently encountered in the language. The advantage of the OT approach proposed here is that such a constraint hierarchy covers the entire language, and affects every form, while Bybee’s links are purely local and describe only the properties of some group of words, apparently unrelated to others. The constraint hierarchy accounts for the entire range of constraints, from those which affect all words to those which affect only a few, by ranking and violability. Conversely, words with idiosyncratic behaviors may require (for example) selectional constraints that are more highly ranked than the general constraint which would normally disallow the form. The high ranking of such a constraint is again linked to frequency, and it is precisely such idiosyncratic forms which are expected to have high token frequencies, represented here by a high-ranking selectional constraint.
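The claim that frequency is encoded as ranking can be illustrated with a minimal, hypothetical OT evaluator. This is a sketch, not any published implementation: the toy constraints NoCoda, Max-IO and Dep-IO are standard names in the OT literature, but the string-based syllabification and violation counting here are assumptions made purely for illustration.

```python
def no_coda(candidate: str) -> int:
    """NoCoda: one violation per syllable ending in a consonant ('.'-syllabified)."""
    return sum(1 for syll in candidate.split(".") if syll and syll[-1] not in "aeiou")

def max_io(candidate: str, underlying: str = "pat") -> int:
    """Max-IO: one violation per underlying segment missing from the output."""
    return max(0, len(underlying) - len(candidate.replace(".", "")))

def dep_io(candidate: str, underlying: str = "pat") -> int:
    """Dep-IO: one violation per output segment with no underlying source."""
    return max(0, len(candidate.replace(".", "")) - len(underlying))

def evaluate(candidates, ranking):
    """The winner is the candidate with the lexicographically least violation
    profile, reading the ranked constraints left to right (strict domination)."""
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))

candidates = ["pat", "pa", "pa.ta"]  # faithful / deleting / epenthesizing
# NoCoda >> Dep-IO >> Max-IO: deletion is the cheapest way to avoid a coda.
assert evaluate(candidates, [no_coda, dep_io, max_io]) == "pa"
# Rerank faithfulness above NoCoda and the faithful candidate wins instead.
assert evaluate(candidates, [max_io, dep_io, no_coda]) == "pat"
```

The reranking in the last two lines is the formal analogue of the frequency effects discussed above: which patterns win across the whole language is determined not by local link strength but by the relative ranking of violable constraints.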
Both Bybee’s network approach and the OT solution offered here are able to represent the fact that within a language, while there are large sets of data which conform to broad, regular patterns, there are also smaller sets that conform to minor patterns, and other sets or items that directly contravene the general trend. In a suitably explanatory linguistic theory, the grammar (and its associated lexicon) must be able to robustly account for all these types. Bybee’s system arbitrarily uses the (prosodically defined) linguistic word as the primitive category and only allows indirect access to morphemes; semantic content is linked to segmental identity and it is difficult to see how meaningful prosodic constituents would be represented, divorced from their segmental contexts. The OT solution allows for the theoretical separation of the linguistic word into three constituent hierarchies, whose members’ relationships define the entire grammar, and which expresses, through the constraint hierarchy, simultaneous relationships between every constituent in the language rather than just local relationships between small groups of forms. Bybee’s networks are a useful way of visualizing some of the relationships between words which must be captured by an explanatory grammatical theory, but the OT constraint hierarchy can capture both the various morphological relationships modeled in Bybee’s networks and the complete range of inter-constituent relationships, from the very general (e.g., stress and syllabification) to the very particular (e.g., allomorphy, idiosyncratic subcategorization), which cannot be modeled simply by connections between structurally "related" words in a network.