Chapter 1: Introduction




Table 1: An Eight-Token Kashmiri Sentence in SSF


NA  WF         BF        POS  Morph      FEATS  dRel  dFn    PHEAD  PDEPREL
1   سفید       سفید      JJ   JJ.0.0.0   _      2     Adj    _      _
2   پَلو       پَلو      NN   NN.0.0.0   _      3     Obj    _      _
3   ٲس         ٲس        VA   VM.0.0.0   _      0     Root   _      _
4   آسَان      آسَان     VM   VA.0.0.0   _      3     Aux    _      _
5   حضُو‘رَن   حضُو‘ر    NNP  NNP.0.0.0  _      3     Subj   _      _
6   سٮ۪ٹھا     سٮ۪ٹھا    INT  INT.0.0    _      7     Intf   _      _
7   پَسنٛد     پَسنٛد    NN   NN.0.0.0   _      3     pRoot  _      _
8   ۔          ۔         SYM  SYM.0      _      _     _      _      _

(Column 1 is the node address, 2 the word form, 3 the base form, 4 the POS tag, 5 the morphological tag, 7 the index of the head token and 8 the dependency label; columns 6, 9 and 10 correspond to the CONLL-X FEATS, PHEAD and PDEPREL fields, unused here.)

Table 2: An Eight-Token Kashmiri Sentence in CONLL-X Format
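
In CONLL-X format each token occupies one line of ten tab-separated fields, with HEAD holding the index of the governing token (0 for the root) and DEPREL the dependency label. The following minimal Python sketch, assuming nothing beyond the standard library, shows how such a row can be read; the field names follow the CONLL-X shared-task convention, and the sample line reproduces token 1 of the table above.

# A minimal sketch of reading a CONLL-X token line. Field names follow
# the CONLL-X shared-task convention; the sample line is token 1 above.

CONLL_FIELDS = ["ID", "FORM", "LEMMA", "CPOSTAG", "POSTAG",
                "FEATS", "HEAD", "DEPREL", "PHEAD", "PDEPREL"]

def parse_conll_row(line):
    """Split one CONLL-X token line into a field dictionary."""
    values = line.rstrip("\n").split("\t")
    row = dict(zip(CONLL_FIELDS, values))
    # HEAD is the ID of the governing token; 0 marks the root.
    row["HEAD"] = int(row["HEAD"]) if row["HEAD"] != "_" else None
    return row

# Token 1 of Table 2: the adjective depends on token 2 with label Adj.
sample = "1\tسفید\tسفید\tJJ\tJJ.0.0.0\t_\t2\tAdj\t_\t_"
token = parse_conll_row(sample)
print(token["FORM"], "->", token["HEAD"], token["DEPREL"])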



    1. Choice of Annotation Interface

The annotation process for the development of a treebank cannot be carried out effectively unless a user-friendly annotation interface is available. Such an interface is generally customized to the requirements of the annotation scheme. Given the specifications of the treebank to be built, one can look for existing open-source tools instead of spending resources on developing new ones. In fact, many open-source syntactic and syntacto-semantic annotation tools have been developed under various research projects throughout the world. Such tools include:

  1. Dependency Grammar Annotator (DGA)

This tool was developed to facilitate the syntactic annotation of text corpora within the formal framework of Dependency Grammar (Tesnière, 1959). DGA is a user-friendly graphical interface that allows the efficient creation and manipulation of syntactic structures. It was developed by Marius Popescu of the University of Bucharest under the BALRIC-LING project.

  2. Syntactic Tree Viewer

It is an easy-to-use interface for visualizing and creating simple linguistic trees. It allows creating and editing syntactic trees and viewing the output in string format. It supports visualization of parse trees produced by various parsers, including the Stanford and Charniak parsers, and, with slight modification, of Penn Treebank trees.

  3. Sanchay

It is an open-source platform for carrying out various NLP tasks for South Asian Languages (SALs). It has been used extensively for Indian Languages (ILs) at various NLP research labs, especially at the LTRC lab, in research projects such as ILMT, the Hindi, Urdu, Telugu and Bangla treebank projects, PropBank and the Bengali Treebank. The tool has thus been instrumental in creating language resources and carrying out NLP tasks for ILs. It is often assumed that Sanchay is devised exclusively to implement PCG; in fact, it can be customized and used irrespective of grammatical framework or formalism, although it is also true that Panini's PCG was first experimented with and implemented in Sanchay for ILs under the ILMT project.

  4. Cornerstone

It is a PropBank frameset editor developed at the University of Colorado at Boulder. It is platform-independent and supports multiple languages, including Arabic, Chinese, English, Hindi and Korean. It is worth mentioning that before the development of Cornerstone, Sanchay was used for annotating predicate-argument structure in PropBank (Palmer et al., 2005). However, Cornerstone is not sufficient for treebanking, where one also needs to annotate beyond predicate-argument structure: coordinated and embedded clause constructions, sentential modifiers, the internal structure of complex predicates, serial verb constructions, and subject and object complement constructions.

Besides the above-mentioned tools, there is also the GATE architecture, which can be customized and used for syntactic annotation and other NLP tasks. Finally, it is worth mentioning that if the demands of an annotation scheme are not fulfilled by such open-source tools even after customization (as in the case of Cornerstone), new annotation tools can be developed, provided funding and technical support are available. Annotation schemes are usually developed in consonance with pre-existing schemes and tools, so this situation rarely arises, and it is seldom necessary to build new annotation tools.



  1. Theoretical Preliminaries

Lucien Tesnière, a French linguist, developed in the 1930s a relatively formal and sophisticated theory of dependency grammar, Éléments de syntaxe structurale, for pedagogical purposes. It was first drafted in 1939 but published only in 1959, posthumously. Tesnière puts forward his notion of dependency in the following lines:

“[I] La phrase est un ensemble organisé dont les éléments constituants sont les mots. [II] Tout mot qui fait partie d’une phrase cesse par lui-même d’être isolé comme dans le dictionnaire. Entre lui et ses voisins, l’esprit aperçoit des connexions, dont l’ensemble forme la charpente de la phrase. [III] Les connexions structurales établissent entre les mots des rapports de dépendance. Chaque connexion unit en principe un terme supérieur à un terme inférieur. [IV] Le terme supérieur reçoit le nom de régissant. Le terme inférieur reçoit le nom de subordonné. Ainsi dans la phrase Alfred parle [. . .], parle est le régissant et Alfred le subordonné.” (Tesnière, 1959, p. 11-13)

“[I] The sentence is an organized whole, the constituent elements of which are words. [II] Every word that belongs to a sentence ceases by itself to be isolated as in the dictionary. Between the word and its neighbors, the mind perceives connections, the totality of which forms the structure of the sentence. [III] The structural connections establish dependency relations between the words. Each connection in principle unites a superior term and an inferior term. [IV] The superior term receives the name governor. The inferior term receives the name subordinate. Thus, in the sentence Alfred parle [. . .], parle is the governor and Alfred the subordinate.” Here parle is also the root (the head of the whole clause Alfred parle) of the structural diagram (dependency graph) called a ‘stemma’, which is widely used in different formalisms of the dependency framework.

Dependency relations belong to the structural order, which is distinct from the linear order of the spoken or written string of words (Nivre, 2009). A dependency relation holds between a head (H) and a dependent (D) in a clause or sentence and is represented by a labelled arc (arrow) projecting from H to D. Therefore, the criteria for establishing dependency relations, and for distinguishing between the H and the D, are of paramount importance, not only in the dependency framework but also within other frameworks where the notion of syntactic head plays a pivotal role, including all constituency-based frameworks that adopt some version of X-bar theory (Chomsky, 1970; Jackendoff, 1977). Zwicky (1985) has proposed the following criteria to distinguish between an H and a D in a construction (C):



  1. H determines the semantic category of C, D gives semantic specification.

  2. H determines the syntactic category of C and can often substitute C.

  3. H is obligatory, D is optional.

  4. H selects D and determines whether D is obligatory or optional.

  5. The form of D depends on H (government or agreement/concord).

  6. The linear position of D is specified with reference to H.

It is important to distinguish between syntactic dependencies in endocentric and exocentric constructions (Bloomfield, 1933). For illustration, consider the structure of the following sentence, taken from the Wall Street Journal section of the Penn Treebank:

Figure 4: Dependency structure of the English sentence "Economic news had little effect on financial markets."


The attribute (ATT) relation holding between the H (noun "markets") and the D (adjective "financial") is an endocentric construction, in which the head can substitute for the entire group of words "financial markets" (phrase or chunk) without affecting the overall syntactic structure of the sentence. Endocentric constructions generally satisfy all the above criteria, although criterion (4) is usually considered less relevant, since dependents in such constructions are always optional.

By contrast, the prepositional complement (PC) relation holding between the H (preposition "on") and the D (noun "markets") is an exocentric construction, in which the head cannot substitute for the entire phrase ("on financial markets"). Such constructions fail to meet criterion (2), at least with respect to the substitutability of the head for the whole construction (phrase or chunk), but they may satisfy the remaining criteria. Further, the subject (SBJ) and object (OBJ) relations are clearly exocentric, while the remaining ATT relations (effect → little, effect → on) have a more unclear status.

The contrast between endocentric and exocentric constructions is also related to the contrast between head-complement and head-adjunct relations: the former (e.g. preposition-noun) are exocentric, while the latter (e.g. adjective-noun) are endocentric. A third type, the head-specifier relation (e.g. determiner-noun), is also exocentric like head-complementation, but without any clear selection of the dependent element by the head. The contrast between complements and adjuncts (modifiers) is often defined in terms of valency, a central notion in the theoretical tradition of dependency grammar, originally borrowed from chemistry and usually related to argument structure. The idea is that the verb (H) imposes certain requirements on its syntactic dependents that reflect its interpretation as a semantic predicate. The nouns (Ds) which are arguments of a predicate (obligatory or optional in surface syntax) can occur only once with each predicate, whereas Ds which are adjuncts (and tend to be optional) can occur more than once with a single predicate. The valency frame of the verb (predicate) is generally taken to include the Ds which are arguments, not the adjuncts. Therefore, in Figure 4, the SBJ "news" and the OBJ "effect" would generally be considered valency-bound Ds of the H "had", while the adjectival modifiers of the Hs "news" (economic) and "markets" (financial) would be considered valency-free Ds.
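
To make the valency distinction concrete, the following sketch (plain Python; the arc list simply transcribes the relations discussed above and is not the output of any parser) encodes the Figure 4 analysis as labelled head-dependent triples and separates the valency-bound arguments of "had" from the valency-free modifiers.

# A sketch encoding the Figure 4 analysis as labelled head->dependent arcs.

arcs = [
    ("had", "news", "SBJ"),           # exocentric: subject
    ("had", "effect", "OBJ"),         # exocentric: object
    ("news", "economic", "ATT"),      # endocentric: adjectival modifier
    ("effect", "little", "ATT"),      # unclear status (see text)
    ("effect", "on", "ATT"),          # unclear status (see text)
    ("on", "markets", "PC"),          # exocentric: prepositional complement
    ("markets", "financial", "ATT"),  # endocentric: adjectival modifier
]

# Relations filling the valency frame of the predicate 'had'.
ARGUMENT_LABELS = {"SBJ", "OBJ"}

valency_bound = [d for h, d, label in arcs
                 if h == "had" and label in ARGUMENT_LABELS]
valency_free = [(h, d) for h, d, label in arcs if label == "ATT"]

print("valency-bound dependents of 'had':", valency_bound)
print("valency-free modifiers:", valency_free)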

While head-complement and head-modifier structures have a fairly straightforward analysis in dependency grammar, many constructions have a relatively unclear status. This group includes constructions that involve function words, such as articles, complementizers and auxiliary verbs, as well as structures involving prepositional phrases. For these constructions, there is no general consensus in the dependency grammar tradition as to whether they should be analyzed as head-dependent relations at all and, if so, what should be regarded as the head and what as the dependent. For example, some theories regard auxiliary verbs as heads taking lexical verbs as dependents; other theories make the opposite assumption; and yet others assume that verb chains are connected by relations that are not dependencies in the usual sense. Another construction that is problematic for dependency grammar (as for most theoretical traditions) is coordination. According to Bloomfield (1933), coordination is an endocentric construction, since it contains not one but several heads that can replace the whole construction syntactically. However, this characterization raises the question of whether coordination can be analyzed in terms of binary asymmetrical relations holding between a head and a dependent.



  1. Chapterization

This dissertation consists of the following seven chapters:

Chapter.2: Review of Existing Literature

The chapter surveys the existing literature on various grammar formalisms and treebanking. It presents a historical view of dependency parsing, tracing its roots to the Indian, Semitic and Hellenic traditions, and briefly traces the history of treebanking. It attempts to link these old grammatical traditions with the contemporary practice of natural language parsing and treebanking.



Chapter.3: Developing KashCorpus

The chapter begins by introducing the philosophical grounds that underlie current corpus-based research. It gives a brief account of the language resources and other computational resources that have been developed for Kashmiri. Finally, the chapter investigates the problems of KashCorpus collection, development, sanitization and normalization.



Chapter.4: POS Tagging of KashCorpus

The chapter discusses the building of the fundamental layer of annotation for the dependency treebank of Kashmiri, i.e. the parts-of-speech tagging of the selected portion of the Kashmiri corpus. It further gives a brief review of various POS tagging frameworks and tagsets developed for English and Indian languages. The various issues encountered in the annotation process and the empirical results are also presented in this chapter.



Chapter.5: Chunking of KashCorpus

The chapter discusses the second layer of annotation for building the dependency treebank of Kashmiri, i.e. the chunking of the POS-annotated KashCorpus. It presents a detailed description of the various chunks found in Kashmiri, gives a detailed account of the issues involved, and presents the empirical results.



Chapter.6: Syntactic parsing of KashCorpus

The chapter discusses the dependency annotation of the chunked KashCorpus in detail and presents a detailed account of the dependency treebank of Kashmiri (KashTreeBank). The language-related issues raised during the annotation process are also discussed. Finally, the results of inter-annotator agreement are presented.



Chapter.7: Conclusion

It presents a conclusion of all the research presented in this dissertation.




Chapter 2: Review of Existing Literature

'Would you tell me, please, which way I ought to go from here?'

'That depends a good deal on where you want to get to,' said the Cat.

'I don't much care where,' said Alice.

'Then it doesn't matter which way you go,' said the Cat.

'So long as I get somewhere,' Alice added as an explanation.

'Oh, you're sure to do that,' said the Cat, 'if you only walk long enough.'

(Carroll, 2003)


  1. Introduction

This chapter surveys the existing literature on grammar formalisms, dependency parsing and treebanking. The chapter is organized into nine sections. Section two presents various relational-structure (dependency) based grammar formalisms for treebanking. Section three discusses various modifications of the notion of VP that account for non-configurationality and justify the use of dependency-based formalisms. Section four views dependency grammar from a historical perspective, tracing its roots to ancient and medieval times. Section five presents the rationale for using DG. Section six describes the notion of treebanking. Section seven presents the principles involved in treebanking. Section eight gives a brief account of some dependency treebanks. Finally, section nine summarizes the chapter.

  2. Grammar Formalisms

There is a very close relationship between grammar formalism, syntactic parsing, syntactic annotation and treebanking. In fact, a treebank is the product of syntactic parsing and annotation of a natural language corpus based on a given grammar formalism, or simply a grammatical model. The syntactic annotation for building a treebank can be carried out manually, automatically or semi-automatically. The term 'parsing' derives from the Latin phrase pars orationis, meaning "part of speech". The term covers both the synthetic (bottom-up) and the analytical (top-down) approaches of inquiry into natural language syntax. In the CL and NLP literature, the former is commonly known as dependency-based parsing (DBP), which addresses the following research questions: (a) How do words combine to form sentences? (b) How does the bottom-up approach to parsing help in understanding the nature of language? (c) How does the bottom-up approach facilitate the annotation and capture of grammatical knowledge and ensure its role in developing real-world computational tools and applications? The latter approach is known as constituency-based parsing (CBP), which addresses similar research questions: (a) How is a sentence broken into smaller units such as clauses and phrasal nodes, and then into terminal nodes (words)? (b) How does the top-down approach of analysis help in understanding the nature of language, and how does it ensure its role in developing real-world computational tools and applications? Both approaches include some notion of relational structure, but describe it in different ways (Bosco & Lombardo, 2004). Since the notions of dependency and relational structure are used in the current work, constituency-based formalisms such as PSG, GB and Minimalism are not dealt with here. However, the notions of grammatical relations and predicate-argument structure are given proper treatment.

There are several approaches in the literature to explaining the grammatical relations (GRs) in a clause. These approaches posit GRs as semantic roles, which include verb-specific roles, e.g. Runner, Killer and Bearer; thematic roles, e.g. Agent, Patient, Theme, Instrument and Experiencer; and generalized roles such as Actor and Undergoer (Dowty, 1982; Van Valin, 1999). Marantz (1984) describes GRs as the syntactic counterparts of certain logico-semantic relations, such as the predicate-subject and modifier-modified relations. Rappaport & Levin (1988) describe GRs in terms of purely syntactic relations (SUBJ, DOBJ and IOBJ) and thematic roles. However, the status of thematic roles (as purely semantic or syntacto-semantic) and the identification of an appropriate inventory of semantic GRs are not very clear (Leech et al., 1996). Matters get more complicated when purely syntactic relations bear unexpected thematic roles. For instance, in the sentence "the garden is swarming with vipers," the subject bears a locative thematic role instead of the expected agent relation (Renzi, 1988). There is no clear one-to-one correspondence between syntactic relations and semantic roles, and most theories of grammatical relations draw a distinction between the two. The distinction between syntactic and semantic relations with some independence from morphology is not new and can be traced back to Panini's Karaka theory. The six Karakas are semantic relations (agent, object, instrument, destination, source and locus) which are assigned to the nouns governed by a verb. However, a universally accepted inventory of semantic relations, also known as thematic or theta-roles, has yet to be established.




    1. Dependency Grammar (DG)

In contrast to constituency, dependency is a vertical organizational principle expressing a binary asymmetrical relation between a head and its dependents (Kruijff, 2002). The basic idea of dependency grammar is that the syntactic structure is a flat (non-terminal-free), rooted structure called a stemma, which consists of lexical elements linked by binary asymmetrical relations called dependencies. The variants of DG briefly reviewed here are Structural Syntax (SS), Functional Dependency Grammar (FDG), Word Grammar (WG), Meaning-Text Theory (MTT) and Paninian Computational Grammar (PCG). These variants share the major tenets of dependency and propose relation-based structures for language representation.

  1. Structural Syntax (Tesnière, 1959)

It adheres to the long-standing notion that syntax is a matter of the combinatory requirements or capabilities of words (i.e. their valency). The fundamental syntactic building block of the sentence is the word (token), which is linked to other words (directly or indirectly) by means of dependency relations (for details see Chapter 1).

The main idea behind Tesnière's model is the notion of dependency, which identifies the syntactic relation existing between two elements within a sentence, one of them taking the role of governor (or head) and the other of dependent (régissant and subordonné in the original terminology). He schematizes this syntactic relation using a tree diagram called a stemma.

In his scheme, all words are divided into two classes: full content words (e.g. nouns, verbs, adjectives) and empty functional words (e.g. determiners, prepositions). Each full word forms a block, which may additionally include one or more empty words, and it is on blocks that operations are applied. He distinguishes four block categories (or functional labels): nouns, adjectives, verbs and adverbs. A distinction is also made between actants and circumstants. The verb represents the process or state expressed by the clause; its actants (representing the participants) are determined by the valency of the verb and carry the functional labels of nouns, while its circumstants (representing the circumstances under which the process takes place, i.e. time, manner, location, etc.) carry the functional labels of adverbs. There are two operations, junction and transference, by means of which more complex clauses can be constructed from simple ones. Junction groups blocks which are at the same level, i.e. conjuncts, into a unified entity that itself attains the status of a block. The conjuncts belong to the same category and are horizontally connected (though not always) by means of empty words called conjunctions. There are two types of transference. First-degree transference is a changing process which makes a block change its original category; it occurs by means of one or more empty words belonging to the same block, called transferrers. For instance, the category of the word 'rotten' in the construction "rotten food" is transferred from verb to the functional label of an adjective through the transferrer, the perfective participle -en. Second-degree transference occurs when a simple clause becomes an actant or a circumstant of another clause, maintaining all its previous lower connections but changing its functional label within the main clause.

For example:



  1. She believes that he knows it.

  2. The man I saw yesterday is here today.

  3. You will see him when he comes.

In sentence 1, we have a verb-to-noun transference by means of the transferrer 'that': the embedded clause in italics takes the functional label of a noun and becomes the object of the verb. The embedded clause in sentence 2 is a verb-to-adjective transference without any transferrer. The temporal clause in sentence 3 is an example of verb-to-adverb transference, where the transferrer is 'when.'

Actants (arguments) are immediately dominated by the verb and represent the entities involved in the event described by the verb (they are required to fill the valency frame of the verb). The circumstants (adjuncts), instead, express the bystanders' role in the event and are optional. The first actant corresponds to Arg-1 (SUBJ), the second to Arg-2 (DOBJ) and the third to Arg-3 (IOBJ), as in RG. In SS, verbal valency also motivates this sorting of actants: the first actant can be found in monovalent, bivalent and trivalent verbal nodes (which take one, two or three actants respectively), the second only in bivalent and trivalent nodes, and the third only in trivalent nodes. Dependency relations are annotated to make the function of the nodes explicit. The words of a sentence, together with their dependency relations, form the dependency graph, in which the information regarding the dependency structure is explicit while the information regarding the constituent structure is implicit: a node X with the sub-tree attached to it can represent the constituent headed by X (the X-phrase) and can express all the important properties of that constituent. Therefore, a sentence structure can be described as consisting of structural nodes organized hierarchically by the nodal functions and held together by structural connections. A structural node is a group of words consisting of only one head and one or more subordinate words; it is the head of the structural node which carries the nodal function. The structure and the meaning of the sentence are theoretically independent but parallel, as the structural connections match the semantic connections to negotiate the meaning. In fact, a structural connection is usually motivated by a semantic connection, i.e. two words are linked by a structural connection in order to make their semantic connection explicit. Just as the head of the structural node bears the nodal function, the head of the semantic node bears the semantic function.
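
The sorting of actants by verbal valency can be pictured as a small lexicon of valency frames. The sketch below is illustrative (the verbs and frame sizes are assumptions, not drawn from any treebank) and simply states that the n-th actant appears only in nodes of valency at least n.

# A sketch of Tesnière-style valency frames; verbs and sizes are illustrative.

VALENCY = {
    "sleep": 1,  # monovalent node: first actant only
    "eat": 2,    # bivalent node: first and second actants
    "give": 3,   # trivalent node: first, second and third actants
}

def allowed_actants(verb):
    """Actant slots licensed by the verb: the n-th actant occurs only in
    verbal nodes whose valency is at least n."""
    return list(range(1, VALENCY[verb] + 1))

for verb in sorted(VALENCY):
    print(verb, "licenses actants", allowed_actants(verb))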

  2. Functional DG (Tapanainen & Jarvinen, 1997; Tapanainen, 1999)

It is a computational implementation of Tesnière (1959), describing Structural Syntax (SS) through formal rules. FDG posits that the basic elements of syntactic structure are nuclei, which have mutual connections, and that every nucleus has exactly one head. The relationship between structure (i.e. syntax) and semantics is evident in the notion of the nucleus, which encompasses both the structural and the semantic node. Since there is a close parallelism between syntax and semantics, i.e. the syntactic structure depends on the semantic interpretation rather than on word order or morphological marking, variation in word order does not affect the structural analysis of the sentence. The basic element of FDG, the nucleus, consists of tokens which are words or parts of words of the input sentence. A distinction is made between valency functions, which are unique in the nucleus (actants), and ambiguous functions (circumstants).
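
FDG's condition that every nucleus has exactly one head amounts to a simple well-formedness check over the dependency structure. A minimal sketch, assuming tokens are numbered 1..n and 0 denotes the artificial root as in CONLL-X:

# A sketch of the single-head well-formedness condition.

def has_single_heads(n_tokens, arcs):
    """Return True iff every token has exactly one governor and exactly
    one token is governed by the artificial root (0)."""
    heads = {}
    for head, dep in arcs:
        if dep in heads:  # a nucleus with two governors
            return False
        heads[dep] = head
    governed = all(t in heads for t in range(1, n_tokens + 1))
    one_root = sum(1 for h in heads.values() if h == 0) == 1
    return governed and one_root

# "Alfred parle": parle (token 2) is the root, Alfred (token 1) depends on it.
print(has_single_heads(2, [(0, 2), (2, 1)]))          # True
print(has_single_heads(2, [(0, 2), (2, 1), (0, 1)]))  # False: two governors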

  3. Word Grammar (Hudson, 1984; Hudson, 1990)

WG, primarily developed for English, is a monostratal, non-transformational approach which uses word-to-word dependencies to show grammatical relations/functions by explicit labels, e.g. SUBJ and OBJ. It includes two main inheritance hierarchies: the system of word classes, which also includes all lexemes and inflections, and the system of dependency types or grammatical functions. WG presents language as a network of knowledge in which all areas of knowledge are included, with no clear-cut boundaries between the 'internal' and 'external' facts about words.

  4. Meaning-Text Theory (Mel'čuk, 1988)

MTT was primarily developed for Russian. It provides a rich representation and analysis of a variety of aspects of language. Natural language is posited as a logical device that establishes correspondences between the infinite set of possible meanings and the infinite set of possible texts. The representation of a sentence consists of several separate components, in particular a semantic component and a deep-syntactic component. By performing several operations, the semantic component establishes the correspondence between a sentence and all its synonymous sentences, while the deep-syntactic component establishes the correspondence between the various syntactic realizations of a sentence.

  5. Paninian Computational Grammar (Bharati et al., 1993)

PCG, a variant of dependency grammar rooted in the Paninian tradition (Kiparsky & Staal, 1969; Shastri, 1973), has been used for syntactic annotation in the current work. The model helps capture the syntacto-semantic relations in a sentence. A sentence is treated as a series of modifier-modified relations, with the main verb as the primary modified element and the root of the dependency tree. The elements which modify the verb are its arguments, which participate in the action specified by the verb. The relations of these participants (arguments) with the verb are called Karakas.
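
In Paninian treebanking practice, Karaka relations are conventionally written as k-labels (k1 for karta, k2 for karma, and so on, as in the AnnCorra guidelines); the sketch below assumes that convention, and the example sentence and its analysis are purely illustrative, not drawn from the treebank.

# A sketch of karaka annotation in the Paninian model. The k1..k7 tag
# names follow the AnnCorra convention used in Indian-language treebanks;
# the example analysis is illustrative.

KARAKA = {
    "k1": "karta (agent-like participant)",
    "k2": "karma (object/theme)",
    "k3": "karana (instrument)",
    "k4": "sampradaan (recipient/destination)",
    "k5": "apaadaan (source)",
    "k7": "adhikarana (locus)",
}

# "Mohan cut the apple with a knife": modifier-modified relations with
# the main verb as the primary modified (root of the dependency tree).
analysis = [("cut", "Mohan", "k1"),
            ("cut", "apple", "k2"),
            ("cut", "knife", "k3")]

for verb, arg, rel in analysis:
    print(f"{arg} --{rel}--> {verb}  ({KARAKA[rel]})")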

    2. Relational Grammar (RG)

RG is basically motivated by the observation that defining grammatical relations in terms of linear constituent order and domination by a VP node is inadequate for VSO languages like Welsh, in which there can be no VP node (Perlmutter, 1983), and for free word-order languages like Czech (Dowty, 1982).

RG is primarily concerned with capturing the pure grammatical relations that constitute (syntactic) predicate-argument structure and the other (semantic) relations that are not related to the core arguments (Perlmutter, 1980). The former include the three pure grammatical relations S (subject), DO (direct object) and IO (indirect object), or 1, 2 and 3 respectively. The numbers (1, 2 and 3) posit a hierarchical organization motivated by the behavioral properties of the relations (head vs. dependent). The latter include a set of impure grammatical relations (oblique objects, OO) that have independent semantic content, such as Instrumental, Locative and Benefactive. The NPs labeled with pure grammatical relations are called terms, while the other NPs/PPs labeled with impure grammatical relations (i.e. with their semantic functions) are called non-terms. Figure 1 is a relational network which posits the relational structure of a clause as an abstract universal representation that remains constant in spite of cross-linguistic morpho-syntactic variation, i.e. a clause in another language, involving the same predicate and the same participants, will be represented by the same relational network.



Figure 1: The Relational Network Showing Dative Shift


RG assumes a universal mapping between thematic and grammatical relations known as the Universal Alignment Hypothesis (UAH): the agent maps onto argument 1 (John), the patient or theme onto argument 2 (the book), and the recipient onto argument 3 (Mary). However, the surface form of the clause does not always correspond to the UAH, for instance in passive and dative-shift constructions. In these cases several syntactic layers (strata) are proposed, and the surface syntactic form of the clause is derived through a series of transformations that relate it to a syntactic form consistent with the UAH.

Figure 2: The Relational Network of Passive Construction


The initial stratum in Figure 2 represents the underlying syntactic structure, which corresponds exactly to the active form of the clause (John eats the apple), where the UAH holds because the agent is 1 (John) and the theme is 2 (the apple). When the passive rule applies, a transformation produces a second stratum in which the initial 1 loses its syntactic role and becomes a chômeur, while the initial 2 becomes 1 (2-to-1 advancement). Since the semantic relations remain unchanged from the initial stratum, the agent is mapped in this final stratum to a chômeur while the theme is mapped to 1, thus contrasting with the UAH. In the representation of dative shift (see Figure 1), instead, comparing the initial stratum with the final one, we observe a phenomenon referred to as 3-to-2 advancement: the recipient, which corresponds to 3 in the initial stratum, becomes 2 in the final one; consequently, the initial 2 loses its role and becomes a chômeur.
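
The two advancements can be stated as a single relabelling rule over strata: the nominal bearing the target relation becomes a chômeur, and the advancing nominal takes its place. A minimal sketch (the clause material is taken from the two figures; everything else is illustrative):

# A sketch of Relational Grammar strata as mappings from nominals to
# relational signs (1, 2, 3, 'cho' for chomeur). `advance` paraphrases
# the passive (2-to-1) and dative-shift (3-to-2) rules described above.

def advance(stratum, frm, to):
    """Promote the nominal bearing relation `frm` to relation `to`;
    the nominal previously bearing `to` becomes a chomeur."""
    new = dict(stratum)
    for nominal, rel in stratum.items():
        if rel == to:
            new[nominal] = "cho"
    for nominal, rel in stratum.items():
        if rel == frm:
            new[nominal] = to
    return new

# Passive: initial stratum of "John eats the apple", then 2-to-1 advancement.
initial = {"John": "1", "the apple": "2"}
print(advance(initial, "2", "1"))
# {'John': 'cho', 'the apple': '1'}

# Dative shift: "John gives the book to Mary", then 3-to-2 advancement.
initial = {"John": "1", "the book": "2", "Mary": "3"}
print(advance(initial, "3", "2"))
# {'John': '1', 'the book': 'cho', 'Mary': '2'}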

    3. Lexical Functional Grammar (LFG)

LFG posits a flat (VP-less) structure for many VSO languages (Kroeger, 1993) and for "free word-order" or "non-configurational" languages like Warlpiri (Simpson, 1991). In LFG, grammatical relations are termed functions (Bresnan, 1982; Bresnan and Kaplan, 1982). Since LFG does not adhere to the notion of an underlying abstract syntactic representation and transformational rules, it posits a representation in which the lexicon plays a key role. It postulates three distinct but interrelated levels of grammar which co-occur in a single representation: lexical structure (LS), functional structure (FS) and constituency structure (CS). The LS captures information about the meaning of lexical items and the semantic roles constituting predicate-argument structure, together with the grammatical functions, such as subject (SUBJ) and object (OBJ), that are associated with the arguments through Lexical Assignment (LA). LA states that each argument is assigned a unique grammatical function (GF); GFs are assigned at the lexicon-syntax interface. For instance, the transitive verb kick has a predicate-argument structure consisting of an Agent associated with the SUBJ function and a Theme associated with the OBJ function. The other levels of representation, shown in Figure 3, are called the f-structure and the c-structure. Constituency relations vary cross-linguistically and across constructions within a single language, while the syntactic functions are universal (invariant) and are represented in a universal format. Therefore, a number of different c-structures can have a single f-structure, and it is possible to derive an f-structure from a c-structure but not vice versa. The inventory of grammatical functions differs from that of RG. LFG distinguishes between sub-categorizable (governable) functions, which can be part of a verb's sub-categorization, like SUBJ, OBJ1 (direct object), OBJ2 (indirect object), OBL (oblique) and POSS (possessor), and non-sub-categorizable (non-governable) functions like ADJ (adjunct), syntactic FOCUS and TOPIC. Among the sub-categorizable functions, SUBJ, OBJ1 and OBJ2 are semantically unrestricted, i.e. they can bear a variety of semantic functions, whereas OBL and POSS are semantically restricted and can bear only a particular semantic function. The non-sub-categorizable functions are used to refer to adjuncts, to the discourse function indicating an entity already established in the discourse context (topic), or to information about some topical participant that is new in the context (focus).


Figure 3: The f-structure of the sentence "A boy handed the teacher a gift."

LFG represents the f-structure of a sentence as an attribute-value matrix (as shown in Figure 3) and the c-structure as an augmented constituency tree. The relationship between c-structure and f-structure is represented by adding functional information to the tree edges.
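
Since an f-structure is an attribute-value matrix, it maps naturally onto a nested dictionary. The sketch below encodes a plausible f-structure for the Figure 3 sentence; the attribute names and the OBJ1/OBJ2 assignment follow the function inventory given above, but the exact contents of Figure 3 are assumed rather than reproduced.

# A sketch of the Figure 3 f-structure as a nested attribute-value
# structure; attribute names are assumptions following the text above.

f_structure = {
    "PRED": "hand<SUBJ, OBJ1, OBJ2>",  # predicate and its governable functions
    "TENSE": "past",
    "SUBJ": {"PRED": "boy", "SPEC": "a"},
    "OBJ1": {"PRED": "gift", "SPEC": "a"},      # direct object
    "OBJ2": {"PRED": "teacher", "SPEC": "the"}, # indirect object
}

def governable_functions(fs):
    """Read the sub-categorized functions off the local PRED value."""
    inside = fs["PRED"].split("<")[1].rstrip(">")
    return [f.strip() for f in inside.split(",")]

print(governable_functions(f_structure))  # ['SUBJ', 'OBJ1', 'OBJ2']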

The notion of grammatical relation occupies a central role in LFG in determining which arguments are semantically selected by a predicate, which are syntactically realized, and how. In particular, the lexical level plays a central role through the mechanism of sub-categorization. The similarities between LFG and RG lie in the inventory of basic syntactic relations, the relevance of the semantic level of the sentence, and the use of a form of relational structure in which grammatical relations are the interface between the syntactic and semantic levels of the sentence. There are also many differences between the two approaches: LFG is monostratal and non-transformational, and gives a central role to lexical sub-categorization. Moreover, LFG explicitly represents the interrelation between relational structure and constituent structure by assigning a c-structure to the sentence and assuming that an f-structure is associated with each node in the c-structure.


