
Statistical Results

As mentioned in chapter five, the three datasets used for the current task consist of 682 POS-annotated sentences of varying lengths, taken from three different text domains, i.e. newspaper editorials (ASL = 25 Ws or 15 Cs), short stories (ASL = 11 Ws or 8 Cs) and critical discourse (ASL = 16 Ws or 10 Cs), which have been partially parsed into 8125 chunks. The task with which this chapter is concerned is to deduce, annotate and score each inter-chunk GR holding among these 8125 chunks in the 682 structures. In aggregate, 4287 GRs have been found holding among the 8125 chunks under the 682 dependency structures. These 4287 GRs are further classified under 25 labels, each with its frequency count in the three domains and in aggregate, as shown in Table 2. However, the score for each attachment label is given in an underspecified manner, i.e. no separate frequency score is given for the variants.





S. No  Label       Variants                             f1     f2     f3     fx
1      k1          pk1, jk1, mk1                        294    80     230    604
2      k1s         **                                   49     14     64     127
3      k2          k2g, k2p                             213    49     262    524
4      k2s         **                                   7      2      16     25
5      Rs          rs-k1, rs-k2                         3      1      17     21
6      k3          **                                   31     3      24     58
7      k4          k4a, k4v                             55     2      102    159
8      k5          k5prk                                6      0      6      12
9      k7          k7t, k7p                             166    45     120    331
10     r6          r6k1, r6k2                           93     55     151    299
11     Rd          **                                   23     0      3      26
12     Rh          **                                   10     1      16     27
13     Rt          **                                   9      8      18     35
14     k*u         k1u, k2u                             4      0      0      5
15     Ras         ras-k1, ras-k2, ras-neg              6      4      5      15
16     Rsp         **                                   6      8      5      19
17     Rad         **                                   8      0      0      8
18     Adv         sent-adv                             134    9      68     211
19     Nmod        **                                   14     7      31     52
20     Vmod        vmod_Rh, vmod_Inst                   78     7      66     151
21     *mod_Relc   nmod_Relc, jjmod_Relc, rbmod_Relc    27     4      25     56
22     Ccof        **                                   334    76     455    865
23     Pof         **                                   134    49     126    309
24     Fragof      **                                   113    33     203    349
25     Enm         **                                   0      0      0      0
       Total                                            1817   457    2013   4287

Table 2. Frequency distribution of GRs (f1, f2, f3: frequency in each of the three text domains; fx: aggregate frequency)
The empirical facts presented in the pie chart in Figure 24 reveal that ccof is the most frequent GR, covering 20% of the total GRs; co-ordination and sub-ordination therefore form the bulk of grammatical operations occurring in Kashmiri text. Fragof constitutes 8% of the total GRs found in Kashmiri, indicating the strength of the V2 phenomenon, and pof constitutes 7%, showing the significant occurrence of complex predicates in Kashmiri. Similarly, k1 constitutes 14% and k2 constitutes 12% of the relational bulk of Kashmiri text, so that SUBs and OBJs together constitute 26% of the GRs, which is quite significant. It is interesting to see that, quantitatively, k1, k2, ccof and fragof together cover more than half of the total relational bulk. These facts further reveal that 39-40% of the GRs in Kashmiri are karakas while the rest, about 60%, are non-karakas; 65% of the GRs are dependency relations whereas 35% are non-dependencies, of which 6% are non-rooted dependencies, i.e. attachments made to non-root heads (in genitive, participial and relative clause modifiers). 16% of the GRs are adverbial modifiers and only 1% are relative clause modifiers. Finally, it is important to point out that only 30% of the GRs belong to the sub-categorization frame and thus represent argument relations, while 61% of the GRs fall outside the sub-categorization frame and thus represent adjunct relations.

Figure 24. Proportion of each GR
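The percentages cited above follow directly from the aggregate column (fx) of Table 2. As a minimal illustration (a sketch only, with the counts copied from Table 2 rather than read from the treebank files), the proportions can be computed as follows in Python:

# Per-label proportions from the aggregate counts (fx) in Table 2.
# Only the five most frequent labels are listed here; the full
# dictionary would contain all 25 labels.
freq = {"ccof": 865, "k1": 604, "k2": 524, "fragof": 349, "pof": 309}
total = 4287  # total number of GRs in the treebank

for label, f in freq.items():
    share = round(100 * f / total, 1)
    print(label, share)
# ccof 20.2
# k1 14.1
# k2 12.2
# fragof 8.1
# pof 7.2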



Inter-annotator Agreement

One of the biggest challenges for a treebank project is maintaining consistency in annotation, which involves achieving significant inter-annotator as well as intra-annotator agreement. To check inter-annotator agreement, two independent annotators need to annotate the same data, whereas intra-annotator agreement is achieved when an annotator who encounters the same construction or phenomenon many times during the course of annotation annotates it consistently by sticking to the earlier decisions. Consistency increases the usefulness of the data for training or testing automatic methods and for linguistic investigations, and the understanding of the various linguistic phenomena and of the annotation guidelines is also often reflected in inter-annotator agreement studies. In order to check the consistency of the annotations in the current treebank, a dataset of 200 sentences was annotated by two annotators who had a proper understanding of the relevant issues and of the guidelines for the Kashmiri treebank. When the two annotated datasets were compared, a confusion matrix was formulated, as shown in Table 6. The matrix shows for which labels, and how many times, there is confusion. For example, the first row of the table shows that adv is confused with rt once, with vmod twice, with k7p twice, with sent-adv once, with nmod once, with k2 once, with k7 once, with k7t once and with pof once.
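A per-label disagreement count of this kind can be derived mechanically from the two annotated versions of the data. The following is a minimal sketch (not the script actually used for the treebank), assuming the two annotations are available as equal-length lists of dependency labels aligned chunk by chunk:

# Count per-label disagreements between two annotators.
# labels_a and labels_b are assumed to be aligned label sequences
# for the same chunks, in the same order (hypothetical input).
from collections import defaultdict

def disagreement_matrix(labels_a, labels_b):
    confusions = defaultdict(lambda: defaultdict(int))
    for a, b in zip(labels_a, labels_b):
        if a != b:  # record only the disagreements
            confusions[a][b] += 1
    return {label: dict(row) for label, row in confusions.items()}

# Toy example (not the treebank data):
print(disagreement_matrix(["k1", "k2", "adv", "k1", "pof"],
                          ["k1", "k1", "adv", "k2", "k2"]))
# {'k2': {'k1': 1}, 'k1': {'k2': 1}, 'pof': {'k2': 1}}

Each row of Table 6 corresponds to one key of the resulting dictionary, with the inner mapping giving the labels it was confused with and their counts.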

Inter-annotator agreement was measured using Cohen's kappa (Cohen, 1960), which is the most widely used agreement coefficient for annotation tasks with categorical data. Kappa was introduced to the field of computational linguistics by Carletta et al. (1997), and since then many linguistic resources have been evaluated using this metric (e.g. Uria et al., 2009; Bond et al., 2008; Yong and Foo, 1999). The kappa statistic shows the agreement between the annotators and the reproducibility of their annotated datasets. However, good inter-annotator agreement does not necessarily ensure the accuracy of the attachment labels, as the annotators can make similar kinds of mistakes and errors.

The kappa coefficient k is calculated as:

k = (Pr(a) - Pr(e)) / (1 - Pr(e))

where Pr(a) is the observed agreement between the annotators and Pr(e) is the expected agreement, i.e. the probability that the annotators agree by chance. Based on the interpretation scale for kappa values proposed by Landis and Koch (1977), shown in Table 3, the agreement between the two annotators on the dataset used for the evaluation is reliable, as given in Table 4. There is a substantial amount of inter-annotator agreement, which implies a similar understanding of the annotation guidelines and of the linguistic phenomena found in the data. The label attachment score, the agreement on labels only and the agreement on attachments only are given in Table 5.
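As a worked illustration of the formula, the following sketch computes Cohen's kappa for two aligned label sequences. It assumes the annotations are available as equal-length lists of labels; it is not the evaluation script used for the treebank, and the example data are invented:

# Cohen's kappa for two aligned label sequences.
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    n = len(labels_a)
    # Pr(a): observed agreement, the proportion of identically labelled chunks.
    pr_a = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Pr(e): expected chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    pr_e = sum(count_a[lab] * count_b[lab] for lab in count_a) / (n * n)
    return (pr_a - pr_e) / (1 - pr_e)

# Toy example (not the treebank data):
a = ["k1", "k2", "ccof", "k1", "pof", "k2"]
b = ["k1", "k1", "ccof", "k1", "pof", "k2"]
print(round(cohen_kappa(a, b), 3))  # 0.769

With Pr(a) = 0.7774 and Pr(e) = 0.0891 from Table 4, the formula gives (0.7774 - 0.0891) / (1 - 0.0891) ≈ 0.756, matching the kappa value reported there.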







Kappa Statistic    Strength of Agreement
< 0.00             Poor
0.00-0.20          Slight
0.21-0.40          Fair
0.41-0.60          Moderate
0.61-0.80          Substantial
0.81-1.00          Almost Perfect

Table 3. Kappa coefficient ranges and the corresponding strength of agreement (Landis and Koch, 1977)


Observed Agreement    Expected Agreement    Kappa Value
0.7774                0.0891                0.7556

Table 4. Kappa statistics


Label Attachment Score (LAS)    Agreement on Labels (LA)    Agreement on Attachments (UAS)    No Match (NM)
0.5178                          0.7380                      0.6341                            0.1501

Table 5. Agreement scores on labelled attachments, labels only and attachments only


S. No  Label          Confusions (label: count)
1      adv            rt: 1, vmod: 2, k7p: 2, sent-adv: 1, nmod: 1, k2: 1, k7: 1, k7t: 1, pof: 1
2      ccof           k1s: 2, rt: 1, vmod: 2, nmod__relc: 1, k2: 1, k1: 1, pof: 1
3      fragof         pof: 1, ccof: 3, nmod: 1
4      k1             k1s: 3, r6: 1, vmod: 1, k1u: 1, ccof: 1, k4v: 7, nmod: 2, k2: 14, pof: 2, k4a: 5
5      k1s            k2s: 1, nmod: 1, k3: 1, k2: 12, k1: 3, k7t: 1, pof: 2
6      k2             adv: 2, r6: 1, k4v: 4, k3: 1, ccof: 1, k1: 8, k4: 2, pof: 6, k4a: 3
7      k2p            k7: 1, rh: 1, k2g: 1
8      k2s            k2: 2
9      k4             k2: 1, k4v: 6, k4a: 4, k1: 2
10     k4a            k2: 1, k1: 2, k4: 1
11     k4v            k1: 1, k4: 1
12     k5             rd: 1, k7p: 1
13     k7             vmod: 1, k2: 1, k2p: 1, k7p: 1, k1: 2, k7t: 2, k5: 1, rsp: 3
14     k7p            rd: 1, k2p: 1, k7: 3, k7t: 1
15     k7t            adv: 1, k7p: 1, k7: 1, vmod: 3
16     nmod           vmod: 2, rs: 1, ccof: 1, k2: 1, k1: 1, k7: 2, k5: 1
17     nmod__k1inv    nmod: 1
18     nmod__k2inv    nmod: 1
19     nmod__relc     fragof: 1, nmod: 1
20     pk1            k1: 1
21     pof            k2: 7, k1: 1, vmod: 1
22     r6             k4v: 1, r6-k2: 1, k1: 1
23     r6-k2          r6: 5
24     r6v            k1s: 1, k7p: 1, k4v: 1
25     rad            k7p: 2
26     ras-k1         r6: 1, k7: 1, k4: 1
27     rbmod          ccof: 1
28     rh             k3: 2, ccof: 1
29     rs             k2: 3, vmod: 1, k2s: 2
30     rt             sent-adv: 1, rh: 4
31     vmod           adv: 1, ras-neg: 1, sent-adv: 1, ccof: 3, pof: 1

Table 6. Confusion matrix showing label disagreements

