The state of the art of corpus-based discourse studies
Case study: How is Islam constructed in the UK and US press before and after 9/11?
Lab session
Using Wmatrix to explore political discourse: Michael Howard’s and Tony Blair’s farewell speeches to their parties
Discourse
Language use above the sentence level
Language use in context
Real language use
CDA examines language as a form of cultural and social practice, focusing on the relationship between power and discourse, and between language and ideology
Both rely heavily on real language
‘a cultural divide’ (Leech 2000: 678-680)
CDA emphasizes the integrity of text while CL tends to use representative samples
CDA is primarily qualitative while corpus linguistics is essentially quantitative
CDA focuses on the contents expressed by language while CL is interested in language (form) per se
The collector, transcriber and analyst are often the same person(s) in CDA while this is rarely the case in CL
The data used in CDA is rarely widely available while corpora are typically made widely available
Some important ‘points of contact’ (McEnery and Wilson 2001: 114)
The great potential of standard corpora in CDA as control data
Cons…
The corpus-based approach tends to obscure ‘the character of each text as a text’ and ‘the role of the text producer and the society of which they are a part’ (Hunston 2002: 110)
CL focuses on text, not text producer
Analyzing a lot of text from a corpus simultaneously would force the analyst to lose ‘contact with text’ (Martin 1999: 52)
Pros…
Corpora present a real opportunity to discourse analysis, because the automatic analysis of a large number of texts at one time ‘can throw into relief the non-obvious in a single text’ (Partington 2003: 7)
‘Obviously, the methods for doing a ‘critical discourse analysis’ of corpus data are far from established yet. Even when we have examined a fairly large set of attestations, we cannot be certain whether our own interpretations of key items and collocations are genuinely representative of the large populations who produced the data. But we can be fairly confident of accessing a range of interpretative issues that is both wider and more precise than we could access by relying on our own personal usages and intuitions. Moreover, when we observe our own ideological position in contest with others, we are less likely to overlook it or take it for granted.’ (de Beaugrande 1999: 287)
Partington (2003: 12) proposes a scalar view of the uses of CL, pointing towards a rationale for using CL-related methods to carry out CDA
‘At the simplest level, corpus technology helps find other examples of a phenomenon one has already noted. At the other extreme, it reveals patterns of use previously unthought of. In between, it can reinforce, refute or revise a researcher’s intuition and show them why and how much their suspicions were grounded.’
Partington (2004, 2006) provides a systematic description of CADS (corpus-assisted discourse studies)
CL and CDA are complementary, and their interaction benefits both areas of research
CL can provide a general ‘pattern map’ of the data, mainly in terms of frequencies, key words/clusters and collocations, as well as their diachronic development, which helps pinpoint specific periods for text selection or sites of interest (the diachronic view also contributes to the historical perspective of the Discourse-Historical Approach, DHA, pioneered by Ruth Wodak)
The CDA analysis can point towards patterns to be further explored through the CL lens and also provide explanations for corpus findings
CL can also examine frequencies (or at least provide strong indicators of the frequency) of specific phenomena recognized in CDA (e.g., topoi, topics, metaphors) by examining lexical patterns
CL can add a quantitative dimension to CDA to make it more objective
CL in general, and concordance analysis in particular, can benefit from exposure to and familiarity with CDA analytical techniques
CL needs to be supplemented by the close analysis of selected texts using CDA theory and methodology
CDA, in turn, can benefit from incorporating more objective, quantitative CL approaches, as quantification can reveal the degree of generality of, or confidence in, the study findings and conclusions in CDA
How do news stories construct Islam?
Have there been any changes before and after 9/11?
Are there differences between reporting on Islam (as a religion) and Muslims (as a people)?
Are there any differences/similarities between tabloids and broadsheets?
Are there any differences/similarities between American and British newspapers?
Post WWII – demand for unskilled labour results in migration of Pakistani and Bangladeshi Muslims to the UK
In April 2001 the then British Foreign Secretary Robin Cook declared that Britain’s national dish is chicken tikka masala
September 2001 – terrorist attacks on the US, believed to be associated with Islamic extremists
July 2005 – terrorist attacks on UK
UK and US newspapers in 1998-2005 (pre- and post-9/11)
87 million words of British news
Broadsheets (65 M words): The Business, The Guardian, The Independent & Independent on Sunday, The Observer, The Times & Sunday Times, Daily Telegraph & Sunday Telegraph
Tabloids (22 M words): The Daily Express & Sunday Express, The Daily Mail & Mail on Sunday, Daily Mirror & Sunday Mirror, The People, Daily Star & Sunday Star, The Sun
40 million words of American news
Financial Times, New York Times, Washington Post, San Francisco Chronicle
Alah OR Allah OR ayatolah OR burka! OR burqa! OR chador! OR fatwa! OR hejab! OR imam! OR islam! OR Koran OR Mecca OR Medina OR Mohammedan! OR Moslem! OR Muslim! OR mosque OR mufti! OR mujaheddin! OR mujahedin! OR mullah! OR muslim! OR Prophet Mohammed OR Q'uran OR rupoush OR rupush OR sharia OR shari'a OR shia! OR shi-ite! OR Shi'ite! OR sunni! OR the Prophet OR wahabi OR yashmak! AND NOT Islamabad AND NOT shiatsu AND NOT sunnily
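The logic of the search query can be sketched in Python. This is a minimal illustration, not the retrieval system actually used: it assumes that `!` is a truncation wildcard (any word ending) and that `AND NOT` drops a whole article if an excluded word appears in it. Only a subset of the terms is shown, and the names `INCLUDE`, `EXCLUDE` and `matches_query` are hypothetical.

```python
import re

# A subset of the query terms; "!" is ASSUMED to mean "any word ending".
INCLUDE = ["Allah", "burka!", "fatwa!", "islam!", "Koran", "mosque",
           "Muslim!", "shia!", "sunni!", "the Prophet"]
# Words the query rules out because the wildcards would otherwise catch them.
EXCLUDE = ["Islamabad", "shiatsu", "sunnily"]


def term_to_regex(term):
    """Turn one query term into a regex fragment; a trailing '!' allows any ending."""
    if term.endswith("!"):
        return re.escape(term[:-1]) + r"\w*"
    return re.escape(term)


def compile_terms(terms):
    """Compile a list of query terms into one case-insensitive whole-word pattern."""
    body = "|".join(term_to_regex(t) for t in terms)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


INCLUDE_RE = compile_terms(INCLUDE)
EXCLUDE_RE = compile_terms(EXCLUDE)


def matches_query(text):
    """True if the text contains an included term and no excluded word (ASSUMED semantics)."""
    return bool(INCLUDE_RE.search(text)) and not EXCLUDE_RE.search(text)
```

For example, `matches_query("A new mosque opened")` is true, while a text mentioning only `shiatsu` is rejected even though `shia!` on its own would have matched it.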
1. Corpora split into 4 sub-corpora
2. All sub-corpora compared to a reference corpus (BNC written – 90 million words)
3. UK sub-corpora compared with US sub-corpora
4. Keywords extracted and analysed via concordances with respect to moral panic categories
5. UK broadsheets vs. UK tabloids
6. Collocational and concordance analysis of Islam, Islamic, Muslim, Muslims
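Step 6 can be illustrated with a small Python sketch. It counts the words found within a fixed window around a node word (collocates) and prints the hits in key-word-in-context form (concordance lines). The tokenizer, the window size and the function names are assumptions made for illustration, and the sample sentence is invented, not taken from the corpus.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, node, span=4):
    """Count words occurring within `span` tokens to the left or right of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


def kwic(tokens, node, span=4):
    """Return (left context, node, right context) for every hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            lines.append((left, tok, right))
    return lines


if __name__ == "__main__":
    # Invented toy sentence, for illustration only.
    sample = ("The Muslim community centre opened today and "
              "Muslim leaders welcomed visitors")
    toks = tokenize(sample)
    print(collocates(toks, "muslim", span=2).most_common(5))
    for left, node, right in kwic(toks, "muslim", span=2):
        print(f"{left:>20}  {node}  {right}")
```

Raw window counts only show what co-occurs; a real collocation analysis would normally add an association measure before interpreting the list.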
Conceived by Stanley Cohen (1972) in his study of Mods and Rockers in the UK
Violent clash between the gangs of Mods and Rockers in 1964
Two conflicting British subcultures in the mid 1960s
Referring to the intensity of feeling expressed by a large number of people about a specific group of people who appear to threaten the social order at a given time
“"The vast, vast majority, of Muslims living in the UK support policing efforts, fear terrorism and want to work with us," said [Sir Ian].” (The Guardian, October 29, 2004)
“Children are being brainwashed into becoming Islamic extremists at 300 "Taliban schools" in Britain, it was reported last night. Youngsters are being indoctrinated with radical Islamic ideals by militant groups across the country, said leading British Muslim Dr Zaki Badawi.” (The Sun, December 28, 2001)
Also, ‘scroungerphobia’ and political correctness
In the tabloids, Muslims are fanatics and extremists
In the broadsheets, Muslims are radicals, fundamentalists, separatists but also moderates and progressives
More focus on Islam
The media: book, novel, television, film, poetry
Other religions: Hindu, Christian, Buddhist, Judaism
World events: Iran, Iraq, Iraqi, Arab, Israeli, Israel, Palestinian, Baghdad, Jerusalem, Lebanon, Syria
War and conflict: military, conflict, army, resistance, violence, occupied, ceasefire, genocide, peace, invasion
Tabloids: more focus on Muslims (the people)
Muslims as terrorists and evil preachers; Muslims as British and desiring peace; women as victims (honor killings, arranged marriage, hijab); men as potential terrorists or victims of racism
Broadsheets: more focus on Islam (as a religion)
Stories on terrorism restricted to the word Islamic
Use Wmatrix to tag the following two texts
Tip: It is good practice to create one folder for each file
Michael Howard’s farewell speech to his party (2005)
A new screen will provide you with an update report … e.g.
Find the “key words compared to:” drop-down menu, and click Go
IMPORTANT
– anything above LL 15.13 (often rounded to 15) = 99.99% confidence of significance (p < 0.0001)
– anything above LL 6.63 = 99% confidence of significance (p < 0.01)
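To see where an LL value comes from, here is a minimal Python sketch of the two-term log-likelihood formula commonly used for corpus keyness. It is an assumption that this matches what the tool computes internally. The cut-offs 6.63 and 15.13 are the chi-square critical values (1 degree of freedom) for p < 0.01 and p < 0.0001; the function names are hypothetical.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood of one word's frequency in a study corpus vs a reference corpus."""
    total_freq = freq_study + freq_ref
    total_size = size_study + size_ref
    # Expected frequencies if the word were equally common in both corpora.
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    ll = 0.0
    # A zero observed frequency contributes nothing (0 * ln 0 is taken as 0).
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll


def confidence_band(ll):
    """Map an LL value onto the two confidence levels given above."""
    if ll > 15.13:
        return "99.99%"
    if ll > 6.63:
        return "99%"
    return "below 99%"


if __name__ == "__main__":
    # Hypothetical counts: 20 hits in 1,000 words vs 5 hits in 1,000 words.
    ll = log_likelihood(20, 1000, 5, 1000)
    print(round(ll, 2), confidence_band(ll))
```

LL measures how surprising the difference is, not its direction, so relative frequencies still need checking to see which text overuses the word.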
How many keywords from the Howard text have LL values of 15+? What are they?
How many keywords have LL values of 7+? What are they?
Do you notice anything interesting about these keywords?
Do any of the keywords share the same semantic fields?
Find the “key POS compared to:” drop-down menu, and click Go
What do you notice about the “key” domains?
Do we capture more words by undertaking a key domain analysis than we do by undertaking a keyword analysis? And, if so, why do you think this is the case?
Undertake a keyword analysis of Blair (using Howard as the reference corpus) to determine the differences between the two speeches
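Outside Wmatrix, the same kind of comparison can be approximated in Python. The sketch below ranks words by log-likelihood when one text is compared against another and marks each word as overused (+) or underused (-) in the study text. It is an illustration under stated assumptions, not a reimplementation of Wmatrix: the tokenizer is deliberately crude, the function names are hypothetical, and the toy strings are invented stand-ins for the speeches.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyness(a, b, size_a, size_b):
    """Two-term log-likelihood for frequencies a and b in corpora of the given sizes."""
    expected_a = size_a * (a + b) / (size_a + size_b)
    expected_b = size_b * (a + b) / (size_a + size_b)
    total = 0.0
    for observed, expected in ((a, expected_a), (b, expected_b)):
        if observed > 0:  # zero counts contribute nothing
            total += observed * math.log(observed / expected)
    return 2 * total


def keywords(study_text, reference_text, min_ll=6.63):
    """Return (word, study freq, ref freq, LL, +/-) rows with LL >= min_ll, highest first."""
    study = Counter(tokenize(study_text))
    ref = Counter(tokenize(reference_text))
    n_study, n_ref = sum(study.values()), sum(ref.values())
    if n_study == 0 or n_ref == 0:
        raise ValueError("both texts must contain at least one word")
    rows = []
    for word in set(study) | set(ref):
        a, b = study[word], ref[word]
        ll = keyness(a, b, n_study, n_ref)
        if ll >= min_ll:
            sign = "+" if a / n_study > b / n_ref else "-"
            rows.append((word, a, b, round(ll, 2), sign))
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    # Invented toy texts; replace with the two speech transcripts.
    study = "we " * 30 + "they " * 5
    reference = "we " * 5 + "they " * 30
    for row in keywords(study, reference):
        print(row)
```

Swapping the two arguments flips the +/- signs, which corresponds to running the comparison with the other speech as the reference corpus.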