Language and knowledge have a symbiotic relationship. The growth of knowledge enriches natural language, and language is perhaps one of the most versatile and accessible systems for creative work – the search for new ideas being amongst the most creative. The edifice of science involves a number of semiotic systems: language, mathematical symbols and notations, graphical notations, and the symbol sets used in chemistry, electronic engineering and architecture. What usually survives almost without change is the linguistic description of a discovery, of a new method or technique, or of a critique of existing discoveries, methods and techniques. The surviving texts, from Newton’s Principia to Einstein’s General Theory of Relativity, and from Darwin’s Origin of Species to modern-day outpourings on evolutionary biology, may contain mathematical symbolism that eventually appears arcane and graphical notations that are incomprehensible to most, but much of the text retains some resonance for the typical speaker of English.
Scientific writing, if it exists at all, appears to be a persuasive and informative style of writing. Typically, a discoverer or a critic of an extant science writes to persuade his or her peers that the discovery or critique should be taken on a par with the historical or contemporaneous effort in the discoverer’s or critic’s field of endeavour. Such a discoverer often introduces concepts and ideas alien to extant scientific knowledge: Newton successfully introduced the notion of force; Einstein helped, literally and metaphorically, to introduce non-determinism; Darwin redefined the immutability of species as the evolution of species.
Once a new idea or concept takes root, the consolidators get to work: colleagues of the discoverer or critic, often an army of budding young scientists, take on the mantle and ‘run with’ the new idea or concept. The terms coined and redefined by the original masters/mistresses become current, and the rest belongs to the terminologists.
A number of authors have examined scientific writing: prominent amongst them are Michael Halliday, Robert de Beaugrande, Charles Bazerman and Alan Gross. These authors, whose outlooks on specialist writing differ, ranging from dialectical materialism to postmodernism, describe how complex concepts are articulated in scientific writing. I will examine the work of these scholars – the equivalent of literary critics, the critics of LSP texts – with a view to articulating the methods, techniques, rules or practices they have used to criticise scientific texts systematically.
Evolution and change in the knowledge of a subject can be shown through evolution and change in the way the knowledge is made explicit. Where new discoveries are made, or in a business context new products are developed or new markets found, the texts – journal papers, white papers, technical specifications and so on – created around this change will gradually amass, eventually suppressing the previous knowledge to some extent. These are the kinds of changes that are measurable, as will be discussed.
Frequency and concordance analysis are long-established techniques, originally developed in part to validate religious texts. Frequency analysis involves counting the words and letters within a given text and comparing them with a reference distribution – a mechanical check on change. In modern times the method has been applied to tasks such as attributing texts to authors, analysing transcripts in court cases, and automatic text categorisation. Corpus linguists suggest that text corpora can be used to validate theories about language at a fixed point in time (synchronically) and over a period of time (diachronically). Methods and techniques in corpus linguistics complement those in information extraction and information retrieval, allowing us to advance the claim that documents can be understood both synchronically and diachronically.
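The comparison of a text’s word frequencies against a reference distribution can be sketched in a few lines. The sketch below is illustrative only: the reference distribution, sample text and smoothing constant are invented for the example, and a simple frequency ratio stands in for the more elaborate measures used in practice.

```python
from collections import Counter

def relative_frequencies(text):
    """Tokenise crudely on whitespace and return each word's share of the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def frequency_comparison(target, reference, smoothing=1e-6):
    """Rank the target text's words by how over-represented they are
    relative to a reference distribution; words rare or absent in the
    reference rise to the top, a mechanical signal of change."""
    ratios = {}
    for word, freq in relative_frequencies(target).items():
        ratios[word] = freq / (reference.get(word, 0.0) + smoothing)
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)

# Invented toy reference distribution: every sample word but 'qubit'
# has a reference frequency, so 'qubit' surfaces at the top.
reference = {"the": 0.07, "of": 0.04, "is": 0.01, "unit": 0.0005,
             "quantum": 0.0002, "device": 0.001}
sample = "the qubit is the unit of the quantum device"
print(frequency_comparison(sample, reference)[0][0])  # → qubit
```

On a corpus drawn from an emerging specialism, the same ranking flags the vocabulary that the reference language does not yet account for.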
In this paper we present a method for systematically analysing texts, with specific reference to discontinuities in science. To substantiate some of the speculation above, largely my own, I will outline a method for analysing scientific texts based on methods and techniques used in corpus linguistics – a community adept at classifying and cataloguing pragmatically disparate texts – and on methods used in the study of LSP texts, particularly the work of Christer Laurén on scientific ‘idiolects’. I use these methods, supported by a suite of computer programs, to examine texts in the emerging specialism of quantum computers, and to describe how language, especially the English language, is used to describe a set of devices that are yet to be fabricated, using materials whose properties depart radically from accepted notions of how a (semi-conducting) material behaves.
According to Webster, 'semiconductor' originated in 1838 and is defined as "any of a class of solids (as germanium or silicon) whose electrical conductivity is between that of a conductor and that of an insulator". Semiconductors have become key components of electronic devices, as transfer resistors (transistors) and through very large scale integration (VLSI). This branch of physics is concerned with electronic phenomena, particularly applications of quantum theory, that can be used to overcome the restrictions on conventional electronic devices imposed by scalability, speed and heat. The next innovation within this field could well form the basis for the next generation of electronic devices.
We will demonstrate a method for facilitating the discovery of innovations within the semiconductor field. Through terminology extraction and interaction with a domain expert, we will show how a domain can be modelled and how terms occur in combinations where new (conceptual) distinctions become necessary. This work has been carried out primarily using scientific journals, as an assurance of data quality.
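One simple heuristic underlying terminology extraction is that candidate terms recur as fixed multi-word sequences across a specialist corpus. The sketch below illustrates this with recurring word bigrams; the stopword list, the three-sentence corpus and the frequency threshold are invented for the example, and a real pipeline would add part-of-speech filtering and expert validation, as described above.

```python
from collections import Counter

# A deliberately small stopword list for the toy example.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "is", "and", "that"}

def candidate_terms(sentences, min_count=2):
    """Extract recurring two-word sequences whose words are not stopwords;
    repetition across a specialist corpus is a crude signal of termhood."""
    bigrams = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for w1, w2 in zip(words, words[1:]):
            if w1 not in STOPWORDS and w2 not in STOPWORDS:
                bigrams[(w1, w2)] += 1
    return [" ".join(b) for b, c in bigrams.most_common() if c >= min_count]

# Invented toy corpus: 'quantum dot' recurs in every sentence and is
# the only bigram that clears the frequency threshold.
corpus = [
    "the quantum dot confines a single electron",
    "a quantum dot is a semiconductor structure",
    "electron spin in the quantum dot encodes a qubit",
]
print(candidate_terms(corpus))  # → ['quantum dot']
```

The list of candidates produced in this way is then the starting point for the interaction with the domain expert, who decides which recurring combinations mark genuinely new conceptual distinctions.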