Big data and the study of language and culture: Parliamentary discourse across time and space

Conveners: Jukka Tyrkkö (Linnaeus University/University of Turku), Minna Korhonen (Macquarie University), Haidee Kruger (Macquarie University/North-West University)

Workshop description

The last two decades have witnessed increasing interest in the use of large digitised archives as linguistic primary data. While these are not corpora in the strictest traditional sense (Leech 1992), they nonetheless potentially provide vast amounts of evidence for investigating cultural phenomena (Baker & McEnery 2016; Schneider 2018) as well as language, stylistic and register change (Millar 2009; Rühlemann & Hilpert 2017). At the same time, this kind of data poses diverse conceptual and methodological challenges for researchers.

This workshop is dedicated to one specific type of large cultural and linguistic data: parliamentary records. Perhaps the best-known example of these in the English-speaking world is the Hansard, which has spread from Britain along with colonisation to other parts of the world. In addition to the British Hansard, digitised archival records from various countries are widely available, including recently compiled specialised diachronic corpora of parliamentary records from Australia, Canada, New Zealand, and South Africa.

Although this field of research is still young, parliamentary records have already been studied from a variety of linguistic perspectives, ranging from critical discourse analysis and sentiment detection to historical sociolinguistics and register analysis. As diachronic data, parliamentary records allow the analysis of language and register change (Kruger & Smith 2018; Kruger et al. submitted; Korhonen 2018; Hou & Smith 2018; Macalister 2006) as well as of societal shifts reflected in language (Michel et al. 2010; Rheault et al. 2016; Alexander & Struan 2017; Tyrkkö & Nevala 2018; Tyrkkö accepted). While providing unparalleled opportunities for the empirical investigation of both language variation and change, parliamentary records also allow for the investigation of how a comparable register is reshaped by different contexts over time (Kruger et al. submitted). These records also include varying amounts of metadata on speakers, their political affiliations and backgrounds, and the topics of the debates, and as such allow for the investigation of the role of individual speakers in language variation and change. Furthermore, as a hybrid spoken-written register, parliamentary records reflect the transformation from the actual spoken parliamentary discourse to the written record, which may involve substantial editorial intervention, often altering spoken usage in the direction of norms for formal writing (Slembrouck 1992; Mollin 2007). The practices of record-keeping and transcribing parliamentary records have varied across time and place, and their effect on compilation practices and the choice of methods in linguistic analyses has only recently been discussed in detail (Mollin 2007; Ryx 2014; Edwards 2016; Beelen et al. 2017; Hiltunen et al. 2018).

The conveners welcome contributions from any linguistic perspective that use one or more English-language parliamentary records as primary data. In addition to the various Hansards, we welcome contributions based on archives of the U.S. Congress or Senate, or similar data from other parts of the Anglophone world. The proposed papers should use quantitative methods, and make use of either the entire dataset, or a substantial part of it (e.g. in the form of a corpus based on particular design principles). Contrastive studies involving more than one parliamentary dataset, or other reference data, are also very welcome.

Potential topics include, but are not limited to:

  1. colloquialisation and democratisation
  2. register change in parliamentary debates
  3. parliamentary debates across varieties of English
  4. the use of parliamentary data to study language variation and change in English
  5. the role of individual speakers in language variation and change
  6. the transformation of spoken parliamentary discourse to written parliamentary records
  7. the linguistic use of parliamentary records as cultural artefacts.

The aim of the workshop is to promote best practices in the compilation and use of these specialised datasets, as well as to advance the use of computational, quantitative and corpus-based methods in the study of various aspects of political language. The papers presented in the workshop will be published as a collection edited by the conveners. The conveners will be in touch with the authors of the accepted papers prior to the conference.


