Statistics & data visualisation

The summer school in Statistics and Data Visualisation for Corpus Linguistics is aimed at students and researchers with a background in corpus linguistics who wish to learn more about the use of statistics to explore language corpora. No prior knowledge of statistics is required.

The summer school offers a practical introduction to the statistical procedures used for the analysis of language corpora. The curriculum provides an overview of the main statistical procedures used in corpus linguistics, together with simple examples of how these methods are applied.

This summer school is organised and taught by Vaclav Brezina, with contributions from other staff at Lancaster University. Vaclav Brezina is a Research Fellow in the Centre for Corpus Approaches to Social Science and the author of the upcoming volume ‘Statistics for Corpus Linguistics: A Practical Guide’, to be published by Cambridge University Press in July 2018.

Topics include, for example:

  • Null hypothesis significance testing and effect sizes
  • Sampling methods and representativeness
  • Frequency and dispersion; descriptive and inferential statistics
  • Register variation and multi-dimensional analysis

To find out more about the school, you can read a blog post by Stefania Maci, who participated in the summer school last year. In her post, Stefania describes how the research she started with a group of fellow summer school participants in Lancaster in the summer of 2017 led to a presentation at an international corpus conference.