Wmatrix for forensic linguistics: a practical hands-on demo
Wmatrix was originally conceived in the REVERE project (1998-2001) as a web interface to make Natural Language Processing (NLP) and Corpus Linguistics (CL) tools and methods available to software engineers who were studying legacy systems through document archaeology alone (Rayson et al 2001; Sawyer et al 2005). Since then, its web interface has been extended to expose more underlying details of the language analysis rather than hiding them away, and it has supported applications of NLP and CL methods in many other areas such as political discourse analysis, tracing facework, corpus stylistics, metaphor analysis, topic modelling, evaluating problem-based learning and the language of illness. In the short talk at the beginning of this session, I will highlight applications in forensic, legal, and policing settings, for example: online child protection (Rashid et al 2013), predicting collective action (Charitonidis et al 2017), scientific fraud (Markowitz and Hancock 2014), and studies of the language of international criminal tribunals (Potts and Kjær 2015), sex offenders (Lord et al 2008), extremism and counter-extremism (Prentice et al 2012), and psychopaths (Hancock et al 2013). In the remainder of the two-hour session, participants will follow the online tutorials which introduce the key semantic domains method. We will use the new version 4 of Wmatrix running on a dedicated server with secure HTTPS access, which went public in December 2018. Participants will be provided with existing manifesto datasets, but are welcome to bring their own English corpora to upload.
Charitonidis C., Rashid A., Taylor P.J. (2017) Predicting Collective Action from Micro-Blog Data. In: Kawash J., Agarwal N., Özyer T. (eds) Prediction and Inference from Social Networks and Social Media. Lecture Notes in Social Networks. Springer, Cham
Jeffrey T. Hancock, Michael T. Woodworth and Stephen Porter (2013) Hungry like the wolf: A word-pattern analysis of the language of psychopaths. Legal and Criminological Psychology. Volume 18, Issue 1, pages 102-114. http://dx.doi.org/10.1111/j.2044-8333.2011.02025.x
Lord V, Davis B, Mason P. 2008. Stance-shifting in language used by sex offenders. Psychology, Crime & Law 14, 357-379.
Markowitz DM, Hancock JT (2014) Linguistic Traces of a Scientific Fraud: The Case of Diederik Stapel. PLoS ONE 9(8): e105937. doi:10.1371/journal.pone.0105937
Potts, A. and Kjær, A.L. (2015) Constructing Achievement in the International Criminal Tribunal for the Former Yugoslavia (ICTY): A Corpus-Based Critical Discourse Analysis. International Journal for the Semiotics of Law. doi: 10.1007/s11196-015-9440-y
Prentice, S, Rayson, P & Taylor, P 2012, ‘The language of Islamic extremism: towards an automated identification of beliefs, motivations and justifications’ International Journal of Corpus Linguistics, vol. 17, no. 2, pp. 259-286. DOI: 10.1075/ijcl.17.2.05pre
Rashid, A, Baron, A, Rayson, P, May-Chahal, C, Greenwood, P & Walkerdine, J 2013, ‘Who am I? Analysing Digital Personas in Cybercrime Investigations’ Computer, vol. 46, no. 4, pp. 54-61. DOI: 10.1109/MC.2013.68
Rayson, P., Emmet, L., Garside, R., & Sawyer, P. (2001). The REVERE project: Experiments with the application of probabilistic NLP to systems engineering. In Natural Language Processing and Information Systems – 5th International Conference on Applications of Natural Language to Information Systems, NLDB 2000, Revised Papers (pp. 288-300).
Sawyer, P., Rayson, P., & Cosh, K. (2005). Shallow Knowledge as an Aid to Deep Understanding in Early-Phase Requirements Engineering. DOI: 10.1109/TSE.2005.12
TIME & PLACE
1100-1300, Wed 16th Jan, Management School A001c (PC/Learning Lab)
The FORGE is delighted to announce our first external guest speaker: Dr Sam Larner (MMU). Details of his talk are below:
How Children and Young People Disclose Sexual Abuse: A linguistic analysis of NSPCC ChildLine online chat transcripts
THIS TALK IS ON A TOPIC, AND WILL CONTAIN EXTRACTS OF DATA, THAT SOME MAY FIND DISTRESSING.
DISCRETION IS STRONGLY ADVISED.
Research indicates that when children and young people make the difficult decision to disclose that they have been sexually abused, their linguistic capabilities may limit the extent to which they can make a full and clear disclosure. This may be problematic from a safeguarding perspective since the recipient of the disclosure may not realise or fully appreciate what the child or young person is trying to disclose, or even that an attempt at disclosure is being made. Whilst the process of, and barriers to, disclosure have been extensively researched, the linguistic strategies used to communicate disclosure have received relatively little attention. In order to provide a novel perspective, this research addresses the question ‘How do children and young people disclose that they have been sexually abused?’ Online chat conversations in which sexual abuse was disclosed (n=40) between children and young people (aged 10 to 18 years old) and ChildLine counsellors were analysed. Whilst some children and young people do use explicit terms to describe sexual abuse, these are predominantly used to seek definitions and clarification. Furthermore, counsellors play an instrumental role in recognising that a disclosure is being made, and then eliciting and reframing the disclosure as sexual abuse. The findings provide insight into why some victims of sexual abuse report having attempted to tell an adult but feel that they were not heard. This raises questions about how disclosures are made in other contexts and whether institutional safeguarding policies are fit for purpose.
Dr Sam Larner holds a BA (Hons.) in Linguistics from Lancaster University, an MA (Distinction) in Forensic Linguistics from Cardiff University, and a Ph.D. in Forensic Linguistics from Aston University. He is a Fellow of the Higher Education Academy, a member of the International Association of Forensic Linguists, and a member of the British Association for Applied Linguistics. Dr Larner’s experience in forensic linguistics spans over ten years. He joined Manchester Metropolitan University in 2015, and he has also held lectureships at the University of Central Lancashire and Newman University as well as giving guest lectures in the Czech Republic and Germany.
TIME & PLACE
1100-1200, Wed 12th Dec, County South B89
The FORGE is pleased to announce the next speaker for this year’s seminar series: Dr Kirk Luther (Lancaster). Details of his talk can be found below:
Nudging Eyewitnesses: The Effect of Social Influence on Recalling Witnessed Events
Interviewing witnesses and victims (i.e., interviewees) is a core component of policing. Interviewers were likely not present when a crime was committed, and therefore must obtain information about what happened from interviewees. Due to the importance of interviews for solving crimes, researchers continue to explore ways to enhance interviewee recall. One promising area that has received relatively limited attention as an interviewing tool is social influence. The goals of the current experiment are to determine the extent to which various social influence techniques are able to enhance witness recall beyond what can be achieved when such techniques are absent, and to compare the relative performance of the social influence strategies.
TIME & PLACE
1100-1200, Wed 21st Nov, County South B89
The FORGE is pleased to announce the next speaker for this year’s seminar series: Dr Lara Warmelink (Lancaster). Details of her talk can be found below:
“If you go down in the woods today…”
Psychologists use different types of automatic language tagging to help analyse participants’ statements in a quick and low-cost way. Erik Mac Giolla, Sofia Calderon, Kalle Ask, Timothy Luke and I (all psychologists) were studying the effect of veracity on people’s concreteness when speaking about future actions. We hypothesised that liars would be less concrete than truth tellers. We received data from 6 studies in which participants were interviewed about their future plans, with instructions to either lie or tell the truth. The statements’ concreteness was measured using two automatic language taggers: one based on a 40,000-word dictionary of words rated for concreteness (Brysbaert, Warriner, & Kuperman, 2014) and one based on the Linguistic Category Model (Seih, Beier & Pennebaker, 2017), which uses TreeTagger and WordSmith. Both analyses showed that there was no difference between liars and truth tellers in their levels of concreteness. We also found no correlation between the two measures, which led to some concerns about the validity of one (or both?) of the measures. This talk will discuss the problems we encountered and invite your thoughts about the usefulness of operationalizing psychological concepts by language tagging.
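For readers unfamiliar with dictionary-based tagging, the first measure mentioned above can be sketched roughly as follows. This is only an illustration, not the procedure used in the studies: it assumes a statement is scored as the mean rating of the words it shares with the dictionary, the function name `concreteness_score` is hypothetical, and the toy ratings are invented placeholders on a 1 (abstract) to 5 (concrete) scale rather than values from the Brysbaert et al. norms.

```python
import re
from statistics import mean

# Invented placeholder ratings on a 1 (abstract) to 5 (concrete) scale.
# A real analysis would load the published norms instead.
TOY_RATINGS = {
    "train": 4.9,
    "ticket": 4.8,
    "station": 4.7,
    "plan": 2.5,
    "hope": 1.3,
    "idea": 1.6,
}


def concreteness_score(statement, ratings):
    """Return the mean rating of the rated words in a statement.

    Words missing from the dictionary are ignored. Returns None when
    no word in the statement has a rating.
    """
    tokens = re.findall(r"[a-z']+", statement.lower())
    rated = [ratings[token] for token in tokens if token in ratings]
    return mean(rated) if rated else None


if __name__ == "__main__":
    statements = [
        "I will buy a ticket at the station and take the train.",
        "I hope the plan is a good idea.",
    ]
    for statement in statements:
        print(statement, concreteness_score(statement, TOY_RATINGS))
```

A score of this kind depends heavily on dictionary coverage: a statement with few rated words is summarised by very little evidence, which is one reason two taggers might disagree.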
TIME & PLACE
1100-1200, Wed 31st Oct, County South B89
The FORGE is delighted to announce our first speaker of the 2018-2019 academic year: Dr Claire Hardaker (Lancaster). Details of her talk are below:
THE DEMOS IS IN THE DETAILS: Are women really more misogynistic than men online?
In this talk I discuss the 2016 report written by Demos and presented to the House of Commons entitled “The use of misogynistic terms on Twitter”. In this report, Demos undertook “a small scale study examining the use of two popularly used misogynistic terms (‘slut’ and ‘whore’) on the social media platform Twitter” and found that “50% of the total aggressive tweets were sent by women, while 40% were sent by men, and 10% were sent by organisations or users whose genders could not be classified.” The research question in this project is simply this: is Demos right? This talk presents an overview of three follow-up studies – MEGASWAT, MINISWAT1, and MINISWAT2. It then presents more in-depth findings of the third study, MINISWAT2, in which 15,000 tweets were manually coded for author gender, target gender, and purpose. The results from these, unsurprisingly, differ from those found by Demos, but other key issues from the MINISWAT2 findings and about the Demos study are also highlighted.
TIME & PLACE
1100-1200, Wed 10th Oct, County South B89
In the coming academic year, FORGE will run a monthly writing retreat, starting in October 2018 and ending in April 2019. Full details can be found here.
With many apologies, we’ve noticed that the FORGE elves goofed and miswrote the time of Tim’s talk on Wednesday 22nd as starting at 12:00. Tim’s talk will in fact start at 13:00.
The elves have had their chocolate confiscated for the rest of the day.
The FORGE is delighted to announce our third and final external guest speaker: Prof Tim Grant (Aston). Details of his talk are below:
Taking language analysis to Court – How linguistic investigative advice, language evidence, and expert opinion are used in the UK justice system
This will be of especial interest to those looking to go into a career in forensic linguistics.
In this talk Tim Grant will examine the different roles through which language analysis can be used to improve the delivery of justice in the Courts. Through discussion of a series of cases in which he has been involved he will argue that forensic linguists, acting both as researchers and practitioners, need to focus on a broad variety of use cases and understand better how their analysis can be useful in the criminal and civil justice systems. He will examine the legal context through which experts (including linguists) give evidence in Court and he will argue that forensic linguistic evidence needs to be methodologically rigorous and admissible, but must also include clear and convincing explanation to provide the triers of fact with a rational basis for making their decisions.
Professor Tim Grant is the Director of the Centre for Forensic Linguistics at Aston University. Tim is on the Ethics and Professional Practice Committee of the International Association of Forensic Linguists and is a member of the Scientific Committee for the International Investigative Interviewing Research Group (iIIRG). Tim has extensive experience of providing linguistic evidence in a variety of cases including successful investigations into sexual assault, stalking, murder, and terrorism. Tim is particularly interested in forensic authorship analysis focusing on short messages such as text messages and Twitter posts, and he is also interested in how linguists can advise and train police officers to conduct better interviews. Tim’s work has appeared in featured newspaper articles and on BBC radio programmes. Furthermore, after providing a profile of a writer of roughly 60 racially and sexually abusive letters, Tim appeared as part of a media appeal on the BBC Crimewatch programme. This media appeal was successful in finding the offender, who matched the profile proposed by Tim.
TIME & PLACE
1300-1400, Wed 22nd Mar, Management School Lecture Theatre 6
FORGE is delighted to announce a talk by our upcoming internal speaker: Dr Bela Chatterjee (Law). Details of her talk are below:
Gender and cyberwarfare – a critical examination of key terms and images
In this presentation Dr Chatterjee considers the language and imagery surrounding popular discourses of cyberwarfare, and links them to questions of gender. Drawing on popular cultural reference points such as James Bond’s Skyfall and newspaper cartoons, she considers potential gender dimensions to cyberwarfare and the possible implications of the gendered constructions of cyberwarfare for International Law discourses on cyber war.
TIME & PLACE
1300-1400, Mon 27th Feb, County South B89
All are welcome to attend.