Below is a brief (and therefore incomplete) analysis of the email allegedly sent by IS (ISIS) and reproduced in the media. If you choose to quote from this post, please link back to it so that readers can see the fuller analysis in context. I am using the Telegraph’s version of the email simply because it came up first and matches several others that I cross-checked it with. The language of the email contains some extra information beyond the words themselves, including suggestions of the method of production, and some information about the author. It is crucial to note, however, that these are suggestions only. (Apologies in advance for any typos. This has been written rather quickly.)
THE METHOD OF PRODUCTION
The first thing to be aware of is that the method of production, i.e. both the hardware and software used to send the message, will affect the message itself. In other words, from the various artefacts of production left in the email, we can possibly identify the kinds of devices being used by the author. From an investigative perspective, this can help to give focus to a search by suggesting what to look for, e.g. an iPhone, a PC, etc. However, interference, particularly from the software, can also mean that useful clues in the language itself are obscured and that red herrings are introduced.
Software varies in precisely what it will and won’t change. On their default settings, Microsoft’s PC-based products (Word, PowerPoint, Outlook) typically autochange straight quotes and apostrophes ("", '') to curly ones (“”, ‘’), a feature known as smart quotes, whereas Apple’s mobile products (iPhone/iPad email, messenger, notes, etc.) typically don’t. The CSS behind web pages can do exactly the same thing, which is why it is so important to know whether the Telegraph’s version is an exact reproduction of the original email or not. In the email in question, assuming that it has been 100% faithfully replicated, we can see curly quotes round “proxy armies” and “Arabic translation”, and a curly apostrophe in LION’S DEN. Assuming that the author(s) didn’t painstakingly insert the curly quotes by hand, this suggests that the software used to send the message is of the kind that autochanges punctuation. From this, we can tentatively suggest that something like an iPhone or iPad may not have been used. Of course, that doesn’t rule out another brand of mobile device, but given the prevalence of Apple products and their portability, this is still a useful first hint.
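As a rough illustration of the check described above, the following Python sketch counts curly versus straight quotation marks in a piece of text. The function name quote_profile and the particular set of characters counted are assumptions made for this example, and the sample string is stitched together from fragments quoted in this post rather than taken from the full email. A count like this only hints at software behaviour; it cannot show whether the marks were changed before or after the email was sent.

```python
# Characters counted as "curly" and "straight" are an illustrative choice.
CURLY = "\u201c\u201d\u2018\u2019"   # left/right double and single curly quotes
STRAIGHT = "\"'"                      # straight double quote and apostrophe


def quote_profile(text: str) -> dict:
    """Return how many curly and straight quote characters appear in text."""
    curly = sum(text.count(ch) for ch in CURLY)
    straight = sum(text.count(ch) for ch in STRAIGHT)
    return {"curly": curly, "straight": straight}


# Fragments quoted in this post, written with escapes so the curly
# characters are unambiguous in the source code.
sample = (
    "a language you were given in \u201cArabic translation\u201d "
    "THEY DARED TO ENTER THE LION\u2019S DEN"
)
profile = quote_profile(sample)
print(profile)
```

A text showing only curly marks is consistent with software that autochanges punctuation, but it is equally consistent with a later stage (such as a newspaper’s publishing system) having made the change.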
Other interesting artefacts include the spelling errors and possible typographic autocorrections, which are once again indicative of a possible software provenance. For instance, on its default setting, Microsoft spell-check typically ignores words in uppercase, whereas Apple products typically don’t, and in the email we find the majority of the spelling and grammatical errors in the uppercase portions, such as SHEPPARD, WHERE (instead of WERE), and UNTILL. This appears to further support the notion that the software used was not spell-checking uppercase text. Because of this, in those uppercase sections, we may well be getting a more faithful version of the author’s own style.
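The comparison between uppercase and other text can be sketched in a few lines of Python. This is a toy illustration only: the function name unknown_by_case is hypothetical, the word list is a tiny hand-made stand-in for a real dictionary, and the sample is assembled from fragments quoted in this post. Note also that a word-list check of this kind catches SHEPPARD and UNTILL but would not catch WHERE for WERE, since WHERE is itself a valid word.

```python
import re


def unknown_by_case(text: str, known_words: set) -> dict:
    """Group tokens missing from known_words by whether they are all-caps."""
    unknown = {"upper": [], "other": []}
    for token in re.findall(r"[A-Za-z]+", text):
        # Single letters such as "I" or "A" are not treated as all-caps text.
        bucket = "upper" if token.isupper() and len(token) > 1 else "other"
        if token.lower() not in known_words:
            unknown[bucket].append(token)
    return unknown


# Toy word list (an assumption for this example, not a real dictionary).
known = {
    "we", "have", "also", "offered", "prisoner", "exchanges", "how", "long",
    "will", "the", "sheep", "follow", "blind", "not", "stop",
}

# Fragments quoted in this post, with the original spellings retained.
sample = (
    "We have also offered prisoner exchanges. "
    "HOW LONG WILL THE SHEEP FOLLOW THE BLIND SHEPPARD? "
    "WE WILL NOT STOP UNTILL"
)
result = unknown_by_case(sample, known)
print(result)
```

If unrecognised words cluster in the "upper" group, that pattern is consistent with software that skips uppercase text when spell-checking, though on so small a sample it remains a suggestion only.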
Another feature in the main text is the word Arial (presumably meant to be aerial) and this appears to have been not only (incorrectly) corrected but also given title-case to reflect the fact that Arial is a name. Again, this tentatively suggests software that autocorrects, rather than an author-based change.
American English versus British English versus N.E. Other English
One disappointment is that we don’t see features such as color versus colour, though, as above, even if we had, we would have to be aware that software can autocorrect from one variety of English to another. It does mean, however, that we have to be careful about ascribing a variety of English (e.g. American, Australian, British, Canadian, etc.) to the author.
Now that we have laid the groundwork for realising just how much messages can be affected by the method of production, we can start to tentatively look at the language itself to see what this can tell us about the author. There are at least three issues to be aware of here. Firstly, when shifting from speech to writing, we almost always increase our level of formality, and the complexity of our language. Speech is full of errors and false starts and pronunciations that can give us some very specific clues that might suggest aspects of an author’s identity. Writing doesn’t have as many, though it does contain some worthwhile markers. Online language can be especially rich for this. Secondly, writing can be a multi-author affair, so this email may have been a collaborative work by several people all trying to achieve the best effect. Thirdly, writing (particularly emails) can be extensively revised and redrafted before being sent, thereby allowing many more opportunities to affect the type of message, making it less like one that might come out if sent quickly, with little thought, or under pressure.
With all this in mind, we can’t really tackle questions like, “Where in the UK did this author come from?” (not least because that makes several big assumptions already). Probably the safest question we can address is, “Is this author (assuming there is only one) a native speaker of English?”
Throughout the email, we find words like transgressions, proxy, translation, detention, transactions, and so forth. Low-frequency lexis (words that don’t usually appear in our core, day-to-day vocabulary) is less likely to be used by non-native speakers due to lack of exposure to and need for it. Even a native speaker might not use some of these words, so this may suggest both a native English speaker, and possibly also one who is reasonably erudite. (Note that it is overly simplistic to conflate spelling errors with intelligence/education. A very intelligent person can be terrible at spelling, they may have dyslexia, or they may be careless, tired, or distracted, so it would be a dangerously false premise to use the typos to judge levels of education and intelligence. Additionally, the typos that are made in this email are those regularly found in the writing of native English speakers.)
The author uses a number of fairly complex grammatical constructions that include independent and dependent clauses as well as coordinated sentences, e.g. “We have also offered prisoner exchanges to free the Muslims currently in your detention like our sister Dr Afia Sidiqqi, however you proved very quickly to us that this is NOT what you are interested in.” There is also correct use of the possessive apostrophe (LION’S DEN) within a capitalised section (i.e. a section that some types of software will not correct).
Another interesting feature is this pair of sentences: “You and your citizens will pay the price of your bombings! The first of which being the blood of the American citizen, James Foley!” The second sentence is a non-standard construction: we might expect the verb “being” to be realised as “will be”, and from intuition alone, we might expect such a construction to occur in a dialect more associated with the working class. However, it would be necessary to do an exhaustive search of more data (e.g. the COLT corpus), and I would be equally unsurprised to hear a teacher or an MP use such a construction, particularly in speech, so again, without reference to wider data, we can only draw extremely tentative conclusions from this.
One of the most advanced ways that we use language is when we are being creative, particularly when we use intertextual references (references to other culturally known works), metaphors, and literary language. Because this is so difficult and requires substantial cultural knowledge to achieve successfully, we are less likely to find it occurring in the language of non-native speakers. In this email, we find several examples of metaphors, literary language, and creativity, e.g.:
- HOW LONG WILL THE SHEEP FOLLOW THE BLIND [SHEPHERD]?
- …the language of force, a language you were given in “Arabic translation”…
- …THEY DARED TO ENTER THE LION’S DEN AND [WERE] EATEN!
- …WE WILL NOT STOP [UNTIL] WE QUENCH OUR THIRST FOR YOUR BLOOD.
Taken together, the breadth of low-frequency vocabulary, the complexity of the grammar, and the creativity of the style strongly suggest an author with native-like competence in English.
By Dr Claire Hardaker (@DrClaireH)
Last updated: 14:00, 22 August 2014