Intelligence and Security Committee: the Woolwich Report, Facebook, and monitoring online interaction

Since Wednesday 25th November, the murder of Fusilier Lee Rigby has returned to the headlines after the government’s Intelligence and Security Committee published its report on the intelligence relating to his murder. The report notes that one of his murderers, Michael Adebowale, had sent Facebook messages in which he,

expressed his desire to murder a soldier – in the most graphic and emotive manner – because of UK military action in Iraq and Afghanistan. (p127)

The report also states that,

had MI5 had access to this exchange, their investigation into Adebowale would have become a top priority. It is difficult to speculate on the outcome but there is a significant possibility that MI5 would then have been able to prevent the attack. (p7)

Whilst David Cameron was making a statement on this topic on that same day, Sir Tony Baldry, Conservative MP for Banbury, raised the following question:

Does not the Intelligence and Security Committee report indicate that social media companies need to do more to put in place systems to spot terrorist groups that are using their services to plan attacks?

The Prime Minister responded as follows (bold emphasis mine):

My right honourable Friend is right. The companies have to do two things. They have to have systems in place to spot key words, key phrases and other key things that could be part of terrorist plotting. They also need to have a system in place, in our view and as my right honourable and learned Friend the Member for Kensington (Sir Malcolm Rifkind) said, to report that to the authorities. This is linked to the point that I made in response to my honourable Friend the Member for South Dorset (Richard Drax). Because we have such a robust system of safeguards in this country, I do not think that it should be a problem for any of these companies to do just that.

From the perspectives of both a corpus linguist and a forensic linguist, there are quite a few points to address here, so I will keep them as brief as possible:

Key words, key phrases, and “other key things”

There are issues here with both the dataset and the features. Let’s start with the data. Within corpus linguistics, we typically deal with millions or billions of words of data, and once collected, we work on this at leisure. However, social networks are producing literally billions of words per hour in a non-stop, live stream. Out of that stream, the majority of interactions are likely to be nothing of concern, and finding the minuscule percentage of instances that need further investigation is a monumental task. In his answer to Baldry, Cameron suggests some features (key words, key phrases, and other key things) that might get us started, so let’s consider those.

One thing that every corpus linguist discovers very early on, especially with online interaction, is that you have to be very specific in your search, because the computer is not intuitive. It will give you back nothing more and nothing less than exactly what you asked for. Take the mundane example of the word and – to capture as many examples of this as possible, I would minimally need to search for all the variations that exist (and, &, +, n, nd, etc.) plus any mistaken variants (adn, @). But what if someone throws in a bit of French and uses et? Or the word in question has been replaced with a codeword or slang?
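By way of illustration, here is a minimal Python sketch of that problem. The pattern and the messages are invented for the example, and the pattern is nowhere near exhaustive:

```python
import re

# A handful of variants of "and": standard, symbolic, shortened, mistyped.
# Even this mundane word needs several alternatives, and the list is still
# far from complete.
AND_VARIANTS = re.compile(r"\b(?:and|adn|nd|n)\b|[&+]", re.IGNORECASE)

messages = [
    "fish & chips",        # symbol variant
    "me n you",            # shortened variant
    "salt adn vinegar",    # mistyped variant
    "je t'aime et toi?",   # French "et" slips straight through
]

for msg in messages:
    hit = AND_VARIANTS.search(msg)
    print(f"{msg!r:25} -> {'match' if hit else 'NO MATCH'}")
```

The French message returns no match at all, and a codeword would do exactly the same – the search only ever finds what we already knew to look for.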

Another issue is that not all words have one clear-cut meaning. For example:

  • I’m going to bomb your house vs He’s the bomb vs The share prices will bomb
  • Shoot him vs I’ll shoot out for some bread vs Oh shoot!
  • Let’s murder someone vs These shoes are murder vs I could murder a drink!

As the above shows, words are not single, discrete units. Rather, they get (and give) meaning based on their context of use, but to analyse that context on a large scale requires automated tagging for metadata such as part-of-speech and semantic domain. This leads to a raft of problems.
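Before getting to those problems, here is a quick sketch of what such tagging looks like in practice, using NLTK’s off-the-shelf English part-of-speech tagger (this assumes NLTK and its standard English models are installed; any comparable tagger would do):

```python
import nltk

# One-off downloads of the tokeniser and tagger models (the exact resource
# names may vary slightly between NLTK versions).
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

sentences = [
    "I'm going to bomb your house",   # threat: bomb typically tagged as a verb
    "He's the bomb",                  # compliment: bomb typically tagged as a noun
    "The share prices will bomb",     # finance: bomb typically tagged as a verb again
]

for sentence in sentences:
    print(nltk.pos_tag(nltk.word_tokenize(sentence)))
```

Note that even a perfectly correct part-of-speech tag cannot separate the threat from the finance slang – both uses of bomb surface as verbs – which is why semantic tagging, and ultimately human judgement, enter the picture.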

Firstly, taggers typically deal with one language only – usually English – so a mixture of languages would be problematic. This is arguably fixable, since language detection is reasonably straightforward, but there are not, as yet, part-of-speech taggers for all languages, and semantic taggers exist for fewer languages still.
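To illustrate how the detection step might look, here is a minimal sketch using the langdetect library (my choice of library here is purely illustrative – any comparable language-identification tool would do):

```python
from langdetect import DetectorFactory, detect  # assumes: pip install langdetect

DetectorFactory.seed = 0  # langdetect is probabilistic; fix the seed for repeatable output

messages = [
    "I'll meet you at the usual place",   # English
    "On se retrouve au même endroit",     # French
    "lol ok m8",                          # short, non-standard English
]

for msg in messages:
    try:
        print(f"{msg!r:40} -> {detect(msg)}")
    except Exception:  # langdetect raises when it cannot classify the text
        print(f"{msg!r:40} -> detection failed")
```

Short, non-standard messages like the last one are exactly where detectors become unreliable, which brings us to the next problem.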

Secondly, for best accuracy, taggers (and language detectors) rely on standard spelling. There is software that can “fix” non-standard spelling, but it can’t catch everything all the time.
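As a toy illustration of what such normalisation involves (the mapping below is invented for the example; real tools such as VARD are far more sophisticated, and even they cannot catch everything all the time):

```python
# A toy spelling normaliser: map known non-standard forms to standard ones
# before tagging. Real systems use far larger resources plus statistical
# guesses for unseen forms.
NORMALISE = {
    "adn": "and", "nd": "and", "n": "and",
    "m8": "mate", "gr8": "great", "u": "you",
}

def normalise(text: str) -> str:
    return " ".join(NORMALISE.get(token.lower(), token) for token in text.split())

print(normalise("c u m8 bring bread n milk"))
# -> "c you mate bring bread and milk" ("c" for "see" slips through unnoticed)
```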

Thirdly, correctly spelt codewords will still fall through the cracks. If the codeword for gun is “sausage”, that’s going to be tagged as a noun relating to the semantic domain of food.

And fourthly, from a logical perspective, there is little point in searching for words like terrorism, since they will be used far more often by people who are not terrorists. Never mind the fact that, ideologically, a terrorist is unlikely to think of themselves as such.

Key phrases are not immune to the above problems either, so when we add together issues of spelling variation, multiple meanings, codewords, slang, and other languages, simply trying to find online behaviour that requires further investigation using searches or word blacklists is likely to become prohibitively demanding, both for humans and machines.

Important note: if working with a reduced population of people who are already suspects, a lot of the above becomes much easier, but my feeling is that Cameron is looking for predictive abilities, i.e. the identification of new, emerging, and as-yet-unknown risks, not just those already on watchlists.

Back to the main argument: Cameron does also say “other key things”, and from this we may have some hope, by utilising other methods alongside – or, more accurately, before – the analysis of key words and key phrases. For instance, we know that like-minded people tend to form networks. It’s one step in the radicalisation process, after all. Individuals also tend to accommodate towards those they admire (i.e. speak more like them) and away from those they dislike (e.g. by adopting exclusionary slang, insults, codewords, etc.).

So with the identification of one or two risky individuals, we can branch out to analyse the language of those who follow or interact with them in a supportive manner. Then, instead of starting with a random list of “suspicious” words or phrases and searching for those, we find out what words and phrases these networks, and the highest-risk individuals amongst them, use unusually frequently compared to non-risky networks that are otherwise similar in location, age, background, and so forth.

In this way we don’t have to know our key words or phrases in advance. The data tells us. Once we have that list, we can then start to search for other individuals who use those terms in similar ways to the high-risk users, and see whether our method is taking us in the right direction. (This is, incidentally, very similar to the kind of research I’m doing on the DOOM project right now.)
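For the curious, “unusually frequently” can be made precise with a keyness statistic such as Dunning’s log-likelihood, a standard measure in corpus linguistics. The sketch below uses invented word counts purely for illustration – note how the hypothetical codeword sausage from earlier tops the list:

```python
import math

def log_likelihood(freq_t: int, freq_r: int, total_t: int, total_r: int) -> float:
    """Dunning's log-likelihood keyness for one word: target corpus (t)
    vs reference corpus (r), with total_* as corpus sizes in tokens."""
    expected_t = total_t * (freq_t + freq_r) / (total_t + total_r)
    expected_r = total_r * (freq_t + freq_r) / (total_t + total_r)
    ll = 0.0
    if freq_t:
        ll += freq_t * math.log(freq_t / expected_t)
    if freq_r:
        ll += freq_r * math.log(freq_r / expected_r)
    return 2 * ll

# Invented counts: each word's frequency in the network of interest
# (10,000 tokens overall) vs a comparable baseline network (50,000 tokens).
counts = {
    "sausage": (42, 3),
    "bread":   (10, 55),
    "meet":    (30, 28),
}

for word, (freq_t, freq_r) in counts.items():
    print(f"{word:10} LL = {log_likelihood(freq_t, freq_r, 10_000, 50_000):6.1f}")
```

The higher the score, the more distinctive the word is of the target network: sausage scores around 130 here, while bread barely registers.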

Unfortunately, such analytical tools can tell us very little about intention. A post reading “I’m going to kill you” should not be automatically referred to the police, since in context it may well be nothing more than a joke. However, it is possible to analyse language for emotion, and some researchers have used various types of linguistic analysis software to study personality disorders such as psychopathy. Here, though, the analysis shifts from a primarily automated one that a non-specialist could (with some training) carry out, to a primarily manual one requiring a high level of expertise.
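To see why counting alone falls short, consider a toy LIWC-style emotion scorer (the categories and word lists below are invented for illustration):

```python
# A toy LIWC-style scorer: count hits against hand-built emotion lexicons.
# Real tools use large, validated dictionaries, but the underlying problem
# is the same: counts describe the words used, not the intent behind them.
EMOTION_LEXICON = {
    "violence": {"kill", "murder", "destroy", "bomb"},
    "humour":   {"lol", "haha", "joking", "kidding"},
}

def emotion_profile(text: str) -> dict:
    tokens = [token.strip(".,!?").lower() for token in text.split()]
    return {
        category: sum(token in words for token in tokens)
        for category, words in EMOTION_LEXICON.items()
    }

print(emotion_profile("I'm going to kill you, haha, only kidding"))
# -> {'violence': 1, 'humour': 2} – and no amount of counting settles
#    whether this is a joke or a threat.
```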

And there are other concerns. However good this type of analysis may or may not sound, it is built on several assumptions, e.g. that…

  1. …we’re happy for commercial enterprises like Facebook to analyse our private messages for more than just advertising (most of us don’t even want that!), e.g. for criminal behaviour and intention. (Do they stop at terrorism or hand over information about the guy who bragged about theft too? Can they use that information for other purposes?)
  2. …we’re okay with a potential shift from investigative bodies deciding what criminal activity needs looking at, to social networks making that call
  3. …we’d prefer commercial entities, who may be based overseas, to monitor our behaviour, rather than our own governments…

Monitoring online interaction

Had the data in question been Twitter or a similar, largely public-facing platform, then this debate would perhaps have taken on a different hue, but the report seems to identify Facebook’s Chat function as the medium through which Adebowale’s most troubling communications were sent. In other words, he used a private, typically one-to-one feature that he presumably thought was accessible to no one but those invited into the Chat.

In recent months, the US and UK governments, including the NSA and GCHQ, have come under enormous fire for projects such as PRISM, which undertook mass surveillance of ordinary citizens in the name of identifying new and emerging threats. In turn, groups advocating for privacy have considered this a gross violation of human rights. However, if it is true that, with access to Adebowale’s private Facebook Chat, there was a significant chance that MI5 could have prevented Fusilier Rigby’s death, then this leaves the most difficult issue of all: how do we balance the protection of individual privacy with the protection of human life?

By Dr Claire Hardaker (@DrClaireH)

Last updated: 23:45, 27 November 2014