Missing Melania: is the First Lady’s Twitter account being used by Trump?

As we speak (01st June 2018), interesting conspiracy theories are breaking across some news networks about Melania Trump. The First Lady has allegedly not been seen in public for twenty days, and according to White House sources, she is recovering from an operation. However, the increasing concern for her welfare was heightened by the latest (as of this moment) tweet from her FLOTUS account on Wednesday 30th May, which reads:

I see the media is working overtime speculating where I am & what I’m doing.  Rest assured, I’m here at the @WhiteHouse w my family, feeling great, & working hard on behalf of children & the American people! (Link)

The immediate reaction from some quarters was scepticism that she had written it at all. Some felt that it was authored by Trump, and others demanded to see her holding a current newspaper. So, is that tweet truly unusual for the First Lady? Or is this a case of mass-confirmation-bias, where many people who already hold Donald Trump in contempt have simply found another possible avenue of attack? I thought I’d have a look at it out of curiosity and see what I could see.

Caveats

I gave this about three hours of my time in total because that’s all I had. Additionally, I have never followed either Trump or the First Lady’s Twitter accounts, so I have no automatic expertise in their respective styles. However, through the media, I’ve been exposed to enough of Trump’s tweets over the past year and a half to give me a sense of his… oeuvre.

Research questions

  • Is the questioned tweet consistent, or inconsistent with other tweets supposedly by the First Lady?
  • Is the questioned tweet consistent, or inconsistent with other tweets supposedly by Trump?

Data

I used FireAnt’s “User tweet history” function to collect all the tweets, excluding retweets, by @FLOTUS and @MELANIATRUMP, which gave me a little corpus (hereafter, FLORPUS) of 1,336 tweets. Then I collected a recent sample of @realDonaldTrump’s tweets, which gave me another little corpus (hereafter, PORPUS) of 2,856 tweets. Want to follow along at home? You can download the various versions of the data, including FLORPUS and PORPUS, from the links below:

Description | Link | Size
This is the “User tweet history” JSON file for @FLOTUS, excluding “official” retweets | 00_flotus_raw.json | 1.1 MB
This is the “User tweet history” JSON file for @MELANIATRUMP, excluding “official” retweets | 00_MELANIATRUMP_raw.json | 3.4 MB
This is the “User tweet history” JSON file for @realDonaldTrump, excluding “official” retweets | 00_realDonaldTrump_raw.json | 9.1 MB
This is the FLORPUS TXT file. It contains only the texts of the tweets from the @FLOTUS and @MELANIATRUMP JSON files where the source is marked as iPhone. It contains all URLs, handles, hashtags, and so forth. | 01_florpus_raw_text.txt | 96.9 KB
This is the PORPUS TXT file. It contains only the texts of the tweets from the @realDonaldTrump JSON files where the source is marked as iPhone. It contains all URLs, handles, hashtags, and so forth. | 01_porpus_raw_text.txt | 409 KB

But before we get stuck straight into the analysis, we have at least two problems:

Problem #1: multiple authorship

We can’t be automatically sure that the tweets sent by these accounts were always sent by the name on the tin, so to speak. Both have aides and staff who regularly use the main accounts to undertake diplomatic and presidential shoulder-rubbing and social-wheel-greasing. To some extent, though, we can do a little bit of weeding. When staff send such content, they are likely not using the president or First Lady’s own device. They may well be using something else. For instance, here are the last five tweets in PORPUS sent via Media Studio, all from May:

To the @NavalAcademy Class of 2018, I say: We know you are up to the task. We know you will make us proud. We know that glory will be yours. Because you are WINNERS, you are WARRIORS, you are FIGHTERS, you are CHAMPIONS, and YOU will lead us to VICTORY! God Bless the U.S.A.!

I have decided to terminate the planned Summit in Singapore on June 12th. While many things can happen and a great opportunity lies ahead potentially, I believe that this is a tremendous setback for North Korea and indeed a setback for the world…

It was my great honor to host a roundtable re: MS-13 yesterday in Bethpage, New York. Democrats must abandon their resistance to border security so that we can SUPPORT law enforcement and SAVE innocent lives!

Today, it was my great honor to celebrate the #NationalDayOfPrayer at the @WhiteHouse, in the Rose Garden!

Our two great republics are linked together by the timeless bonds of history, culture, and destiny. We are people who cherish our values, protect our civilization, and recognize the image of God in every human soul.

Even though the second deals with a setback and the third pushes the opposing party to change its actions, overall there’s a general sense of positivity, diplomacy, and stateliness.

Meanwhile, here are the last five tweets sent from an iPhone (the primary sender of tweets for the @realDonaldTrump account):

Looking forward to seeing the employment numbers at 8:30 this morning.

Why aren’t they firing no talent Samantha Bee for the horrible language used on her low ratings show? A total double standard but that’s O.K., we are Winning, and will be doing so for a long time to come!

A.P. has just reported that the Russian Hoax Investigation has now cost our government over $17 million, and going up fast. No Collusion, except by the Democrats!

FAIR TRADE!

Will be giving a Full Pardon to Dinesh D’Souza today. He was treated very unfairly by our government!

Quite a bit more, er, informal. Note the common pattern of eliding elements such as the subject and various verbs in nearly every tweet (I am looking forward; It is a total double standard; There is No Collusion; I will be giving). Also the tone is much more personal, for want of a better description.

To tackle the issue, an imperfect solution is to only analyse tweets sent from the devices that seem most likely to be used by either POTUS or FLOTUS. So how does that look?

SOURCE | PORPUS TWEETS | FLORPUS TWEETS
Twitter for iPhone | 2,579 | 1,151
Media Studio | 139 | 26
Twitter Web Client | 44 | 121
Twitter Ads | 33 | 0
Twitter for iPad | 31 | 2
Twitter for Android | 30 | 0
Twitter Lite | 0 | 30
Facebook | 0 | 2
Twitter for Websites | 0 | 2
Safari on iOS | 0 | 1
iOS | 0 | 1

In other words, it looks like each uses an iPhone, and indeed, the controversial tweet was sent from this source, so only tweets identified as originating from iPhones are considered. That reduces PORPUS to 2,579 tweets and FLORPUS to 1,151.
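For anyone following along with the JSON files, the source tally and the iPhone-only filter can be sketched roughly as below. This is only a sketch under assumptions: I am guessing that each tweet record carries a "source" field (sometimes wrapped in an HTML anchor, as in Twitter's own tweet objects) and a "text" field, and the sample records and function names are hypothetical rather than anything FireAnt promises.

```python
import re
from collections import Counter

# Hypothetical sample records. The "text" and "source" field names are an
# assumption about the export format, not something confirmed by the data files.
tweets = [
    {"text": "FAIR TRADE!",
     "source": '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'},
    {"text": "Today, it was my great honor...",
     "source": '<a href="https://studio.twitter.com" rel="nofollow">Media Studio</a>'},
    {"text": "Looking forward to seeing the employment numbers.",
     "source": "Twitter for iPhone"},
]


def source_label(raw):
    """Strip any HTML anchor wrapper and return the bare client name."""
    return re.sub(r"<[^>]+>", "", raw or "").strip()


def tally_sources(records):
    """Count how many tweets came from each client."""
    return Counter(source_label(r.get("source")) for r in records)


def keep_source(records, wanted="Twitter for iPhone"):
    """Keep only the tweets sent from the wanted client."""
    return [r for r in records if source_label(r.get("source")) == wanted]


source_counts = tally_sources(tweets)
iphone_only = keep_source(tweets)
print(source_counts.most_common())
print(len(iphone_only), "tweets kept")
```

Run over the full files, the tally should line up with the source table above, provided the field names match.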

Note, however, that this still doesn’t “purify” our authorship. For simplicity, we are only considering here “simple” forms of authorship – Trump picks up his phone, Trump types, Trump sends. (Or the First Lady does.) However, I could toss my phone to you, dictate the message, and have you tweet it. My words, but your linguistic shortcuts. And Trump could have had the First Lady dictate a message to him. Alternatively she could have asked an aide to send it for her. Or an aide could have typed something up, which she has checked, edited, and okayed. Or someone else could have written and sent it with no input from her or Trump. The research questions above and analysis below constrain themselves to two possible authors only – Trump versus the First Lady – when there are plenty of other people who could have been partly or fully responsible.

And there’s another more serious problem…

Problem #2: the moment of interception

If we take the conspiracy theory to its grimmest extreme and the First Lady really is being held captive, or is even no longer with us, her account could have been taken over as much as twenty days ago. In fact, it could have even been before that. Seeing her in public – even seeing her on her iPhone – does not automatically mean she has access to her own accounts, after all. But she would have the opportunity to make a plea for help, so for simplicity we will assume that the date of her last public appearance is also the date of the last of the tweets we can assume to have been authored by her. That takes us back to roughly Tuesday 08th May, around the time that Melania launched her BeBest campaign.

For safety, then, I marked all FLORPUS tweets sent in this “grey zone” from the 09th to the 28th as also being of questionable authorship. Since I didn’t have time to analyse them all, I took the very convenient way out of simply ignoring them. (This is a blog. I can do that. I want to get to my chocolate bar sometime today.) Anyway, that window captured seven tweets, including the one that has sparked the controversy.
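Setting the grey-zone tweets aside is a simple date filter. The sketch below assumes each record has a "created_at" timestamp in the style Twitter's API uses (for example "Wed May 30 19:27:00 +0000 2018"), which may not be how the export actually stores it. The window boundaries are passed in as arguments, and the sample records and the dates in the usage lines are illustrative only. Dates are compared in UTC, so a tweet sent late in the evening, US time, could land on the following day.

```python
from datetime import date, datetime

# Assumed timestamp layout (Twitter API style); adjust if the export differs.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def tweet_date(record):
    """Return the calendar date of a tweet, in the timestamp's own offset (UTC)."""
    return datetime.strptime(record["created_at"], CREATED_AT_FORMAT).date()


def split_grey_zone(records, start, end):
    """Split records into (trusted, questioned) by an inclusive date window."""
    trusted, questioned = [], []
    for record in records:
        if start <= tweet_date(record) <= end:
            questioned.append(record)
        else:
            trusted.append(record)
    return trusted, questioned


# Hypothetical records, purely to show the call.
sample = [
    {"created_at": "Mon May 07 19:05:00 +0000 2018", "text": "hypothetical earlier tweet"},
    {"created_at": "Wed May 30 19:27:00 +0000 2018", "text": "hypothetical later tweet"},
]
trusted, questioned = split_grey_zone(sample, date(2018, 5, 9), date(2018, 5, 30))
print(len(trusted), "kept,", len(questioned), "set aside")
```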

Overall, then, the final FLORPUS stands at 1,144 tweets (~12,000 words), and the final PORPUS stands at 2,579 (~67,000 words). Note that word-counts are somewhat meaningless here. Do I count hashtags? URLs? Handles? Because chocolate awaits me, for speed, I’ve left them all in for both and gone instead with pointing out how useless word-counts are.

Aaaaanyway, some analyses.

Analysis

One way to begin is to return to the disputed tweet:

I see the media is working overtime speculating where I am & what I’m doing.  Rest assured, I’m here at the @WhiteHouse w my family, feeling great, & working hard on behalf of children & the American people! (Link)

Because people weren’t especially specific about their suspicions, and because I don’t know Melania’s general style well enough to draw on my own repertoire of features that might be unique to her, all I can do is look at this tweet and identify potential features that might distinguish it either from the others she’s sent, or from Trump’s. In the end, I picked three things that could be relatively easily checked:

  • Variation in the use of with
  • Ten different kinds of punctuation
  • General lexicon (vocabulary)

With

ITEM | PORPUS TWEETS – 2,579 | FLORPUS TWEETS – 1,144 | DISPUTED TWEET
with | 604 (0.234 per tweet) | 71 (0.062 per tweet) | 0
w (standalone) | 0 (0 per tweet) | 18 (0.015 per tweet) | 1
w/ | 37 (0.014 per tweet) | 66 (0.057 per tweet) | 0

What does this tell us? Well, Trump prefers to write with, in full, and when he doesn’t, he uses the w/ variant. In the tweets I collected, he never uses the w variant that stands alone between whitespaces. Meanwhile the First Lady has at least three choices in her repertoire. Her favourite is the full version, her second favourite is the w/ and her least used is the straight w. But she uses it, and that’s what occurs in the disputed tweet.
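A rough way to count the three variants is with regular expressions. This is a sketch of one possible approach rather than the exact search I ran: the patterns are my assumption about what should count as each variant (case-insensitive, with the bare w required to stand alone between whitespace), and the function names are made up for illustration.

```python
import re

# Assumed definitions of the three variants.
# Note the "w/" pattern would also catch "w/o" (without) if it occurred.
WITH_PATTERNS = {
    "with": re.compile(r"\bwith\b", re.IGNORECASE),
    "w": re.compile(r"(?<!\S)w(?!\S)", re.IGNORECASE),  # bare w between whitespace
    "w/": re.compile(r"(?<!\S)w/", re.IGNORECASE),
}


def count_with_variants(tweets):
    """Total occurrences of each variant across a list of tweet texts."""
    totals = {name: 0 for name in WITH_PATTERNS}
    for text in tweets:
        for name, pattern in WITH_PATTERNS.items():
            totals[name] += len(pattern.findall(text))
    return totals


def per_tweet(totals, n_tweets):
    """Normalise raw counts to a per-tweet rate, rounded to three places."""
    return {name: round(count / n_tweets, 3) for name, count in totals.items()}


disputed = (
    "I see the media is working overtime speculating where I am & what I’m doing.  "
    "Rest assured, I’m here at the @WhiteHouse w my family, feeling great, & working hard "
    "on behalf of children & the American people!"
)
disputed_counts = count_with_variants([disputed])
print(disputed_counts)  # {'with': 0, 'w': 1, 'w/': 0}
```

Feeding the same function the lines of the FLORPUS and PORPUS text files, then passing the totals and tweet counts to per_tweet, should give figures in the region of those in the table, depending on how closely the patterns match my original searches.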

Punctuation

ITEM | PORPUS TWEETS – 2,579 | FLORPUS TWEETS – 1,144 | DISPUTED TWEET
Ampersand | 448 (0.2 per tweet) | 213 (0.2 per tweet) | 3
Comma | 3,075 (1.2 per tweet) | 110 (0.1 per tweet) | 3
Full-stop | 6,038 (2.3 per tweet) | 1,543 (1.3 per tweet) | 1
Full-stop-double-space | 27 (0.01 per tweet) | 27 (0.02 per tweet) | 1
Double-space | 128 (0.05 per tweet) | 109 (0.1 per tweet) | 1
Exclamation mark | 1,896 (0.7 per tweet) | 609 (0.5 per tweet) | 1
Question mark | 200 (0.08 per tweet) | 52 (0.04 per tweet) | 0
Apostrophe | 623 (0.24 per tweet) | 106 (0.09 per tweet) | 2
Hashtag | 385 (0.15 per tweet) | 979 (0.85 per tweet) | 0
At-snail | 583 (0.2 per tweet) | 982 (0.9 per tweet) | 1
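Counts like the ones in this table can be approximated with plain substring counting. The sketch below is a guess at a workable method, not a record of the tool settings I used: the mapping of labels to characters is my assumption (both straight and curly apostrophes count as apostrophes, for instance), and it expects decoded text, so an ampersand stored as "&amp;" in raw JSON would need unescaping first.

```python
# Assumed mapping from table label to the character sequences counted for it.
PUNCTUATION = {
    "ampersand": ("&",),
    "comma": (",",),
    "full-stop": (".",),
    "full-stop-double-space": (".  ",),
    "double-space": ("  ",),
    "exclamation mark": ("!",),
    "question mark": ("?",),
    "apostrophe": ("'", "’"),  # straight and curly
    "hashtag": ("#",),
    "at-snail": ("@",),
}


def count_marks(text):
    """Count non-overlapping occurrences of each mark in one tweet."""
    return {
        name: sum(text.count(mark) for mark in marks)
        for name, marks in PUNCTUATION.items()
    }


def per_tweet_rates(tweets):
    """Average count per tweet for each mark (expects at least one tweet)."""
    totals = {name: 0 for name in PUNCTUATION}
    for text in tweets:
        for name, count in count_marks(text).items():
            totals[name] += count
    return {name: total / len(tweets) for name, total in totals.items()}


disputed = (
    "I see the media is working overtime speculating where I am & what I’m doing.  "
    "Rest assured, I’m here at the @WhiteHouse w my family, feeling great, & working hard "
    "on behalf of children & the American people!"
)
disputed_marks = count_marks(disputed)
print(disputed_marks)
```

On the disputed tweet this gives the same numbers as the final column of the table. Whether it matches the corpus columns exactly depends on details such as how URLs and encoded characters were handled.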

There are so many problems with this kind of analysis it’s hard to know where to start. Primarily the issues are the shortness of the disputed data, opportunities for occurrence, normalisation of the frequencies (is per tweet or per word better?), and the wild variability of punctuation use based on context. These issues would mostly smooth out if our disputed dataset were a thousand times bigger, but it isn’t, so, like the This Is Fine dog, drinking coffee as the house burns down, let’s pretend that IT’S ALL FIIIIINE.

Okay good.

If we treated this as a straight shoot-out (which is ridiculous for so many reasons), only considering features that occur and ignoring any that were not used in the disputed tweet, then it’s three to PORPUS, and four to FLORPUS. In the interests of science, I have to say yet again that this is mostly nonsensical, but even if it weren’t, the scores are too close to support any firm conclusions. No matter how determined you are to see guilt in here somewhere, this shows us about what you’d expect – not very much.
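The scoring rule is not spelled out above, so here is one reading of it, offered as an assumption: for each feature that occurs in the disputed tweet, award a point to whichever corpus has the per-tweet rate closer to the disputed tweet's count. Using the rounded rates printed in the punctuation table, that rule happens to give the same three-to-four split, with the ampersand as a dead heat. The names below are hypothetical.

```python
# (PORPUS per-tweet rate, FLORPUS per-tweet rate, count in disputed tweet),
# copied from the rounded figures in the punctuation table.
TABLE = {
    "ampersand": (0.2, 0.2, 3),
    "comma": (1.2, 0.1, 3),
    "full-stop": (2.3, 1.3, 1),
    "full-stop-double-space": (0.01, 0.02, 1),
    "double-space": (0.05, 0.1, 1),
    "exclamation mark": (0.7, 0.5, 1),
    "question mark": (0.08, 0.04, 0),
    "apostrophe": (0.24, 0.09, 2),
    "hashtag": (0.15, 0.85, 0),
    "at-snail": (0.2, 0.9, 1),
}


def shoot_out(table):
    """Award each occurring feature to the corpus with the closer per-tweet rate."""
    score = {"PORPUS": 0, "FLORPUS": 0, "tie": 0}
    for porpus_rate, florpus_rate, observed in table.values():
        if observed == 0:
            continue  # features absent from the disputed tweet are ignored
        porpus_gap = abs(observed - porpus_rate)
        florpus_gap = abs(observed - florpus_rate)
        if porpus_gap < florpus_gap:
            score["PORPUS"] += 1
        elif florpus_gap < porpus_gap:
            score["FLORPUS"] += 1
        else:
            score["tie"] += 1
    return score


score = shoot_out(TABLE)
print(score)  # {'PORPUS': 3, 'FLORPUS': 4, 'tie': 1}
```

Comparing a single tweet's raw count against an average rate is exactly as crude as it sounds, which is rather the point of the caveats above.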

And that is actually rather interesting in itself.

For instance, we might assume that Melania and Donald text each other occasionally. (I know, I know, just go with it.) They’ve been married a long time and they have been exposed to each other’s dialectal variations and preferences and styles for a long time, so it’s not remarkable that they may have started to adopt each other’s linguistic mannerisms. Despite this, though, our little corpora do show one intriguing difference: Trump uses roughly twelve times as many commas per tweet as the First Lady (1.2 against 0.1). If we were to actually get serious about this, this would be one feature we could pay more attention to, because there are enough occurrences and opportunities for occurrence for it to be halfway reliable.

Yet again, though, by itself, this doesn’t tell us much of anything.

Lexicon

After reading through responses to Melania’s tweet, I noted that some users had focussed on the choice of words, so I went back and did a very crude shoot-out at the lexical level, rather than at the level of graphology (spelling variation) or punctuation. A basic comparison is presented below:

PHRASE IN DISPUTED TWEET | PORPUS TWEETS – 2,579 | FLORPUS TWEETS – 1,144
the media | 19 (0.7 per 100 tweets) | 0 (0 per 100 tweets)
working overtime | 7 (0.3 per 100 tweets) | 0 (0 per 100 tweets)
speculating | 1 (0.04 per 100 tweets) | 0 (0 per 100 tweets)
rest assured | 0 (0 per 100 tweets) | 0 (0 per 100 tweets)
my family | 0 (0 per 100 tweets) | 5 (0.4 per 100 tweets)
feeling great | 0 (0 per 100 tweets) | 0 (0 per 100 tweets)
working hard | 26 (1 per 100 tweets) | 1 (0.09 per 100 tweets)
on behalf of | 15 (0.6 per 100 tweets) | 4 (0.4 per 100 tweets)
children | 11 (0.4 per 100 tweets) | 41 (3.6 per 100 tweets)
American people | 27 (1.04 per 100 tweets) | 0 (0 per 100 tweets)

What does this tell us? Well, it certainly leans toward a Trump-esque vocabulary. It addresses the media, a topic on which Trump is quite fixated. And it uses three phrases – working overtime, working hard, and American people – that do not appear or only appear once in the First Lady’s tweets, but, by contrast, appear relatively often in Trump’s. I want to stress the word relatively, for reasons I will go into below.
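The phrase counts and the “per 100 tweets” normalisation can be sketched as below. Again, this is a minimal illustration under assumptions: matching is case-insensitive and on word boundaries, which may not be exactly how my original searches behaved, and the toy corpus is invented purely to show the call.

```python
import re

# Phrases taken from the disputed tweet, as listed in the table.
PHRASES = [
    "the media", "working overtime", "speculating", "rest assured",
    "my family", "feeling great", "working hard", "on behalf of",
    "children", "American people",
]


def phrase_pattern(phrase):
    """Whole-phrase, case-insensitive matcher (an assumed matching rule)."""
    return re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)


def per_100_tweets(tweets, phrases=PHRASES):
    """Return {phrase: (raw hits, hits per 100 tweets)} for a list of tweet texts."""
    n = len(tweets)
    results = {}
    for phrase in phrases:
        pattern = phrase_pattern(phrase)
        hits = sum(len(pattern.findall(text)) for text in tweets)
        rate = round(100 * hits / n, 2) if n else 0.0
        results[phrase] = (hits, rate)
    return results


# Invented toy corpus, not real tweets.
toy_corpus = [
    "Working hard for the American people!",
    "The children are our future.",
    "working hard, working overtime",
]
toy_results = per_100_tweets(toy_corpus)
print(toy_results["working hard"])  # (2, 66.67)
```

The arithmetic is the same as in the table: 19 hits across 2,579 tweets, for example, works out at roughly 0.7 per 100 tweets.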

Is this a smoking gun?

As far as I am concerned, no. It gives more pause for thought than the with variant and the punctuation choices did, yes, but still the picture is not clear. For a start, because of an issue known as Zipf’s law (a few words turn up constantly, while most words and phrases are rare), the numbers involved are tiny. I’ve had to use a “per 100 tweets” metric because “per tweet” figures were simply yielding too many zeros for the numbers to be intuitively comparable any more. Whatever the metric, as a collection of features, it is mildly suggestive, but the occurrence of all of these words in the disputed tweet is within the reach of chance.

That leads to the second point: it is unsurprising that, for instance, the media occurs in the disputed tweet. The attention is coming primarily from this quarter, after all. The information we need here – would the First Lady address the media like this in a scenario like this? – lies beyond the realms of linguistics.

Thirdly, mentions of children and family seem to be more of a concern for the First Lady in these datasets than for Trump, and they appear in the disputed tweet. This said, however, were someone attempting to imitate the First Lady, then tweeting about those very topics would be a fairly obvious and easy way to emulate her. Because this level of language is much more consciously available to us than, say, punctuation choices, it is the more easily manipulated, and that complicates the picture considerably.

Conclusion

It’s a messy one.

The tweet uses a with-variant that, in my data at least, is exclusive to the First Lady. It also uses lots of commas, which, in my data, are generally more of a Trump choice. And some of the lexicon is First Lady-esque, whilst other bits are Trump-esque. If these features all “weighed” the same, we could call it a straight draw, and for the purposes of a mere blog post, that’s what we will have to do. However, this exemplifies exactly why forensic linguistics is really hard. In so many cases, just like this one, we simply don’t have enough disputed data. If the First Lady doesn’t appear in public for another hundred days and her account continues to issue endless tweets, then we’d have a fighting chance of reaching some kind of half-sensible conclusion, but as it stands, we’re no closer to a meaningful answer from this analysis than we were at the start. So, back to the two research questions:

  • Is the questioned tweet consistent, or inconsistent with other tweets supposedly by the First Lady?
    • Consistent: with variant that the First Lady would use
    • Consistent: words/phrases typical of the First Lady (e.g. children, my family)
    • Inconsistent: more commas
    • Inconsistent: words/phrases atypical of the First Lady (e.g. working overtime, working hard, American people)
  • Is the questioned tweet consistent, or inconsistent with other tweets supposedly by Trump?
    • Consistent: more commas
    • Consistent: words/phrases typical of Trump (e.g. working overtime, working hard, American people)
    • Inconsistent: with variant that Trump does not appear to use based on this dataset
    • Inconsistent: words/phrases atypical of Trump (e.g. children, my family)

Overall then, the answer to both questions is: yes, and no.

All playing around aside, it’s crucial to note that the undercurrent to this whole story is that a person could potentially be in danger. This was a toy analysis for the purposes of testing those theories that the First Lady did not write that tweet, but there is a serious side to it too. Unfortunately, the conclusion is inconclusive, and we remain in the dark about the author of the tweet. My final thought is that I hope Melania is safe and well, and that this was all one of those over-excited media frenzies that start to kick up as the temperatures soar and the news dries up.

Postscript

I undertook a new investigation of older tweets supposedly by the First Lady and found more interesting results. You can read that here.

[Edited 11:28, 02 June 2018: added new section about lexicon and updated rest accordingly.]

[Edited 13:01, 04 June 2018: added data download links and postscript.]
