A Mathematical Analysis of Trump's Tweets

An interesting hypothesis has surfaced: every non-hyperbolic tweet from Donald Trump comes from an iPhone (his staff), and every hyperbolic tweet comes from an Android (Trump himself).

For example, when Trump wishes the Olympic team good luck, it’s from an iPhone. When he’s insulting Iraq, it’s from an Android. Is this a legitimate pattern? Let’s do some math following David Robinson’s approach.

First, we retrieve the necessary data and clean it up:

library(dplyr)
library(purrr)
library(twitteR)

# Credentials are read from global options set for an authenticated Twitter app
setup_twitter_oauth(getOption("twitter_consumer_key"),
                    getOption("twitter_consumer_secret"),
                    getOption("twitter_access_token"),
                    getOption("twitter_access_token_secret"))

# Request up to 3200 tweets; the API may return fewer
trump_tweets <- userTimeline("realDonaldTrump", n = 3200)
trump_tweets_df <- as_tibble(map_df(trump_tweets, as.data.frame))

library(tidyr)

# Pull the device name out of the statusSource HTML and keep only the two phones
tweets <- trump_tweets_df %>%
  select(id, statusSource, text, created) %>%
  extract(statusSource, "source", "Twitter for (.*?)<") %>%
  filter(source %in% c("iPhone", "Android"))

Overall, this includes 628 tweets from iPhone, and 762 tweets from Android.
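To make the cleaning step concrete, here is a minimal sketch of what the `extract()` call does. The three `statusSource` strings are hypothetical examples shaped like the HTML snippets Twitter attaches to each tweet; they are not taken from the real data.

```r
library(dplyr)
library(tidyr)

# Hypothetical statusSource values (illustration only)
toy <- tibble(
  id = c("1", "2", "3"),
  statusSource = c(
    '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>',
    '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
    '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>'
  )
)

parsed <- toy %>%
  # The capture group keeps whatever sits between "Twitter for " and the next "<"
  tidyr::extract(statusSource, "source", "Twitter for (.*?)<") %>%
  # Rows that did not match the pattern become NA and are dropped here
  filter(source %in% c("iPhone", "Android"))

count(parsed, source)
```

The web client row does not match the pattern, so it ends up as `NA` and is filtered out along with any other non-phone source.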

First, let’s see if there are any patterns in time of day.

library(ggplot2)
library(lubridate)
library(scales)

tweets %>%
  count(source, hour = hour(with_tz(created, "EST"))) %>%
  # Compute each hour's share within its own source, not across all tweets
  group_by(source) %>%
  mutate(percent = n / sum(n)) %>%
  ungroup() %>%
  ggplot(aes(hour, percent, color = source)) +
  geom_line() +
  scale_y_continuous(labels = percent_format()) +
  labs(x = "Hour of day (EST)",
       y = "% of tweets",
       color = "")

Trump on the Android does a lot more tweeting in the morning, while the campaign posts from the iPhone more in the afternoon and early evening.
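One detail worth checking in the time-of-day step is the time zone conversion. The sketch below uses a single hypothetical timestamp and assumes `created` is stored in UTC. It shows that `"EST"` is a fixed five-hour offset, whereas `"America/New_York"` follows daylight saving time.

```r
library(lubridate)

# A hypothetical tweet timestamp, assumed to be stored in UTC
created <- ymd_hms("2016-08-01 12:30:00", tz = "UTC")

# "EST" is always UTC-5, even in summer
hour_est <- hour(with_tz(created, "EST"))

# "America/New_York" switches to daylight time (UTC-4) in August
hour_ny <- hour(with_tz(created, "America/New_York"))

c(EST = hour_est, New_York = hour_ny)
```

If the goal is the hour on a New York wall clock, the second form may be the better fit for tweets sent during the summer. The overall morning versus afternoon pattern should look the same either way, shifted by at most one hour.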

Let’s do some sentiment analysis to see if we can back this up: are Trump’s tweets more negative on Android?
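The sentiment code that follows relies on two objects, `tweet_words` and `nrc`, whose construction is not shown in this post. The sketch below is one plausible way to build objects of that shape, not necessarily the method used here: it splits tweets into one word per row with `tidytext::unnest_tokens()` and joins them to a word-to-sentiment table. The toy tweets and the three-word lexicon are hypothetical stand-ins; for real use, the NRC lexicon can be loaded with `tidytext::get_sentiments("nrc")`, which needs the `textdata` package.

```r
library(dplyr)
library(stringr)
library(tidytext)

# Hypothetical stand-in for the `tweets` data frame (illustration only)
toy_tweets <- tibble(
  id = c("1", "2"),
  source = c("Android", "iPhone"),
  text = c("The dishonest media is a total disaster!",
           "So proud of our volunteers in Ohio! https://t.co/abc123")
)

# Tiny hand-made lexicon standing in for the NRC word-emotion lexicon
toy_nrc <- tibble(
  word = c("dishonest", "disaster", "proud"),
  sentiment = c("negative", "negative", "positive")
)

toy_tweet_words <- toy_tweets %>%
  # Skip tweets that begin with a quotation mark
  filter(!str_detect(text, '^"')) %>%
  # Strip links and HTML-escaped ampersands before tokenizing
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  # One lowercase word per row
  unnest_tokens(word, text) %>%
  # Drop common stop words and tokens without letters
  filter(!word %in% stop_words$word, str_detect(word, "[a-z]"))

toy_counts <- toy_tweet_words %>%
  inner_join(toy_nrc, by = "word") %>%
  count(source, sentiment)

toy_counts
```

Each row of `toy_tweet_words` keeps the tweet `id` and `source`, which is what lets the later steps count sentiment words per device.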

# Total words per source, attached to every tweet id
sources <- tweet_words %>%
  group_by(source) %>%
  mutate(total_words = n()) %>%
  ungroup() %>%
  distinct(id, source, total_words)

# Count words per sentiment in each tweet, filling in zeros for absent
# sentiments, then total them up by source
by_source_sentiment <- tweet_words %>%
  inner_join(nrc, by = "word") %>%
  count(sentiment, id) %>%
  ungroup() %>%
  complete(sentiment, id, fill = list(n = 0)) %>%
  inner_join(sources, by = "id") %>%
  group_by(source, sentiment, total_words) %>%
  summarize(words = sum(n)) %>%
  ungroup()

head(by_source_sentiment)

## # A tibble: 6 x 4
##    source    sentiment total_words words
##     <chr>        <chr>       <int> <dbl>
## 1 Android        anger        4901   321
## 2 Android anticipation        4901   256
## 3 Android      disgust        4901   207
## 4 Android         fear        4901   268
## 5 Android          joy        4901   199
## 6 Android     negative        4901   560

library(broom)

# For each sentiment, test whether the two sources use such words at the same
# rate; rows are ordered Android then iPhone, so the estimate is Android / iPhone
sentiment_differences <- by_source_sentiment %>%
  group_by(sentiment) %>%
  do(tidy(poisson.test(.$words, .$total_words)))

sentiment_differences
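To see what a single `poisson.test()` call in that loop computes, here is a worked example for one sentiment. The Android figures (321 anger words out of 4901) come from the table above; the iPhone figures are hypothetical placeholders, since the iPhone rows are not printed in this post.

```r
# Android counts for "anger", taken from the table above
android_words <- 321
android_total <- 4901

# Hypothetical iPhone counts (illustration only, NOT from the data)
iphone_words <- 160
iphone_total <- 3850

# Two-sample comparison of Poisson rates: words per total word count
anger_test <- poisson.test(c(android_words, iphone_words),
                           c(android_total, iphone_total))

anger_test$estimate   # rate ratio, Android relative to iPhone
anger_test$conf.int   # 95% confidence interval for that ratio
anger_test$p.value
```

The estimate is the Android rate divided by the iPhone rate, so a value above 1 means anger words make up a larger share of Android words. `broom::tidy()` turns this result into a one-row data frame, which is why it combines cleanly across sentiments.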

Let’s visualize the difference with a 95% confidence interval:
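The figure itself does not appear here. As a rough sketch of how such a plot could be drawn, the code below plots each rate ratio as a point with a horizontal line spanning its confidence interval. The numbers in `toy_differences` are hypothetical and are not the results of this analysis; the column names follow what `broom::tidy()` returns for a test (`estimate`, `conf.low`, `conf.high`).

```r
library(ggplot2)
library(scales)

# Hypothetical rate ratios (Android / iPhone) with 95% bounds; NOT real results
toy_differences <- data.frame(
  sentiment = c("joy", "anger", "disgust"),
  estimate  = c(1.05, 1.50, 1.70),
  conf.low  = c(0.85, 1.25, 1.35),
  conf.high = c(1.30, 1.80, 2.15)
)

# Subtract 1 so the axis reads as a percentage increase over the iPhone rate
p <- ggplot(toy_differences,
            aes(x = estimate - 1, y = reorder(sentiment, estimate))) +
  geom_segment(aes(x = conf.low - 1, xend = conf.high - 1,
                   yend = reorder(sentiment, estimate))) +
  geom_point() +
  scale_x_continuous(labels = percent_format()) +
  labs(x = "% increase in Android relative to iPhone",
       y = "Sentiment")

p
```

An interval that crosses 0% would mean the difference for that sentiment is not distinguishable from no difference at the 95% level.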

Trump’s Android account uses 40-80% more words related to disgust, sadness, fear, anger, and other “negative” sentiments than the iPhone account does. This looks pretty convincing.

Another key observation is that Trump’s iPhone tweets seem much more likely to include a picture or a link, which fits an “announcement narrative” from his campaign. Let’s see if this is true.

library(stringr)

tweet_picture_counts <- tweets %>%
  # Skip tweets that begin with a quotation mark
  filter(!str_detect(text, '^"')) %>%
  # Pictures and links both show up as t.co URLs; escape the dot so it matches literally
  count(source,
        picture = ifelse(str_detect(text, "t\\.co"),
                         "Picture/link", "No picture/link"))

ggplot(tweet_picture_counts, aes(source, n, fill = picture)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(x = "", y = "Number of tweets", fill = "")

As it turns out, tweets from the iPhone are 38 times as likely to contain either a picture or a link as tweets from Android.
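A "times as likely" figure like this is most naturally read as a ratio of proportions: the share of iPhone tweets with a picture or link divided by the same share for Android. The sketch below shows that arithmetic on hypothetical counts laid out like `tweet_picture_counts`; the numbers are made up for illustration and do not reproduce the figure of 38.

```r
# Hypothetical counts (illustration only, NOT the post's data)
toy_counts <- data.frame(
  source  = c("Android", "Android", "iPhone", "iPhone"),
  picture = c("No picture/link", "Picture/link",
              "No picture/link", "Picture/link"),
  n       = c(495, 5, 100, 400)
)

# Share of one source's tweets that contain a picture or link
share_with_link <- function(df, src) {
  rows <- df[df$source == src, ]
  sum(rows$n[rows$picture == "Picture/link"]) / sum(rows$n)
}

android_share <- share_with_link(toy_counts, "Android")
iphone_share  <- share_with_link(toy_counts, "iPhone")

# How many times as likely an iPhone tweet is to carry a picture or link
likelihood_ratio <- iphone_share / android_share
likelihood_ratio
```

Comparing shares rather than raw counts matters because the two devices account for different numbers of tweets.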

From time of day, sentiment, and tweet format, the argument that Trump’s own tweets come only from Android seems pretty convincing!

This post is licensed under CC BY 4.0 by the author.