Introduction to tardis

library(dplyr, warn.conflicts = FALSE)
library(tardis)

The value proposition

Most sentiment-analysis algorithms boil down to two things: a dictionary of scored terms and a set of rules for applying it.

By prioritizing flexibility, transparency, and speed, tardis makes it fast and easy to analyze text with customisable dictionaries and rules.

This means you can use the right dictionary and rules for your context and study aims.

The problem

A sentiment-analysis algorithm is only as good as its dictionary and its rules.

But relying on any single dictionary can cause problems:

And similarly, standard approaches may have problems with their rules:

The proposal

Tardis aims to overcome these issues by following three principles:

And given the importance of online communication and large data sets, tardis also meets the following requirements:

The algorithm in brief

Tardis first decomposes texts into tokens (words, emojis, or multi-word strings). Each token is scored based on its dictionary value (if any), whether it is in ALL CAPS, and the three preceding tokens. Preceding negations like “not” reverse and reduce a token’s score, and modifiers either increase (e.g. “very”) or decrease (e.g. “slightly”) it. Sentence scores are found by summing token scores, adjusting for punctuation, and mapping results to the range \((-1, 1)\) with a sigmoid function. Text scores are means of sentence scores. Each of these steps can be tweaked or disabled by user-supplied parameters. Tardis’s algorithm is inspired by other approaches, notably VADER, but differs from VADER in three key respects: first, it is much more customisable; second, token score adjustments are all multiplicative, making the order of operations unimportant; and third, there are no special cases or exceptions, making the rules simpler and more intuitive.
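As a rough illustration of the sentence-level step, the sketch below scores “This is not good.” by hand. The dictionary value of 1.9 for “good”, the negation factor of 0.75, and the sigmoid \(x / \sqrt{x^2 + 15}\) are all assumptions rather than values stated in this text; they were chosen because together they reproduce the score of -0.3453024 reported for this sentence in the examples further on. The helper name squash is hypothetical.

```r
# Hypothetical helper: a VADER-style sigmoid mapping any real number into (-1, 1).
# The constant 15 is an assumption, not a documented tardis default.
squash <- function(x, alpha = 15) x / sqrt(x^2 + alpha)

# Assumed inputs for "This is not good."
token_scores <- c(this = 0, is = 0, not = 0, good = 1.9)  # assumed dictionary values
neg <- 0.75                                               # assumed negation factor

# "good" has one negation among its three preceding tokens, so its score is
# multiplied by (-neg)^1; the other tokens are left unchanged.
adjusted <- token_scores
adjusted["good"] <- token_scores["good"] * (-neg)^1

# Sentence score: sum the token scores, then squash into (-1, 1).
sentence_score <- squash(sum(adjusted))

# Text score: the mean of its sentence scores (a single sentence here).
text_score <- mean(sentence_score)
round(sentence_score, 7)
```

Because every adjustment is a multiplication, applying the ALL CAPS, modifier, and negation factors in any order would give the same token score.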

Because R is a vectorized language, internally tardis creates several vectors of length \(n\) and stores them in a tbl_df data frame, where \(n\) is the number of tokens in the input texts, and then operates largely by adding and multiplying across these vectors. For example, if \(neg\) is the negation scaling factor, \(s_i\) is the dictionary sentiment of the token at index \(i\), and \(n_i\) is the number of negations among the tokens at indices \(i-1\), \(i-2\), and \(i-3\), then we can calculate the effect of negations as \(s_i * (-neg)^{n_i}\). The implementation makes heavy use of the package dplyr, although it also uses base R and custom C++ functions to increase performance.
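The negation formula can be sketched with plain base R vectors. This is not tardis's internal code: the token scores, the negation list, the value neg = 0.75, and the helper lag_vec are all assumptions made for illustration, and the sketch ignores sentence boundaries.

```r
# Hypothetical inputs: one score per token, plus an assumed list of negations.
tokens    <- c("not", "good", "but", "never", "not", "bad")
s         <- c(0, 1.9, 0, 0, 0, -2.5)   # assumed dictionary sentiments s_i
negations <- c("not", "never")
neg       <- 0.75                       # assumed negation scaling factor

# Shift a vector right by k positions, padding the start with zeros.
lag_vec <- function(x, k) c(rep(0, k), x)[seq_along(x)]

# n_i: number of negations among the three tokens before position i.
is_neg <- as.numeric(tokens %in% negations)
n_neg  <- lag_vec(is_neg, 1) + lag_vec(is_neg, 2) + lag_vec(is_neg, 3)

# s_i * (-neg)^{n_i}, computed for every token at once.
adjusted <- s * (-neg)^n_neg
data.frame(tokens, s, n_neg, adjusted)
```

An odd number of preceding negations flips the sign and shrinks the score, while an even number shrinks it without flipping, which is why “bad” (preceded by both “never” and “not”) stays negative here.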

In languages like Python or C++, the preceding algorithm could be efficiently implemented through a “moving window” approach that steps through each token sequentially and computes a score based on a function \(f(t_j,t_{j-1},t_{j-2},t_{j-3})\) of each token \(t_j\) and its three preceding tokens.
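For contrast, here is a minimal sketch of that moving-window idea, written in R so it can be compared with the vectorized formula. It handles negations only; the function name score_window, the negation list, and neg = 0.75 are assumptions for illustration, not part of tardis.

```r
# Hypothetical moving-window scorer: visit each token in turn and look back
# at up to `window` preceding tokens.
score_window <- function(tokens, s, negations, neg = 0.75, window = 3) {
  out <- numeric(length(tokens))
  for (j in seq_along(tokens)) {
    # Tokens j-1, j-2, j-3 (fewer near the start of the input).
    prev <- if (j > 1) tokens[max(1, j - window):(j - 1)] else character(0)
    # f(t_j, t_{j-1}, t_{j-2}, t_{j-3}): flip and shrink once per negation.
    out[j] <- s[j] * (-neg)^sum(prev %in% negations)
  }
  out
}

tokens <- c("not", "good", "but", "never", "not", "bad")
s      <- c(0, 1.9, 0, 0, 0, -2.5)      # assumed dictionary sentiments
window_scores <- score_window(tokens, s, negations = c("not", "never"))
window_scores
```

The loop visits tokens one at a time, which is cheap in compiled languages but slow in R; the vectorized form does the same arithmetic on whole columns at once.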

The default dictionaries

To be completed…

Some examples

Fixing false positives: Ed’s little bed

A simple children’s rhyme shows one pitfall of relying on a fixed dictionary. Here we see the sad story of Ed, whose bed is too small:

library(tardis)
library(dplyr)
library(knitr)

text <- c("This is not good.", 
          "This is not right.", 
          "My feet stick out of bed all night.", 
          "And when I pull them in, oh dear!", 
          "My feet stick out of bed up here!")

tardis::tardis(text) %>%
  dplyr::select(sentences, score) %>%
  knitr::kable()

|sentences                           |      score|
|:-----------------------------------|----------:|
|This is not good.                   | -0.3453024|
|This is not right.                  |  0.0000000|
|My feet stick out of bed all night. |  0.0000000|
|And when I pull them in, oh dear!   |  0.4291202|
|My feet stick out of bed up here!   |  0.0000000|

Tardis has correctly noted that “not good” is negative, but has incorrectly classified the fourth sentence as positive because it contains the affectionate term “dear.” To fix this, we can add a new row to our default dictionary classifying “oh dear” as a negative term.

custom_dictionary <- dplyr::add_row(tardis::dict_tardis_sentiment,
                                    token = "oh dear", score = -1)

tardis::tardis(text, dict_sentiments = custom_dictionary) %>%
  dplyr::select(sentences, score) %>%
  knitr::kable()

|sentences                           |      score|
|:-----------------------------------|----------:|
|This is not good.                   | -0.3453024|
|This is not right.                  |  0.0000000|
|My feet stick out of bed all night. |  0.0000000|
|And when I pull them in, oh dear!   | -0.2846456|
|My feet stick out of bed up here!   |  0.0000000|

Of course, our choice to assign “oh dear” a sentiment value of -1 was arbitrary, but with this change tardis correctly flags the fourth sentence as negative. This demonstrates how easy it is to adapt tardis’s dictionaries to a specific context.

Identifying sarcasm in online communications

Here are three two-sentence texts that have similarly neutral mean sentiments, but very different meanings.

text <- c("I guess so, that might be fine. I don't know.",
          "Wow, you're really smart. MORON!",
          "It's the worst idea I've ever heard 😘" )

tardis::tardis(text) %>%
  knitr::kable()

|sentences                                     |      score|  score_sd| score_range|
|:---------------------------------------------|----------:|---------:|-----------:|
|I guess so, that might be fine. I don’t know. |  0.1011443| 0.1430397|   0.2022887|
|Wow, you’re really smart. MORON!              |  0.0767885| 1.0030603|   1.4185415|
|It’s the worst idea I’ve ever heard 😘        | -0.0073832| 0.8732911|   1.2350202|

Only the first text is genuinely neutral; the other two each combine two wildly different sentiments that average out to neutral, but that most human readers would take as strongly emotional. Tardis also returns the standard deviation and range of within-text sentence sentiments, and we can see that both are much larger for the two sarcastic texts than for the truly neutral one. Of course, these examples are blunt and not particularly funny, but they show the value of looking beyond the mean when studying sentiment in informal online communications.
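The relationship between the three summary columns can be checked by hand. For a two-sentence text, the reported mean and range pin down the two sentence scores, so the sketch below back-calculates them for the second text and recomputes the summaries with base R. The sentence scores are derived from the table above rather than taken from tardis output, and the use of the sample standard deviation (sd()) is an assumption that happens to match the reported score_sd.

```r
# Reported summaries for "Wow, you're really smart. MORON!"
reported_mean  <- 0.0767885
reported_range <- 1.4185415

# With exactly two sentences, the scores sit half a range either side of the
# mean. Which sentence gets which sign does not affect the summaries.
sentence_scores <- reported_mean + c(1, -1) * reported_range / 2

text_summary <- c(
  score       = mean(sentence_scores),
  score_sd    = sd(sentence_scores),            # sample SD (n - 1 denominator)
  score_range = diff(range(sentence_scores))
)
round(text_summary, 7)
```

For two sentences the standard deviation is simply the range divided by \(\sqrt{2}\), so either column flags the same texts; with longer texts the two measures can diverge.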

Simple counts

In some cases, researchers may have pre-built dictionaries and be interested in simply detecting those words, without worrying about any of the more complex rules described above. For this use case, tardis has a convenience parameter simple_count which, when TRUE, disables most of the logic and returns simple sums of token values. Tardis also sends the user a warning to confirm this is the expected behaviour.

For example:

dict_cats <- dplyr::tibble(token = c("cat", "cats"), score = c(1, 1))

text <- c("I love cats.", "Not a cat?!?!", "CATS CATS CATS!!!")

tardis::tardis(text, dict_sentiments = dict_cats, simple_count = TRUE) %>%
  dplyr::select(sentences, score) %>%
  knitr::kable()
#> Warning in tardis::tardis(text, dict_sentiments = dict_cats, simple_count =
#> TRUE): Parameter simple_count = TRUE overrides most other parameters. Make sure
#> this is intended!

|sentences         | score|
|:-----------------|-----:|
|I love cats.      |     1|
|Not a cat?!?!     |     1|
|CATS CATS CATS!!! |     3|

Note that the column names are unchanged, although the interpretation differs.
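The idea behind simple_count can be mimicked in a few lines of base R. This is only a sketch of the concept: the function name count_dictionary_hits is hypothetical, and its crude tokenizer (lower-casing, stripping punctuation, splitting on whitespace) is an assumption, not tardis's own tokenization.

```r
# Hypothetical stand-in for a simple dictionary count.
count_dictionary_hits <- function(text, dict) {
  # Crude tokenizer: lower-case, drop punctuation, split on whitespace.
  words <- strsplit(tolower(gsub("[[:punct:]]", "", text)), "\\s+")
  # For each text, sum the scores of tokens found in the dictionary;
  # tokens not in the dictionary match to NA and are dropped.
  vapply(
    words,
    function(w) sum(dict$score[match(w, dict$token)], na.rm = TRUE),
    numeric(1)
  )
}

dict_cats <- data.frame(token = c("cat", "cats"), score = c(1, 1))
text <- c("I love cats.", "Not a cat?!?!", "CATS CATS CATS!!!")

counts <- count_dictionary_hits(text, dict_cats)
counts
```

Negations, capitalization, and punctuation have no effect on the result, which is the point of a simple count: “Not a cat?!?!” still counts one cat.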

The algorithm in excruciating detail

Once a text has been broken down into sentences and tokens, scores are built back up starting from the tokens.