Convert a sentiment table to a sentiment object — as.sentiment • sentometrics

Converts a properly structured sentiment table into a sentiment object that can be used for further aggregation with the aggregate.sentiment function. This makes it possible to start from sentiment scores that were not necessarily computed with compute_sentiment.

as.sentiment(s)

Arguments

s: a data.table or data.frame that can be converted into a sentiment object. It must have at least an "id", a "date" and a "word_count" column, plus at least one sentiment score column. If a sentiment column name contains a separating "--", the part before it is taken as the lexicon (or, more generally, the sentiment computation method) and the part after it as the feature. For sentiment column names without any "--", a "dummyFeature" component is added.
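The naming convention above can be illustrated with a small base R sketch. The helper `split_sentiment_names` is hypothetical and is not part of sentometrics; it only mimics the described rules (required columns, "--" separator, "dummyFeature" fallback) and makes no claim about how as.sentiment is implemented internally. Treating "sentence_id" as a structural column is an assumption based on the sentence-level example further down.

```r
# Hypothetical helper (not part of sentometrics): shows how column names
# would be read under the convention described in the Arguments section.
split_sentiment_names <- function(tbl) {
  required <- c("id", "date", "word_count")
  missing_cols <- setdiff(required, colnames(tbl))
  if (length(missing_cols) > 0) {
    stop("missing required column(s): ", paste(missing_cols, collapse = ", "))
  }

  # Assumption: "sentence_id" is structural, not a sentiment score column.
  score_cols <- setdiff(colnames(tbl), c(required, "sentence_id"))
  if (length(score_cols) == 0) {
    stop("at least one sentiment score column is needed")
  }

  # Split each score column name on the literal "--" separator.
  parts <- strsplit(score_cols, "--", fixed = TRUE)
  data.frame(
    column  = score_cols,
    lexicon = vapply(parts, function(p) p[1], character(1)),
    feature = vapply(parts,
                     function(p) if (length(p) > 1) p[2] else "dummyFeature",
                     character(1)),
    stringsAsFactors = FALSE
  )
}

# Toy table with one "lexicon--feature" column and one plain column.
tbl <- data.frame(id = c("id1", "id2"),
                  date = as.Date(c("2015-01-01", "2015-01-02")),
                  word_count = c(120L, 340L),
                  "method1--feat1" = c(0.5, -0.2),
                  method2 = c(1.1, 0.3),
                  check.names = FALSE,
                  stringsAsFactors = FALSE)
split_sentiment_names(tbl)
```

A check of this kind can be run on a table before passing it to as.sentiment, to see which lexicon and feature each score column would be attributed to.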

Value

A sentiment object.

Author

Samuel Borms

Examples

set.seed(505)

data("usnews", package = "sentometrics")
data("list_lexicons", package = "sentometrics")

ids <- paste0("id", 1:200)
dates <- sample(seq(as.Date("2015-01-01"), as.Date("2018-01-01"), by = "day"), 200, TRUE)
word_count <- sample(150:850, 200, replace = TRUE)
sent <- matrix(rnorm(200 * 8), nrow = 200)
s1 <- s2 <- data.table::data.table(id = ids, date = dates, word_count = word_count, sent)
s3 <- data.frame(id = ids, date = dates, word_count = word_count, sent,
                 stringsAsFactors = FALSE)
s4 <- compute_sentiment(usnews$texts[201:400],
                        sento_lexicons(list_lexicons["GI_en"]),
                        "counts", do.sentence = TRUE)
m <- "method"

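# no "--" in the column names: a "dummyFeature" component is added
colnames(s1)[-c(1:3)] <- paste0(m, 1:8)
sent1 <- as.sentiment(s1)

# every sentiment column named as "method--feature"
colnames(s2)[-c(1:3)] <- c(paste0(m, 1:4, "--", "feat1"), paste0(m, 1:4, "--", "feat2"))
sent2 <- as.sentiment(s2)

# a data.frame mixing "method--feature" names and names without "--"
colnames(s3)[-c(1:3)] <- c(paste0(m, 1:3, "--", "feat1"), paste0(m, 1:3, "--", "feat2"),
                           paste0(m, 4:5))
sent3 <- as.sentiment(s3)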

# add the required "date" column: each document's date is repeated once per sentence
s4[, "date" := rep(dates, s4[, max(sentence_id), by = id][[2]])]
#>            id sentence_id word_count GI_en       date
#>        <char>       <int>      <num> <num>     <Date>
#>    1:   text1           1          4     0 2016-02-05
#>    2:   text1           2         30     0 2016-02-05
#>    3:   text1           3         12    -1 2016-02-05
#>    4:   text1           4         41     0 2016-02-05
#>    5:   text1           5         34     0 2016-02-05
#>   ---                                                
#> 2226: text200           8         22     1 2015-04-07
#> 2227: text200           9         17     1 2015-04-07
#> 2228: text200          10         28     2 2015-04-07
#> 2229: text200          11         20     0 2015-04-07
#> 2230: text200          12          7     1 2015-04-07
sent4 <- as.sentiment(s4)

# further aggregation from then on is easy...
sentMeas1 <- aggregate(sent1, ctr_agg(lag = 10))
sent5 <- aggregate(sent4, ctr_agg(howDocs = "proportional"), do.full = FALSE)
#> The choice 'lag = 1' implies no time aggregation, so we added a dummy weighting scheme 'dummyTime'.
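
The step that adds a "date" column to the sentence-level table s4 relies on rep() with a vector of repetition counts. A minimal base R sketch of that idea follows; the names `doc_dates` and `sentences_per_doc` and their values are made up for illustration, with `sentences_per_doc` playing the role of max(sentence_id) per id. It assumes rows are grouped by document, in the same order as the dates.

```r
# Toy stand-ins for the objects in the example (values are made up).
doc_dates <- as.Date(c("2016-02-05", "2015-04-07", "2017-09-30"))
sentences_per_doc <- c(5L, 2L, 3L)  # number of sentences in each document

# With a vector as second argument, rep() repeats element i of `doc_dates`
# sentences_per_doc[i] times, which yields one date per sentence row.
sentence_dates <- rep(doc_dates, sentences_per_doc)

length(sentence_dates)  # equals sum(sentences_per_doc)
table(sentence_dates)   # counts of sentence rows per date
```

If the sentence rows were not grouped by document in the order of the dates, a join on "id" would be a safer way to attach them.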