Converts a properly structured sentiment table into a sentiment object that can be used for further aggregation with the aggregate.sentiment function. This makes it possible to start from sentiment scores that were not necessarily computed with compute_sentiment.
as.sentiment(s)
s: a data.table or data.frame that can be converted into a sentiment object. It should have at least an "id", a "date", and a "word_count" column, plus at least one column of sentiment scores. If sentiment column names contain the separator "--", the first part is taken as the lexicon (or, more generally, the sentiment computation method) and the second part as the feature. For sentiment column names without "--", a "dummyFeature" component is added.
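To make the naming convention concrete, the base R sketch below splits sentiment column names according to the rule just described. The helper split_sentiment_names() is hypothetical and only illustrates that rule. It is not part of sentometrics, and as.sentiment may handle names differently internally.

```r
# Hypothetical helper (not a sentometrics function): splits sentiment column
# names on "--" into a lexicon part and a feature part, and falls back to
# "dummyFeature" when a name has no separator.
split_sentiment_names <- function(cols) {
  parts <- strsplit(cols, "--", fixed = TRUE)
  data.frame(
    column  = cols,
    lexicon = vapply(parts, function(p) p[1], character(1)),
    feature = vapply(parts,
                     function(p) if (length(p) > 1) p[2] else "dummyFeature",
                     character(1)),
    stringsAsFactors = FALSE
  )
}

split_sentiment_names(c("method1--feat1", "method2--feat2", "method4"))
```

With these inputs, "method1--feat1" maps to lexicon "method1" and feature "feat1", while "method4" keeps "method4" as the lexicon and gets "dummyFeature" as the feature.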
A sentiment object.
library("sentometrics")
set.seed(505)
data("usnews", package = "sentometrics")
data("list_lexicons", package = "sentometrics")
# simulate ids, dates, word counts and eight columns of sentiment scores
ids <- paste0("id", 1:200)
dates <- sample(seq(as.Date("2015-01-01"), as.Date("2018-01-01"), by = "day"), 200, TRUE)
word_count <- sample(150:850, 200, replace = TRUE)
sent <- matrix(rnorm(200 * 8), nrow = 200)
# s1 and s2 are data.tables, s3 is a data.frame
s1 <- s2 <- data.table::data.table(id = ids, date = dates, word_count = word_count, sent)
s3 <- data.frame(id = ids, date = dates, word_count = word_count, sent,
                 stringsAsFactors = FALSE)
# s4 holds sentence-level sentiment computed from texts (no "date" column yet)
s4 <- compute_sentiment(usnews$texts[201:400],
                        sento_lexicons(list_lexicons["GI_en"]),
                        "counts", do.sentence = TRUE)
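m <- "method"
# names without "--": a "dummyFeature" component is added
colnames(s1)[-c(1:3)] <- paste0(m, 1:8)
sent1 <- as.sentiment(s1)
# names of the form "lexicon--feature"
colnames(s2)[-c(1:3)] <- c(paste0(m, 1:4, "--", "feat1"), paste0(m, 1:4, "--", "feat2"))
sent2 <- as.sentiment(s2)
# a mix of both naming styles
colnames(s3)[-c(1:3)] <- c(paste0(m, 1:3, "--", "feat1"), paste0(m, 1:3, "--", "feat2"),
                           paste0(m, 4:5))
sent3 <- as.sentiment(s3)
# add the required "date" column, repeating each document's date once per sentence
s4[, "date" := rep(dates, s4[, max(sentence_id), by = id][[2]])]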
#>            id sentence_id word_count GI_en       date
#>        <char>       <int>      <num> <num>     <Date>
#>    1:   text1           1          4     0 2016-02-05
#>    2:   text1           2         30     0 2016-02-05
#>    3:   text1           3         12    -1 2016-02-05
#>    4:   text1           4         41     0 2016-02-05
#>    5:   text1           5         34     0 2016-02-05
#>   ---
#> 2226: text200           8         22     1 2015-04-07
#> 2227: text200           9         17     1 2015-04-07
#> 2228: text200          10         28     2 2015-04-07
#> 2229: text200          11         20     0 2015-04-07
#> 2230: text200          12          7     1 2015-04-07
sent4 <- as.sentiment(s4)
# further aggregation from then on is easy...
sentMeas1 <- aggregate(sent1, ctr_agg(lag = 10))
sent5 <- aggregate(sent4, ctr_agg(howDocs = "proportional"), do.full = FALSE)
#> The choice 'lag = 1' implies no time aggregation, so we added a dummy weighting scheme 'dummyTime'.