Wrapper function which assembles calls to compute_sentiment and aggregate. Serves as the most direct way towards a panel of textual sentiment measures as a sento_measures object.
sento_measures(sento_corpus, lexicons, ctr)
sento_corpus: a sento_corpus object created with sento_corpus.
lexicons: a sento_lexicons object created with sento_lexicons.
ctr: output from a ctr_agg call.
A sento_measures object, which is a list containing:
- a data.table with a "date" column and all textual sentiment measures as remaining columns.
- a character vector of the different features.
- a character vector of the different lexicons used.
- a character vector of the different time weighting schemes used.
- a data.frame with some elementary statistics (mean, standard deviation, maximum, minimum, and average correlation with the other measures) for each individual sentiment measure. In all computations, NAs are removed first.
- the document-level sentiment scores data.table with "date", "word_count" and lexicon-feature sentiment scores columns. The "date" column has the dates converted at the frequency for across-document aggregation. All zeros are replaced by NA if ctr$docs$weightingParam$do.ignoreZeros = TRUE.
- a list of document and time weights used in the attributions function. Serves further no direct purpose.
- a list encapsulating the control parameters.
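For concreteness, a minimal sketch of inspecting these components is given below. It reuses the sento_measures object built in the Examples section; the as.data.table() extraction is an assumption about the sentometrics interface, and the components can just as well be accessed as plain list elements.

# list the components described above
str(sento_measures, max.level = 1)
# extract the panel of sentiment measures (assumed as.data.table() method;
# otherwise take the corresponding list element directly)
measures <- data.table::as.data.table(sento_measures)
head(colnames(measures))  # "date" followed by the individual sentiment measure columns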
As a general rule, the names of the features, lexicons and time weighting schemes may not contain any '-' symbol.
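The reason is presumably that the dimension names are combined into the sentiment measure column names, so a stray '-' would make them ambiguous. A minimal base-R sketch for sanitising hypothetical names before constructing the corpus or lexicons:

# hypothetical feature/lexicon names containing '-' are sanitised up front
rawNames <- c("LM-en", "HENRY-en", "non-economy")
cleanNames <- gsub("-", "_", rawNames, fixed = TRUE)
cleanNames  # "LM_en" "HENRY_en" "non_economy"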
data("usnews", package = "sentometrics")
data("list_lexicons", package = "sentometrics")
data("list_valence_shifters", package = "sentometrics")
# construct a sento_measures object to start with
corpus <- sento_corpus(corpusdf = usnews)
corpusSample <- quanteda::corpus_sample(corpus, size = 500)
l <- sento_lexicons(list_lexicons[c("LM_en", "HENRY_en")], list_valence_shifters[["en"]])
ctr <- ctr_agg(howWithin = "counts",
howDocs = "proportional",
howTime = c("equal_weight", "linear", "almon"),
by = "month",
lag = 3,
ordersAlm = 1:3,
do.inverseAlm = TRUE)
sento_measures <- sento_measures(corpusSample, l, ctr)
summary(sento_measures)
#> This sento_measures object contains 64 textual sentiment time series with 238 observations each (monthly).
#>
#> Following features are present: wsj wapo economy noneconomy
#> Following lexicons are used to calculate sentiment: LM_en HENRY_en
#> Following scheme is applied for aggregation within documents: counts
#> Following scheme is applied for aggregation across documents: proportional
#> Following schemes are applied for aggregation across time: equal_weight linear almon1 almon1_inv almon2 almon2_inv almon3 almon3_inv
#>
#> Aggregate average statistics:
#> mean sd max min meanCorr
#> -0.34519 2.34385 6.92444 -7.73045 0.22638
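As a possible next step, the aggregated measures can be visualised. The sketch below assumes the plot() method for sento_measures objects and its group argument; both are assumptions about the sentometrics interface and may differ across versions.

# plot the sentiment measures, grouped by feature (assumed behaviour of the group argument)
plot(sento_measures, group = "features")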