sento_measures(sento_corpus, lexicons, ctr)
The output from a call to sento_measures is a sento_measures object, which is a list containing the components below.

measures: a data.table with a
"date" column and all textual sentiment measures as remaining columns.
features: a character vector of the different features.

lexicons: a character vector of the different lexicons used.

time: a character vector of the different time weighting schemes used.
stats: a data.frame with some elementary statistics (mean, standard deviation, maximum, minimum, and
average correlation with the other measures) for each individual sentiment measure. In all computations, NAs are removed first.
sentiment: the document-level sentiment scores data.table with
"word_count" and lexicon-feature sentiment scores columns. The
"date" column has the
dates converted at the frequency for across-document aggregation. All zeros are replaced by NA if
ctr$docs$weightingParam$do.ignoreZeros = TRUE.
attribWeights: a list of the document and time weights used in the computation of attributions.
It serves no further direct purpose.
ctr: a list encapsulating the control parameters.
As a general rule, the names of the features, lexicons, and time weighting schemes may not contain a `-' symbol.
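For illustration, the components listed above can be accessed as ordinary list elements. This is a sketch only; it assumes a sento_measures object built as in the Examples below, with the component names as given in the Value list:

```r
# assumes corpusSample, l and ctr exist, as constructed in the Examples
sm <- sento_measures(corpusSample, l, ctr)

head(sm$measures)   # data.table: "date" column plus one column per sentiment measure
sm$features         # character vector of feature names
sm$lexicons         # character vector of lexicon names
sm$time             # character vector of time weighting scheme names
sm$stats            # elementary statistics per individual sentiment measure
head(sm$sentiment)  # document-level sentiment scores data.table
sm$ctr$docs         # part of the stored control parameters
```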
data("usnews", package = "sentometrics") data("list_lexicons", package = "sentometrics") data("list_valence_shifters", package = "sentometrics") # construct a sento_measures object to start with corpus <- sento_corpus(corpusdf = usnews) corpusSample <- quanteda::corpus_sample(corpus, size = 500) l <- sento_lexicons(list_lexicons[c("LM_en", "HENRY_en")], list_valence_shifters[["en"]]) ctr <- ctr_agg(howWithin = "counts", howDocs = "proportional", howTime = c("equal_weight", "linear", "almon"), by = "month", lag = 3, ordersAlm = 1:3, do.inverseAlm = TRUE) sento_measures <- sento_measures(corpusSample, l, ctr) summary(sento_measures)#> This sento_measures object contains 64 textual sentiment time series with 238 observations each (monthly). #> #> Following features are present: wsj wapo economy noneconomy #> Following lexicons are used to calculate sentiment: LM_en HENRY_en #> Following scheme is applied for aggregation within documents: #> Following scheme is applied for aggregation across documents: #> Following schemes are applied for aggregation across time: equal_weight linear almon1 almon1_inv almon2 almon2_inv almon3 almon3_inv #> #> Aggregate average statistics: #> mean sd max min meanCorr #> -0.34518 2.34385 6.92447 -7.73045 0.22638