This function extracts the documents with the most extreme sentiment (lowest, highest or both in absolute terms). The extracted documents are unique: a document is returned only once, even when, for example, all of the most extreme sentiment values (across sentiment calculation methods) occur in that one document.
peakdocs(sentiment, n = 10, type = "both", do.average = FALSE)
sentiment: a sentiment object created using compute_sentiment or as.sentiment.
n: a positive numeric value indicating the number of documents associated with sentiment peaks to extract. If n < 1, it is interpreted as a quantile (for example, 0.07 would mean the 7% most extreme documents).
type: a character value, either "pos", "neg" or "both", to look for the n documents related to, respectively, the most positive, most negative or most extreme (in absolute terms) sentiment occurrences.
do.average: a logical indicating whether peaks should be selected based on the average sentiment value per document.
A vector of type "character" corresponding to the n extracted document identifiers.
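set.seed(505)  # make the corpus sample reproducible

# load the example corpus and the built-in lexicons and valence shifters
data("usnews", package = "sentometrics")
data("list_lexicons", package = "sentometrics")
data("list_valence_shifters", package = "sentometrics")

# set up the lexicons and a 200-document sample of the corpus
l <- sento_lexicons(list_lexicons[c("LM_en", "HENRY_en")])
corpus <- sento_corpus(corpusdf = usnews)
corpusSample <- quanteda::corpus_sample(corpus, size = 200)

# compute document-level sentiment
sent <- compute_sentiment(corpusSample, l, how = "proportionalPol")

# extract the peaks
peaksAbs <- peakdocs(sent, n = 5)  # 5 most extreme documents (absolute terms)
peaksAbsQuantile <- peakdocs(sent, n = 0.50)  # n < 1: the 50% most extreme documents
peaksPos <- peakdocs(sent, n = 5, type = "pos")  # 5 most positive documents
peaksNeg <- peakdocs(sent, n = 5, type = "neg")  # 5 most negative documents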
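
The selection rules described for n, type and do.average can be illustrated with a small base R sketch on toy data. This is not the sentometrics implementation: the helper peak_ids_sketch, the score matrix and the rounding of a fractional n (rounded down, with a minimum of one document) are assumptions made for illustration, and the internals of peakdocs may differ.

```r
# Hypothetical helper, NOT the sentometrics implementation: a base R sketch
# of the selection rules described for n, type and do.average.
peak_ids_sketch <- function(scores, n = 10, type = "both", do.average = FALSE) {
  # scores: numeric matrix, one row per document (row names are the ids),
  #         one column per sentiment calculation method
  stopifnot(is.matrix(scores), !is.null(rownames(scores)), n > 0)
  type <- match.arg(type, c("both", "pos", "neg"))

  n_docs <- nrow(scores)
  # Assumption: n < 1 is a share of the documents, rounded down (at least 1)
  if (n < 1) n <- max(1, floor(n * n_docs))
  n <- min(floor(n), n_docs)

  if (do.average) {
    # one value per document: the average across methods
    values <- rowMeans(scores, na.rm = TRUE)
    ids <- rownames(scores)
  } else {
    # one value per document and method (column-major flattening)
    values <- as.vector(scores)
    ids <- rep(rownames(scores), times = ncol(scores))
  }

  # "both" ranks on absolute values; "neg" ranks from the lowest value up
  if (type == "both") values <- abs(values)
  ranked <- order(values, decreasing = (type != "neg"))

  # keep each document once, in order of its most extreme value
  unique(ids[ranked])[seq_len(n)]
}

# Toy scores for four documents and two methods (made-up numbers)
scores <- matrix(
  c(0.10, -0.90, 0.40, -0.20,   # method A
    0.30, -0.80, 0.05,  0.60),  # method B
  ncol = 2,
  dimnames = list(c("doc1", "doc2", "doc3", "doc4"), c("A", "B"))
)

peak_ids_sketch(scores, n = 2)                     # "doc2" "doc4"
peak_ids_sketch(scores, n = 0.5)                   # half of 4 documents: "doc2" "doc4"
peak_ids_sketch(scores, n = 2, type = "pos")       # "doc4" "doc3"
peak_ids_sketch(scores, n = 2, type = "neg")       # "doc2" "doc4"
peak_ids_sketch(scores, n = 2, do.average = TRUE)  # "doc2" "doc3"
```

In this toy case doc2 holds the two largest absolute values, yet it is returned only once, which mirrors the uniqueness property stated in the description. Averaging per document changes the second peak from doc4 to doc3, because doc4's two scores partly cancel out.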