SentoResearch

Research material & open-source software by and for the community

In a nutshell

Within the growing and fascinating landscape at the frontier of text mining, sentiment analysis, and econometrics, the field of sentometrics has emerged. Researchers in sentometrics investigate how to transform the qualitative sentiment embedded in textual data (and other alternative data sources) into quantitative sentiment variables, and how to then use those variables in an econometric analysis of the relationships between sentiment and other variables.
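As a rough illustration of the first step described above (turning text into a quantitative sentiment variable), here is a minimal sketch in Python. It assumes a toy lexicon and a tiny made-up corpus; the names `LEXICON`, `score_document`, and `daily_sentiment` are hypothetical and are not part of the sentometrics package or any other library listed on this page.

```python
from collections import defaultdict
from datetime import date

# Toy lexicon: hypothetical words and polarity scores, for illustration only.
LEXICON = {
    "growth": 1.0, "strong": 1.0, "gain": 1.0,
    "crisis": -1.0, "weak": -1.0, "loss": -1.0,
}


def score_document(text):
    """Return the average lexicon score per token (0.0 for empty text)."""
    # Naive whitespace tokenization with punctuation stripping and lowercasing.
    tokens = [t.strip(".,;:!?()\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    # Words missing from the lexicon contribute a neutral score of 0.0.
    return sum(LEXICON.get(t, 0.0) for t in tokens) / len(tokens)


def daily_sentiment(documents):
    """Average document scores by date; returns {date: mean score} in date order."""
    by_day = defaultdict(list)
    for day, text in documents:
        by_day[day].append(score_document(text))
    return {day: sum(s) / len(s) for day, s in sorted(by_day.items())}


# Made-up corpus of (publication date, text) pairs.
corpus = [
    (date(2020, 1, 1), "Strong growth reported"),
    (date(2020, 1, 1), "Markets calm"),
    (date(2020, 1, 2), "Crisis fears and weak demand"),
]
index = daily_sentiment(corpus)
```

The resulting `index` is a small daily time series that could, in principle, enter a regression alongside other variables. Real applications typically use richer lexicons, proper tokenization, and weighting schemes across documents and time.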

Many researchers drive sentometrics forward through their work across economics, finance, politics, and beyond. The objective of this hub is to provide resources and open-source software that help this community of researchers interact with each other and showcase their work, while also introducing newcomers to the field.

This survey paper and the R package sentometrics are perfect starting points to dive into this exciting field.

Indices

Data library

EPU

Daily EPU indices for Flanders, Wallonia, and Belgium, updated daily from 2003 to today.

MCCC

Daily U.S. Media Climate Change Concerns Index from 2003 to 2018.

U.S. Topical Economic Sentiment

Daily Topical U.S. Economic Sentiment Indices from 1996 to 2016.

Posts

Media Climate Change Concerns Index

Many consider climate change one of the biggest challenges of our time. However, there is disagreement about the magnitude of the problem and how to solve it.

Daily Topical US Economic Sentiment Indices

Text- and news-based measures add significant value as sources of information for forecasting and assessing the economy. In a recent paper, we present a general methodology, which forms the basis of the sentometrics R package, for forecasting economic variables from news data.

Software

caret

Miscellaneous functions for training and plotting classification and regression models.

glmnet

LASSO and elastic net regularized generalized linear models.

GWP

Sentiment lexicon calibration with the Generalized Word Power methodology.

NLTK

NLTK is a leading platform for building Python programs to work with human language data.

quanteda

A fast, flexible, and comprehensive framework for quantitative text analysis in R.

scikit-learn

Machine learning in Python.

SentimentAnalysis

Dictionary-based sentiment analysis.

sentometrics

An integrated framework for textual sentiment time series aggregation and prediction.

sentometrics.app

A Shiny interface to the R package sentometrics.

sentopics

Tools for estimating and analyzing various classes of sentiment/topic models.

spaCy

Industrial-strength natural language processing in Python.

STM

The Structural Topic Model (STM) allows researchers to estimate topic models with document-level covariates.

TextBlob

TextBlob is a Python library for processing textual data.

textir

Inverse regression analysis of text.

tidytext

Text mining using tidy tools.

transformers

State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.

UDPipe

Natural language processing toolkit.

VADER

Sentiment analysis tool that is specifically attuned to sentiments expressed in social media.

Sponsors

HEC Montreal

Research professorship in Sentometrics

Team Up

Grant for academic research & industry collaboration

Contribute

You can contribute by submitting a resource using this form. Please indicate the type of resource (index, post, software, or publication) and include a link to it, and we will get in touch!