Sets up the control object for linear or nonlinear modeling of a response variable onto a large panel of
textual sentiment measures (and potentially other variables). See `sento_model` for details on the
estimation and calibration procedure.

ctr_model(
  model = c("gaussian", "binomial", "multinomial"),
  type = c("BIC", "AIC", "Cp", "cv"),
  do.intercept = TRUE,
  do.iter = FALSE,
  h = 0,
  oos = 0,
  do.difference = FALSE,
  alphas = seq(0, 1, by = 0.2),
  lambdas = NULL,
  nSample = NULL,
  trainWindow = NULL,
  testWindow = NULL,
  start = 1,
  do.shrinkage.x = FALSE,
  do.progress = TRUE,
  nCore = 1
)

## Arguments

model |
a `character` vector with one of the following: `"gaussian"` (linear regression), `"binomial"`
(binomial logistic regression), or `"multinomial"` (multinomial logistic regression). |

type |
a `character` vector indicating which model calibration approach to use. Supports `"BIC"`,
`"AIC"` and `"Cp"` (Mallows's Cp) as sparse regression adapted information criteria (Tibshirani and Taylor,
2012; Zou, Hastie and Tibshirani, 2007), and `"cv"` (cross-validation based on the `train`
function from the caret package). The adapted information criteria are only available for a linear regression. |

do.intercept |
a `logical`; if `TRUE` (the default), an intercept is fitted. |

do.iter |
a `logical`; `TRUE` induces an iterative estimation of models at the given `nSample` size and
performs the associated out-of-sample prediction exercise through time. |

h |
an `integer` value that shifts the time series to have the desired prediction setup; `h = 0` means
no change to the input data (nowcasting assuming data is aligned properly), `h > 0` shifts the dependent variable by
`h` periods (i.e., rows) further in time (forecasting), `h < 0` shifts the independent variables by `h`
periods. |
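The alignment implied by a positive `h` can be sketched in base R. This is only an illustration of the shift described for the `h` argument, not the package's internal code; the toy vectors `y` and `x` and the names `y_lead`, `x_base` and `aligned` are hypothetical.

```r
# Toy data: six aligned periods (made-up values for illustration).
y <- c(10, 11, 12, 13, 14, 15)
x <- c(1, 2, 3, 4, 5, 6)
h <- 2
n <- length(y)

# Forecasting setup for h > 0: pair the response at t + h with the regressor at t.
y_lead <- y[(1 + h):n]   # y_{t + h}
x_base <- x[1:(n - h)]   # x_t

aligned <- data.frame(x_t = x_base, y_t_plus_h = y_lead)
print(aligned)
```

Shifting costs `h` rows: the six input periods leave four usable pairs here.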

oos |
a non-negative `integer` to indicate the number of periods to skip from the end of the training sample
up to the out-of-sample prediction(s). This is either used in the cross-validation based calibration approach
(if `type = "cv"`), or for the iterative out-of-sample prediction analysis (if `do.iter = TRUE`). For
instance, given \(t\), the (first) out-of-sample prediction is computed at \(t +\) `oos` \(+ 1\). |
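The `oos` arithmetic can be checked with a few lines of base R. The values of `t_end` and `oos` below are hypothetical and only illustrate the \(t +\) `oos` \(+ 1\) rule stated for this argument.

```r
# Hypothetical last period of the training sample and number of periods to skip.
t_end <- 50
oos <- 3

# Periods skipped between the training sample and the first prediction.
skipped <- seq_len(oos) + t_end

# Period at which the (first) out-of-sample prediction is computed.
first_pred <- t_end + oos + 1
print(skipped)
print(first_pred)
```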

do.difference |
a `logical`; if `TRUE`, the target variable `y` supplied in the `sento_model` function is differenced
with a lag equal to the absolute value of the `h` argument, which requires `abs(h) > 0`. For example, if `h = 2`, and assuming the `y` variable is properly aligned
date-wise with the explanatory variables denoted by \(X\) (the sentiment measures and other variables in `x`), the regression
will be of \(y_{t + 2} - y_t\) on \(X_t\). If `h = -2`, the regression fitted is \(y_{t + 2} - y_t\) on
\(X_{t+2}\). The argument is always kept at `FALSE` if the `model` argument is one of
`c("binomial", "multinomial")`. |
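As a rough illustration of the differenced target for `h = 2`, base R's `diff()` with `lag = abs(h)` produces \(y_{t + 2} - y_t\). The series `y` below is made up, and this sketch does not claim to reproduce how `sento_model` performs the differencing internally.

```r
# Hypothetical target series, one value per period.
y <- c(100, 102, 101, 105, 107, 110)
h <- 2

# Element t of dy equals y[t + 2] - y[t], i.e. the change over abs(h) periods.
dy <- diff(y, lag = abs(h))
print(dy)
```

The differenced series is `abs(h)` observations shorter than `y`, so element `t` of `dy` lines up with the regressors at period `t`.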

alphas |
a `numeric` vector of the alphas to test for during calibration, between 0 and 1. A value of
0 pertains to Ridge regression, a value of 1 to LASSO regression; values in between are pure elastic net. |

lambdas |
a `numeric` vector of the lambdas to test for during calibration, \(\geq 0\).
A value of zero means no regularization and thus requires care when the data is fat (more regressors than observations). By default set to
`NULL`, such that the lambdas sequence is generated by the `glmnet` function,
or set to `10^seq(2, -2, length.out = 100)` in case of cross-validation. |
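The default cross-validation grid quoted for `lambdas` can be inspected directly in base R; the variable name `lambdas_cv` is only used for this illustration.

```r
# The grid quoted in the lambdas description: 100 values, log-spaced and decreasing.
lambdas_cv <- 10^seq(2, -2, length.out = 100)

print(length(lambdas_cv))
print(range(lambdas_cv))

# The values are strictly decreasing, from the strongest to the weakest penalty.
print(all(diff(lambdas_cv) < 0))
```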

nSample |
a positive `integer` as the size of the sample for model estimation at every iteration (ignored if
`do.iter = FALSE`). |

trainWindow |
a positive `integer` as the size of the training sample for cross-validation (ignored if
`type != "cv"`). |

testWindow |
a positive `integer` as the size of the test sample for cross-validation (ignored if
`type != "cv"`). |

start |
a positive `integer` to indicate at which point the iteration has to start (ignored if
`do.iter = FALSE`). For example, given 100 possible iterations, `start = 70` leads to model estimations
only for the last 31 samples. |
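The count in the `start` example follows from the inclusive range of iterations, as this small base R check shows. The names `n_iter` and `iters` are hypothetical.

```r
# 100 possible iterations, starting at iteration 70 (values from the example above).
n_iter <- 100
start <- 70

# Iterations that are actually estimated: 70, 71, ..., 100 (both ends included).
iters <- start:n_iter
print(length(iters))
```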

do.shrinkage.x |
a `logical` vector to indicate which of the other regressors provided through the `x`
argument of the `sento_model` function should be subject to shrinkage (`TRUE`). If the argument is of
length one, it applies to all external regressors. |

do.progress |
a `logical`; if `TRUE`, progress statements are displayed during model calibration. |

nCore |
a positive `integer` to indicate the number of cores to use for a parallel iterative model
estimation (`do.iter = TRUE`). We use the `%dopar%` construct from the foreach package. By default,
`nCore = 1`, which implies no parallelization. No progress statements are displayed whatsoever when `nCore > 1`.
For cross-validation models, parallelization can also be carried out for a single-shot model (`do.iter = FALSE`),
whenever a parallel backend is set up. See the examples in `sento_model`. |

## Value

A `list` encapsulating the control parameters.

## References

Tibshirani and Taylor (2012). **Degrees of freedom in LASSO problems**.
*The Annals of Statistics 40, 1198-1232*, https://doi.org/10.1214/12-AOS1003.

Zou, Hastie and Tibshirani (2007). **On the degrees of freedom of the LASSO**.
*The Annals of Statistics 35, 2173-2192*, https://doi.org/10.1214/009053607000000127.

## See also

## Examples
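
A minimal sketch of a call, assuming `ctr_model` is exported by the sentometrics package (the package name is not stated on this page). The argument names come from the usage signature above; the values are illustrative only, and the call is skipped when the package is not installed.

```r
# Control arguments for a single-shot linear model calibrated with BIC.
# Names follow the usage signature; values are illustrative.
ctr_args <- list(
  model = "gaussian",
  type = "BIC",
  do.iter = FALSE,
  h = 1,
  alphas = seq(0, 1, by = 0.2)
)

# Assumption: ctr_model() lives in the sentometrics package.
if (requireNamespace("sentometrics", quietly = TRUE)) {
  ctr <- do.call(sentometrics::ctr_model, ctr_args)
  str(ctr, max.level = 1)  # inspect the returned list of control parameters
}
```

Keeping the arguments in a plain list makes it easy to vary one setting, such as `type` or `h`, and rebuild the control object.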