Quantitative Model Comparison — QuantModelComparison • dynConfiR

subject_modelweights computes the model weights (as probabilities) of individual subjects based on an information criterion (BIC, AIC, or AICc). group_BMS performs a Bayesian model comparison based on marginal likelihoods (alias model evidence), given for different models across different subject on a group level using a fixed effects model and a random effects model on the distribution of model probabilities (see Rigoux et al., 2014 and Details section). group_BMS_fits is a wrapper for group_BMS that can be used with the output of fitRTConfModels, i.e. a data frame with information criteria for different models and subjects, using a information criterion to approximate the model evidence.

Usage

subject_modelweights(fits, measure = "BIC")

group_BMS_fits(fits, measure = "BIC", opts = list(), alpha0 = NULL)

group_BMS(mlp, opts = list(), alpha0 = NULL)

Arguments

fits

a data frame as returned by fitRTConfModels. Should contain a column modelindicating the model name, a column subject (alternatively sbj or participant) indicating the grouping structure of the data, and a column with the name given by the measure argument containing the values of the information criterion that should be used to approximate model evidence.

measure

the name of the column indicating the information criterion to approximate model evidence. For outputs of fitRTConfModels, the available measures are 'BIC', 'AIC', and 'AICc'. Any other approximation for the model evidence may be used, the measure is transferred to log model evidence by taking -measure/2.

opts

a list with options for the iteration algorithm to estimate the parameter of the Dirichlet distribution. Following values may be provided:

maxiter the maximum number of iterations (Default: 200)
tol the tolerance for changes in the free energy approximation to stop the algorithm, if abs(FE(i+1)-FE(i))<tol the algorithm is stopped (Default: 1e-4)
eps The number to substitute values of 0 in calls to log (Default: 1e-32)

alpha0

a positive numeric vector representing the parameter of a Dirichlet distribution used as prior over model probabilities. The length should be equal to nrow(mlp) for group_BMS, and equal to the number of unique names in the model column of fits for group_BMS_fits.

mlp

a matrix containing the logarithm of marginal probabilities (i.e. log model evidence) with N columns representing individuals (or any other grouping structure) and K rows representing the models.

Value

subject_modelweights returns a data frame of subject-wise model probabilities with rows for each subject and columns for the models given by name and one column for the subject ID as given in the input. group_BMS and group_BMS_fits return a list with two entries:

model_weights: a matrix with rows for each model (row names indicate the model names for group_BMS_fits and for group_BMS if row names are available in mlp), and following columns: alpha (the alpha parameter of the Dirichlet posterior over model probabilities in the population), r (the mean probabilities of each model in the population), ep and pep (exceedance and protected exceedance probabilities for each model), and fx_prob (the posterior model probabilities if a fixed true model is assumed in the population).
summary_stats: a vector giving statistics for the Bayesian model comparison that may be used for other analyses: Bayesian omnibus risks: bor (random effects model against the null model), bor_fixed (fixed effects model against the null model), and bor_re_fixed (random effects model against the fixed effects model), and estimations of the Free Energy of the Dirichlet distribution FE (random effects model), FE0 (null model), and FEfixed (fixed effects model)

Details

This set of function can be used for model comparisons on a group level when the models were not fitted hierarchical but by fitting the models independently to different subgroups (e.g. data from different subjects).

The function subject_modelweights computes the model weights for each subject separately to inspect predominant models but also heterogeneity within the population. The functions group_BMS and group_BMS_fits can be used for a Bayesian model selection on the group level following the approach of Rigoux et al. (2014). The approach compares three different models for the generative structure of the data and gives estimates for model probabilities for the fixed and random effects models. The fixed effects model assumes that there is a single model that generated the data of all subjects. Thus, model weights may be computed directly by multiplying the model weights computed on a subject-level. This model is formulated in a Bayesian way using a Multinomial distribution over the models as prior with some prior parameter alpha0 giving the prior model weights. This is updated according to the marginal model likelihoods resulting in a single poterior vector of model probabilities, reported in the column fx_prob of the model_weights data frame. The random effects model assumes that there is a vector of model probabilities and each subject may generated by a different model, each drawn from a Multinomial distribution. The Bayesian prior on this vector of model probabilities is given by a Dirichlet distribution with some parameter alpha0. The function uses a variational technique to approximate the alpha parameter of the posterior Dirichlet distribution. Within this framework several statistics may be used for model selection. The model_weights data frame reports the posterior alpha parameter, as well as the posterior mean r of the corresponding dirichlet distribution. The exceedance probability ep represents the probability that given a random sample from the Dirichlet distribution the probability of the model is greater than all other probailities. Finally, the protected exceedance probability (pep) is a scaled version of the ep multiplying the ep by one minus the Bayesian omnibus risk (BOR). The Bayesian omnibus risk is the posterior probability of the null model against the random effects model. The null model assumes that all models are generating the subjects' data with equal probability and results from taking the limit of alpha0 towards infinity. The Bayesian omnibus risk is reported in the summary_stats together with the free energy approximation of the null, the fixed effects, and the random effects models.

References

Rigoux, L., Stephan, K. E., Friston, K. J., & Daunizeau, J. (2014). Bayesian model selection for group studies - revisited. NeuroImage, 84, 971–985. doi: 10.1016/j.neuroimage.2013.08.065

Author

Sebastian Hellmann.

Examples

# Define a data frame with information criteria from model fits
# (this is a sub-data.frame from an output of fitRTConfModels with
# 8 subjects, three models and rounded information criteria)
fits <- data.frame(
  participant = rep(1:8, each=3),
  model = rep(c("dynaViTE", "2DSD", "PCRMt"), 8),
  BIC = c(5318, 5665, 1659, 3856, 5508, 3982, 3950, 3998,
          4114, 4216, 4314, 4419, 3170, 3489, 3256, 1950,
          1934, 2051, 3194, 3317, 3359, 9656, 10161, 4024),
  AIC = c(5211, 5577, 1577, 3750, 5420, 3899, 3843, 3911,
          4031, 4109, 4226, 4337, 3063, 3401, 3173, 1844,
          1847, 1969, 3087, 3229, 3277, 9549, 10074, 3942),
  AICc = c(5212, 5578, 1577, 3751, 5421, 3900, 3844, 3911,
           4032, 4110, 4227, 4337, 3064, 3402, 3174, 1845,
           1848, 1970, 3088, 3230, 3277, 9550, 10074, 3942))
# Compute subject-wise model probabitilities based on different ICs
subject_modelweights(fits, measure = "BIC")
#>           2DSD        PCRMt     dynaViTE participant
#> 1 0.000000e+00 1.000000e+00 0.0000000000           1
#> 2 0.000000e+00 4.359610e-28 1.0000000000           2
#> 3 3.775135e-11 2.442601e-36 1.0000000000           3
#> 4 5.242886e-22 8.300611e-45 1.0000000000           4
#> 5 5.370691e-70 2.115131e-19 1.0000000000           5
#> 6 9.996646e-01 3.923080e-26 0.0003353501           6
#> 7 1.953842e-27 1.481512e-36 1.0000000000           7
#> 8 0.000000e+00 1.000000e+00 0.0000000000           8
subject_modelweights(fits, measure = "AIC")
#>           2DSD        PCRMt  dynaViTE participant
#> 1 0.000000e+00 1.000000e+00 0.0000000           1
#> 2 0.000000e+00 4.416326e-33 1.0000000           2
#> 3 1.713908e-15 1.500786e-41 1.0000000           3
#> 4 3.924396e-26 3.093350e-50 1.0000000           4
#> 5 4.020060e-74 1.299581e-24 1.0000000           5
#> 6 1.824255e-01 5.876547e-28 0.8175745           6
#> 7 1.462486e-31 5.521082e-42 1.0000000           7
#> 8 0.000000e+00 1.000000e+00 0.0000000           8
subject_modelweights(fits, measure = "AICc")
#>           2DSD        PCRMt  dynaViTE participant
#> 1 0.000000e+00 1.000000e+00 0.0000000           1
#> 2 0.000000e+00 4.416326e-33 1.0000000           2
#> 3 2.825757e-15 1.500786e-41 1.0000000           3
#> 4 3.924396e-26 5.100072e-50 1.0000000           4
#> 5 4.020060e-74 1.299581e-24 1.0000000           5
#> 6 1.824255e-01 5.876547e-28 0.8175745           6
#> 7 1.462486e-31 9.102726e-42 1.0000000           7
#> 8 0.000000e+00 1.000000e+00 0.0000000           8
# Conduct group-level Bayesian model selection based on BIC
group_BMS_fits(fits, measure="BIC")
#> $model_weights
#>             alpha         r     ep    pep fx_prob
#> 2DSD     1.998461 0.1816782 0.0274 0.0274       0
#> PCRMt    1.500000 0.1363636 0.0137 0.0137       1
#> dynaViTE 7.501539 0.6819581 0.9589 0.9589       0
#> 
#> $summary_stats
#>          bor           FE          FE0      FEfixed    bor_fixed bor_re_fixed 
#>        0.000     1715.178   -13010.289   -13433.099        1.000        0.000 
#> 


## General group-level Bayesian model selection based on any marginal log-probabilities
# Compute marginal log-likelihood based on BIC from fits
mlp <- matrix(NA, ncol=8, nrow=3)
for (i in 1:8) mlp[,i] <- fits[(i-1)*3 + 1:3, "BIC"]
mlp <- - mlp/(2)
rownames(mlp) <- c("dynaViTE", "2DSD", "PCRMt")
# conduct group BMS:
group_BMS(mlp)
#> $model_weights
#>             alpha         r     ep    pep fx_prob
#> dynaViTE 7.501539 0.6819581 0.9638 0.9638       0
#> 2DSD     1.998461 0.1816782 0.0250 0.0250       0
#> PCRMt    1.500000 0.1363636 0.0112 0.0112       1
#> 
#> $summary_stats
#>          bor           FE          FE0      FEfixed    bor_fixed bor_re_fixed 
#>        0.000     1715.178   -13010.289   -13433.099        1.000        0.000 
#>