Function for fitting several sequential sampling confidence models in parallel
Source: R/fitRTConfModels.R
fitRTConfModels.Rd
This function is a wrapper of the function fitRTConf (see there for more information). It calls that function for every possible combination of model and participant/subject in models and data, respectively. Also, see ddynaViTE, d2DSD, dDDConf, and dRM for more information about the parameters.
Arguments
- data
a data.frame where each row is one trial, containing the following variables (column names can be changed by passing additional arguments of the form condition="contrast"):
  - condition (not necessary; for different levels of stimulus quality, will be transformed to a factor),
  - rating (discrete confidence judgments, should be given as an integer vector; otherwise will be transformed to integer),
  - rt (giving the reaction times for the decision task),
  - two of the following (see Details for more information about the accepted formats):
    - stimulus (encoding the stimulus category in a binary choice task),
    - response (encoding the decision response),
    - correct (encoding whether the decision was correct; values in 0, 1),
  - sbj, alternatively subject or participant (giving the subject ID; the models given in the second argument are fitted for each subject individually. Furthermore, if logging = TRUE, the ID is used in files saved with interim results and logging messages. The output data frame reuses the name of the column in the input, i.e. the output contains a subject column if the input contains subject instead of sbj).
- models
character vector with the following possible elements: "dynWEV", "2DSD", "IRM", "PCRM", "IRMt", and "PCRMt", naming the models to be fit.
- nRatings
integer. Number of rating categories. If NULL, the maximum of rating and length(unique(rating)) is used. This argument is especially important for data sets where not the whole range of rating categories is realized. If given, ratings have to be given as factor or integer.
- fixed
list. List of parameter-value pairs for parameters that should not be fitted (see Details).
- restr_tau
numerical or Inf or "simult_conf". Used for 2DSD and dynWEV only. Upper bound for tau. Fits will be in the interval (0, restr_tau). If FALSE, tau will be unbounded. For "simult_conf", see the documentation of d2DSD and ddynaViTE.
- grid_search
logical. If FALSE, the grid search before the optimization algorithm is omitted. The fitting is then started with a mean parameter set from the default grid. (Default: TRUE)
- opts
list. A list for more control options in the optimization routines (depending on the optim_method). See Details for more information.
- optim_method
character. Determines which optimization function is used for the parameter estimation. Either "bobyqa" (default), "L-BFGS-B" or "Nelder-Mead". "bobyqa" uses a box-constrained optimization with quadratic interpolation. (See bobyqa for more information.) The first two use a box-constrained optimization. For Nelder-Mead, a transfinite function rescaling is used (i.e. the constrained arguments are suitably transformed to the whole real line).
- logging
logical. If TRUE, a folder 'autosave/fitmodel' is created and messages about the process are printed in a logging file and to the console (depending on OS). Additionally, intermediate results are saved in a .RData file with the participant/subject ID in the name.
- precision
numeric. Precision of calculation for the density functions (see ddynaViTE and dPCRM for more information).
- parallel
"models", "single", "both" or FALSE. If FALSE, no parallelization is used in the fitting process. If "models", the fitting process is parallelized over participants and models (i.e. over the calls for fitting functions). If "single", parallelization is used within the fitting processes (over initial grid search and optimization processes for different start points, but see fitRTConf). If "both", parallelization is done hierarchically. For a small number of models and participants, "single" or "both" is preferable. Otherwise, you may use "models".
- n.cores
integer vector or NULL. If parallel is "models" or "single", a single integer for the number of cores used for parallelization is required. If parallel is "both", two values are required: the first for the number of parallel model-participant combinations and the second for the parallel processes within the fitting procedures (this may be specified to match the nAttempts value in the opts argument). If NULL (default), the number of available cores minus 1 is used. If NULL and parallel is "both", the cores will be used for model-participant parallelization only.
- ...
Possibility of giving alternative variable names in the data frame (in the form condition = "SOA", or response = "pressedKey").
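The default rule for nRatings (used when the argument is NULL) can be illustrated in base R. This is only a sketch of the documented rule, not the package's internal code; the helper name default_n_ratings is hypothetical.

```r
# Hypothetical helper illustrating the documented default for nRatings:
# the larger of max(rating) and the number of distinct ratings observed.
default_n_ratings <- function(rating) {
  rating <- as.integer(rating)
  max(max(rating), length(unique(rating)))
}

# Ratings collected on a 1-4 scale where category 4 was never used
rating <- c(1L, 2L, 2L, 3L, 1L)
default_n_ratings(rating)  # smaller than the true number of categories (4)
```

In a case like this, passing nRatings = 4 explicitly tells the fitting function about the unused category.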
Value
Gives a data frame with one row for each model-participant combination and columns for the fitted parameters as well as additional information about the fit: negLogLik (for final parameters), k (number of parameters), N (number of data rows), BIC, AICc, and AIC.
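The information criteria in the output relate to negLogLik, k, and N. The sketch below assumes the usual textbook definitions; the package may differ in details, and the function name info_criteria is hypothetical.

```r
# Textbook definitions (assumed here, not taken from the package source)
info_criteria <- function(negLogLik, k, N) {
  AIC  <- 2 * negLogLik + 2 * k
  AICc <- AIC + 2 * k * (k + 1) / (N - k - 1)  # small-sample correction
  BIC  <- 2 * negLogLik + k * log(N)
  c(BIC = BIC, AICc = AICc, AIC = AIC)
}

# Illustrative values only
info_criteria(negLogLik = 500, k = 10, N = 400)
```

Lower values indicate a better trade-off between fit and complexity, so the rows of the returned data frame can be compared per participant across models.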
Details
The fitting involves a first grid search through an initial grid. Then the best nAttempts parameter sets are chosen for an optimization, which is done with an algorithm depending on the argument optim_method. The Nelder-Mead algorithm uses the R function optim.
The optimization routine is restarted nRestarts times with the starting parameter set equal to the best parameters from the previous routine.
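The procedure described above (grid search, selection of the nAttempts best sets, and nRestarts successive optimization runs) can be sketched with base R. The objective below is a toy function standing in for a negative log-likelihood, and the grid is arbitrary; neither is taken from the package.

```r
# Toy objective standing in for a negative log-likelihood (assumption)
neg_log_lik <- function(p) (p[1] - 1)^2 + (p[2] + 2)^2

# 1. Grid search: evaluate the objective on an initial parameter grid
grid <- expand.grid(a = seq(-3, 3, by = 1.5), b = seq(-3, 3, by = 1.5))
grid$value <- apply(grid[, c("a", "b")], 1, neg_log_lik)

# 2. Keep the nAttempts best grid points as starting values
nAttempts <- 3
nRestarts <- 2
starts <- grid[order(grid$value), ][seq_len(nAttempts), c("a", "b")]

# 3. Optimize from each start; each restart begins at the previous optimum
fits <- lapply(seq_len(nAttempts), function(i) {
  par <- unlist(starts[i, ])
  for (r in seq_len(nRestarts)) {
    fit <- optim(par, neg_log_lik, method = "Nelder-Mead")
    par <- fit$par
  }
  fit
})

# 4. Keep the best solution over all attempts
best <- fits[[which.min(vapply(fits, function(f) f$value, numeric(1)))]]
best$par
```

Starting from several grid points reduces the risk of ending in a poor local optimum, and restarting from the previous solution gives the optimizer a chance to continue when it stopped early.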
stimulus, response and correct. Two of these columns must be given in data. If all three are given, correct will have no effect (and will not be checked!). stimulus can always be given in numerical format with values -1 and 1. response can always be given as a character vector with "lower" and "upper" as values. correct must always be given as a 0-1 vector. If stimulus is given together with response and neither matches the above formats, they need to have the same values (or levels, if factors). If only one of stimulus and response is given in any other format together with correct, its unique values will be sorted increasingly, and the first value will be encoded as "lower"/-1 and the second as "upper"/+1.
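The fallback encoding (sorted unique values, the first mapped to -1 and the second to +1) can be illustrated in base R. The helper encode_binary is hypothetical and only mirrors the documented rule; it is not a function of the package.

```r
# Hypothetical helper: map a two-valued vector to -1/+1 by sorted order
encode_binary <- function(x) {
  lev <- sort(unique(x))
  stopifnot(length(lev) == 2)
  ifelse(x == lev[1], -1, 1)
}

stimulus <- c("right", "left", "left", "right")
response <- c("right", "right", "left", "left")

# "left" sorts before "right", so "left" becomes -1 and "right" +1
stimulus_code <- encode_binary(stimulus)
response_code <- encode_binary(response)

# With stimulus and response given, correct is redundant
correct <- as.integer(stimulus_code == response_code)
```

This also shows why only two of the three columns are needed: any one of them can be reconstructed from the other two.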
fixed. Parameters that should not be fitted but kept constant. These will be dropped from the initial grid search but will be present in the output, to keep all parameters for prediction in the result. Includes the possibility of symmetric confidence thresholds for both alternatives (sym_thetas = logical). Other examples are z = .5, sv = 0, st0 = 0, sz = 0. For race models, setting a = 'b' (or vice versa) leads to identical upper bounds on the decision processes, which is the equivalent of z = .5 in a diffusion process.
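A fixed list combining the options mentioned above might look as follows. The values are illustrative only, and the commented call assumes a data frame data prepared as in the Examples section.

```r
# Diffusion-based models: fix the starting point, switch off its
# variability, and use symmetric confidence thresholds
fixed_diffusion <- list(z = 0.5, sz = 0, sym_thetas = TRUE)

# Race models: force identical bounds on both accumulators
fixed_race <- list(a = "b")

# Not run (fitting is slow):
# fitRTConfModels(data, models = c("dynWEV", "2DSD"), nRatings = 2,
#                 fixed = fixed_diffusion, parallel = FALSE)
```

Fixing parameters reduces the dimension of the grid search and the optimization, which usually shortens fitting time.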
opts. A list with numerical values. Possible options are listed below (together with the optimization method they are used for).
- nAttempts (all): number of best performing initial parameter sets used for optimization; default: 5
- nRestarts (all): number of successive optim routines for each of the starting parameter sets; default: 5
- maxfun ('bobyqa'): maximum number of function evaluations; default: 5000
- maxit ('Nelder-Mead' and 'L-BFGS-B'): maximum iterations; default: 2000
- reltol ('Nelder-Mead'): relative tolerance; default: 1e-6
- factr ('L-BFGS-B'): tolerance in terms of reduction factor of the objective; default: 1e-10
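As an illustration, opts lists for two of the optimization methods could be built like this. The option names come from the list above; the values are arbitrary choices for a quicker, less thorough fit, and the commented call assumes a data frame data prepared as in the Examples section.

```r
# Options used with the default "bobyqa" optimizer
opts_bobyqa <- list(nAttempts = 3, nRestarts = 2, maxfun = 2000)

# "Nelder-Mead" uses maxit and reltol instead of maxfun
opts_nm <- list(nAttempts = 3, nRestarts = 2, maxit = 1000, reltol = 1e-6)

# Not run (fitting is slow):
# fitRTConfModels(data, models = "2DSD", nRatings = 2,
#                 optim_method = "Nelder-Mead", opts = opts_nm,
#                 parallel = FALSE)
```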
References
Hellmann, S., Zehetleitner, M., & Rausch, M. (2023). Simultaneous modeling of choice, confidence and response time in visual perception. Psychological Review 2023 Mar 13. doi: 10.1037/rev0000411. Epub ahead of print. PMID: 36913292.
Author
Sebastian Hellmann, sebastian.hellmann@ku.de
Examples
# 1. Generate data from two artificial participants
# Get random drift direction (i.e. stimulus category) and
# stimulus discriminability (two steps: hard, easy)
stimulus <- sample(c(-1, 1), 400, replace=TRUE)
discriminability <- sample(c(1, 2), 400, replace=TRUE)
# generate data for participant 1
data <- rdynaViTE(400, a=2, v=stimulus*discriminability*0.5,
                  t0=0.2, z=0.5, sz=0.1, sv=0.1, st0=0, tau=4, s=1, w=0.3)
# discretize confidence ratings (only 2 steps: unsure vs. sure)
data$rating <- as.numeric(cut(data$conf, breaks = c(-Inf, 1, Inf), include.lowest = TRUE))
data$participant = 1
data$stimulus <- stimulus
data$discriminability <- discriminability
# generate data for participant 2
data2 <- rdynaViTE(400, a=2.5, v=stimulus*discriminability*0.7,
                   t0=0.1, z=0.7, sz=0, sv=0.2, st0=0, tau=2, s=1, w=0.5)
# discretize the confidence of participant 2 (data2$conf, not data$conf)
data2$rating <- as.numeric(cut(data2$conf, breaks = c(-Inf, 0.3, Inf), include.lowest = TRUE))
data2$participant = 2
data2$stimulus <- stimulus
data2$discriminability <- discriminability
# bind data from participants
data <- rbind(data, data2)
data <- data[data$response!=0, ] # drop not finished decision processes
data <- data[,-3] # drop conf measure (unobservable variable)
head(data)
#> rt response rating participant stimulus discriminability
#> 1 1.61 1 1 1 1 1
#> 2 0.80 -1 2 1 -1 2
#> 3 0.52 -1 1 1 -1 1
#> 4 0.99 1 2 1 -1 1
#> 5 0.94 -1 2 1 -1 2
#> 6 1.35 -1 2 1 -1 2
# 2. Use fitting function
if (FALSE) { # \dontrun{
# Fitting takes very long to run and uses multiple (6) cores with this
# call:
fitRTConfModels(data, models=c("dynWEV", "PCRM"), nRatings = 2,
                logging=FALSE, parallel="both",
                n.cores = c(2,3), # fit two participant-model combinations in parallel
                condition="discriminability") # tell which column is "condition"
} # }
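Before requesting 2 x 3 = 6 workers as in the call above, it can be useful to check how many cores the machine offers. The following is a minimal sketch using parallel::detectCores(); the fallback rule is an assumption for illustration, not the package's own logic.

```r
# Number of cores reported by the system (may be NA on some platforms)
available <- parallel::detectCores()

n.cores <- c(2, 3)  # two model-participant fits, three processes each
if (is.na(available) || prod(n.cores) > available - 1) {
  # Too few cores: fall back to one level of parallelization,
  # to be used with parallel = "models" instead of "both"
  n.cores <- max(1, available - 1, na.rm = TRUE)
}
n.cores
```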