Fit a mixed-subjects 2PL calibration with iterative EM
Source: R/fit.R
Extends fit_mixed_subjects() by iterating the E-step and M-step until
convergence rather than fixing posterior quadrature weights at the initial
parameter estimates. At every iteration the posterior weights for all three
datasets (observed, predicted, generated) are recomputed using the same
current item parameters. This keeps the posteriors internally consistent and
avoids the asymmetry between L_pred and L_gen that arises when frozen
human-MLE weights are applied to LLM data with different item parameters.
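The E-step described above can be illustrated with a minimal base-R sketch. It assumes the slope-intercept 2PL form P(y = 1) = plogis(a * theta + d) and a discretized standard-normal prior; the helper `posterior_weights()` is hypothetical and is not the package's internal implementation. It shows that the posterior weights depend on the item parameters, which is why they need to be recomputed whenever the parameters change.

```r
# Hypothetical helper: posterior quadrature weights for a 2PL model.
# resp  : N x J matrix of 0/1 responses (NA allowed)
# a, d  : length-J slope and intercept vectors
# theta : length-Q quadrature nodes; prior: length-Q prior weights summing to 1
posterior_weights <- function(resp, a, d, theta, prior) {
  # eta[q, j] = a_j * theta_q + d_j
  eta <- outer(theta, a) + matrix(d, length(theta), length(a), byrow = TRUE)
  log_p1 <- plogis(eta, log.p = TRUE)   # log P(y = 1 | theta_q)
  log_p0 <- plogis(-eta, log.p = TRUE)  # log P(y = 0 | theta_q)
  # Indicator matrices; missing responses contribute nothing to the likelihood
  y1 <- (!is.na(resp) & resp == 1) * 1
  y0 <- (!is.na(resp) & resp == 0) * 1
  log_lik <- y1 %*% t(log_p1) + y0 %*% t(log_p0)   # N x Q
  log_post <- sweep(log_lik, 2, log(prior), "+")
  log_post <- log_post - apply(log_post, 1, max)   # stabilize before exp()
  post <- exp(log_post)
  post / rowSums(post)                             # each row sums to 1
}

# Toy data simulated in base R (parameter values are illustrative only)
set.seed(1)
theta <- seq(-4, 4, length.out = 31)
prior <- dnorm(theta)
prior <- prior / sum(prior)
a <- c(1, 1.2, 0.9)
d <- c(0, -0.5, 0.3)
ability <- rnorm(40)
prob <- plogis(outer(ability, a) + matrix(d, 40, 3, byrow = TRUE))
resp <- matrix(rbinom(length(prob), 1, prob), nrow = 40)

w_old <- posterior_weights(resp, a, d, theta, prior)
w_new <- posterior_weights(resp, a * 1.5, d, theta, prior)
max(abs(w_new - w_old))  # nonzero: frozen weights would ignore this change
```

In an iterative scheme, a call like this would be repeated for each dataset with the same current `a` and `d` before every M-step.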
Usage
fit_mixed_subjects_iterative(
  observed,
  predicted,
  generated,
  lambda = 1,
  n_quad = 31,
  initial_pars = NULL,
  quadrature = NULL,
  common_predicted_weights = TRUE,
  paired_missing = c("match_observed", "allow"),
  slope_lower = 1e-04,
  slope_upper = NULL,
  tol = 1e-04,
  em_maxit = 30,
  control = list(maxit = 200),
  ...
)
Arguments
- observed
Human response matrix, with rows for subjects and columns for items. Values must be binary when initial_pars is omitted.
- predicted
Binary LLM responses (0/1) for the same rows and items as observed. Probabilities are not accepted: fractional values are not a valid likelihood input for the marginal IRT objective and break the PPI correction, so sample binary responses from any probabilities first (e.g. with rbinom()).
- generated
Binary generated or unlabeled LLM responses (0/1) for the same item columns. Probabilities are not accepted (see predicted).
- lambda
Power-tuning parameter in [0, 1].
- n_quad
Number of standard-normal quadrature nodes.
- initial_pars
Optional starting item parameters. If omitted, a 2PL model is fit to observed.
- quadrature
Optional quadrature grid with theta and weight columns.
- common_predicted_weights
Logical; if TRUE, reuse the observed human posterior weights for predicted.
- paired_missing
How to handle missingness when common_predicted_weights = TRUE. The default, "match_observed", requires observed and predicted to have the same missingness pattern so the paired LLM correction is evaluated only where a human label is present. Use "allow" only for explicit sensitivity analyses.
- slope_lower
Lower bound for discrimination parameters during optimization. Use NULL for no lower bound.
- slope_upper
Upper bound on discrimination parameters. Strongly recommended when lambda > 0: the iterative EM updates posteriors at each step, and without an upper bound the gradient asymmetry between L_pred and L_gen can compound across iterations, driving discrimination estimates to extreme values. A typical choice is slope_upper = 4 or slope_upper = 6.
- tol
Convergence tolerance: maximum absolute change in any parameter across an EM iteration.
- em_maxit
Maximum number of EM iterations.
- control
Control list passed to stats::optim().
- ...
Additional arguments passed to fit_2pl() when initial_pars is omitted.
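The input requirements above can be sketched in base R. This is an illustration only: `llm_prob` is a hypothetical matrix of LLM-reported probabilities, the toy `observed` matrix is random, and the assumption that a data frame with normalized weights is an acceptable `quadrature` grid is not confirmed by this page beyond the `theta` and `weight` column names.

```r
set.seed(1)
n <- 40  # subjects
k <- 3   # items

# Hypothetical LLM probabilities; these must not be passed in directly
llm_prob <- matrix(runif(n * k), nrow = n)

# Sample binary responses from the probabilities first
predicted <- matrix(rbinom(n * k, 1, llm_prob), nrow = n)

# Toy human responses with a few missing entries
observed <- matrix(rbinom(n * k, 1, 0.5), nrow = n)
observed[sample(n * k, 5)] <- NA

# For paired_missing = "match_observed", mirror the human missingness pattern
predicted[is.na(observed)] <- NA
stopifnot(identical(is.na(observed), is.na(predicted)))

# A standard-normal quadrature grid with theta and weight columns
# (normalizing the weights to sum to 1 is an assumption)
theta <- seq(-4, 4, length.out = 31)
weight <- dnorm(theta)
quadrature <- data.frame(theta = theta, weight = weight / sum(weight))
```

Sampling introduces extra Monte Carlo noise into `predicted`, so fixing the seed keeps a calibration reproducible.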
Value
An object of class "mixedsubjects_fit" with the standard fields
plus em_iterations (number of EM cycles completed) and em_converged
(logical).
Details
Note on lambda selection. This function accepts a fixed lambda. For
psychometric applications where accurate ability scoring is the goal, select
lambda with tune_lambda_ability_risk() rather than tune_lambda_ppi_score().
The PPI++ score objective minimizes the trace of the item-parameter
covariance matrix; tune_lambda_ability_risk() minimizes the propagated
ability-score risk g' Sigma g, which is the quantity that matters for
downstream test scoring.
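The difference between the two criteria can be seen on a toy example. The covariance matrix `Sigma` and the gradient vector `g` below are made-up numbers for a single item with parameters (a, d); how the package actually builds these quantities is not described here, so this is a sketch of the two formulas only.

```r
# Toy covariance of the item-parameter estimates (a, d); values are illustrative
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2, byrow = TRUE)

# Hypothetical gradient of an ability score with respect to (a, d)
g <- c(0.2, 1.0)

# Trace criterion: total variance of the item parameters
trace_crit <- sum(diag(Sigma))

# Propagated ability-score risk: g' Sigma g
risk_crit <- drop(t(g) %*% Sigma %*% g)

c(trace = trace_crit, risk = risk_crit)
```

The trace weights every parameter equally, while g' Sigma g weights each variance and covariance by how strongly the ability score responds to that parameter, so the two criteria can prefer different values of lambda.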
Examples
set.seed(1)
pars <- data.frame(a = c(1, 1.2, 0.9), d = c(0, -0.5, 0.3))
observed <- simulate_2pl(rnorm(40), pars)
predicted <- observed
generated <- simulate_2pl(rnorm(100), pars)
fit <- fit_mixed_subjects_iterative(
  observed, predicted, generated,
  lambda = 0.5, initial_pars = pars, n_quad = 7,
  control = list(maxit = 50), em_maxit = 5
)
#> Warning: fit_mixed_subjects_iterative() with lambda > 0 and no slope_upper can diverge to extreme discrimination values. Setting slope_upper (e.g. slope_upper = 6) is strongly recommended.
fit$item_pars
#>   item         a          d          b
#> 1    1 1.2132598 -0.2037054  0.1678992
#> 2    2 1.3220410 -0.9830630  0.7435949
#> 3    3 0.5810652  0.1202592 -0.2069634