
Extends fit_mixed_subjects() by iterating the E-step and M-step until convergence rather than fixing posterior quadrature weights at the initial parameter estimates. At every iteration the posterior weights for all three datasets (observed, predicted, generated) are recomputed using the same current item parameters. This keeps the posteriors internally consistent and avoids the asymmetry between L_pred and L_gen that arises when frozen human-MLE weights are applied to LLM data with different item parameters.
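The E-step described above can be sketched in base R. This is only an illustration, not the package's implementation: posterior_weights() is a hypothetical helper, the 2PL model is assumed to be in slope-intercept form (P(y = 1 | theta) = plogis(a * theta + d)), and the equally spaced grid on [-4, 4] is an assumption. The point it shows is that both datasets are scored with the same current item parameters.

```r
# Hypothetical helper (not part of the package): posterior quadrature weights
# for a 2PL model, assuming P(y = 1 | theta) = plogis(a * theta + d).
posterior_weights <- function(responses, a, d, theta, prior) {
  y <- as.matrix(responses)
  # Response probabilities at each node: n_nodes x n_items
  p <- plogis(outer(theta, a) + matrix(d, length(theta), length(a), byrow = TRUE))
  # Missing responses contribute nothing to the log-likelihood
  obs <- !is.na(y)
  y[!obs] <- 0
  # Log-likelihood of each subject at each node: n_subjects x n_nodes
  loglik <- (y * obs) %*% t(log(p)) + ((1 - y) * obs) %*% t(log1p(-p))
  # Add the log prior, stabilize, and normalize each row to sum to 1
  logpost <- sweep(loglik, 2, log(prior), "+")
  logpost <- logpost - apply(logpost, 1, max)
  w <- exp(logpost)
  w / rowSums(w)
}

set.seed(1)
theta <- seq(-4, 4, length.out = 31)          # assumed grid
prior <- dnorm(theta) / sum(dnorm(theta))
a <- c(1, 1.2, 0.9)                           # current item parameters
d <- c(0, -0.5, 0.3)
observed  <- matrix(rbinom(15, 1, 0.5), nrow = 5)  # toy data, 5 subjects x 3 items
predicted <- matrix(rbinom(15, 1, 0.5), nrow = 5)

# Same (a, d) for every dataset, recomputed whenever the M-step updates them
w_obs  <- posterior_weights(observed,  a, d, theta, prior)
w_pred <- posterior_weights(predicted, a, d, theta, prior)
```

In an iterative scheme, these weights would be recomputed after each M-step update of a and d rather than held at their initial values.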

Usage

fit_mixed_subjects_iterative(
  observed,
  predicted,
  generated,
  lambda = 1,
  n_quad = 31,
  initial_pars = NULL,
  quadrature = NULL,
  common_predicted_weights = TRUE,
  paired_missing = c("match_observed", "allow"),
  slope_lower = 1e-04,
  slope_upper = NULL,
  tol = 1e-04,
  em_maxit = 30,
  control = list(maxit = 200),
  ...
)

Arguments

observed

Human response matrix, with rows for subjects and columns for items. Values must be binary when initial_pars is omitted.

predicted

Binary LLM responses (0/1) for the same rows and items as observed. Probabilities are not accepted: fractional values are not a valid likelihood input for the marginal IRT objective and break the PPI correction, so sample binary responses from any probabilities first (e.g. rbinom).
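A minimal sketch of the sampling step mentioned above, assuming the LLM output is available as a subjects-by-items probability matrix. The matrix prob is hypothetical toy data.

```r
set.seed(1)
# Hypothetical matrix of LLM-predicted probabilities (40 subjects x 3 items)
prob <- matrix(runif(40 * 3), nrow = 40, ncol = 3)

# Draw one Bernoulli response per cell and restore the matrix shape
predicted <- matrix(
  rbinom(length(prob), size = 1, prob = prob),
  nrow = nrow(prob),
  dimnames = dimnames(prob)
)
```

Because the draw is random, setting a seed keeps the resulting fit reproducible.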

generated

Binary generated or unlabeled LLM responses (0/1) for the same item columns. Probabilities are not accepted (see predicted).

lambda

Power-tuning parameter in [0, 1].

n_quad

Number of standard-normal quadrature nodes.

initial_pars

Optional starting item parameters. If omitted, a 2PL model is fit to observed.

quadrature

Optional quadrature grid with theta and weight columns.
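One way to build such a grid is sketched below. The equally spaced nodes on [-4, 4] and the normalization of the weights to sum to 1 are assumptions; this page does not state which node placement or weight scaling the function expects.

```r
n_quad <- 31
theta  <- seq(-4, 4, length.out = n_quad)   # assumed node range
weight <- dnorm(theta)                      # standard-normal density at each node

# Data frame with the two documented columns, weights normalized (assumption)
quadrature <- data.frame(theta = theta, weight = weight / sum(weight))
```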

common_predicted_weights

Logical; if TRUE, reuse the observed human posterior weights for predicted.

paired_missing

How to handle missingness when common_predicted_weights = TRUE. The default, "match_observed", requires observed and predicted to have the same missingness pattern so the paired LLM correction is evaluated only where a human label is present. Use "allow" only for explicit sensitivity analyses.
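The requirement for "match_observed" can be checked before fitting. The sketch below uses toy matrices and is an assumption about what "same missingness pattern" means (identical NA positions); the function's internal check may differ.

```r
observed  <- matrix(c(1, 0, NA, 1, 1, 0), nrow = 2)
predicted <- matrix(c(1, 1, 0, 0, 1, 0), nrow = 2)

# TRUE only if NAs occur in exactly the same cells
same_pattern <- identical(is.na(observed), is.na(predicted))

# Mask predicted wherever the human label is missing
predicted_masked <- predicted
predicted_masked[is.na(observed)] <- NA
```

Masking only helps when predicted has values where observed is missing; an NA in predicted with a human label present still leaves the patterns different.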

slope_lower

Lower bound for discrimination parameters during optimization. Use NULL for no lower bound.

slope_upper

Upper bound on discrimination parameters. Strongly recommended when lambda > 0: the iterative EM recomputes the posteriors at every step, and without an upper bound the gradient asymmetry between L_pred and L_gen can compound across iterations and drive discrimination estimates to extreme values. A typical choice is slope_upper = 4 or slope_upper = 6. Leaving it at the default NULL with lambda > 0 triggers a warning.

tol

Convergence tolerance for the EM loop, applied to the maximum absolute change in any parameter from one EM iteration to the next.
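The criterion can be illustrated with two hypothetical parameter tables from successive iterations. Which columns enter the comparison, and whether the comparison is strict, are assumptions here.

```r
old_pars <- data.frame(a = c(1.00, 1.20, 0.90), d = c(0.00, -0.50, 0.30))
new_pars <- data.frame(a = c(1.00005, 1.20002, 0.89998),
                       d = c(0.00003, -0.50001, 0.30004))
tol <- 1e-04

# Largest absolute change in any single parameter
max_change <- max(abs(as.matrix(new_pars) - as.matrix(old_pars)))
converged  <- max_change < tol
```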

em_maxit

Maximum number of EM iterations.

control

Control list passed to stats::optim().

...

Additional arguments passed to fit_2pl() when initial_pars is omitted.

Value

An object of class "mixedsubjects_fit" with the standard fields plus em_iterations (number of EM cycles completed) and em_converged (logical).

Details

Note on lambda selection. This function accepts a fixed lambda. For psychometric applications where accurate ability scoring is the goal, select lambda with tune_lambda_ability_risk() rather than tune_lambda_ppi_score(). The PPI++ score objective behind tune_lambda_ppi_score() minimizes the trace of the item-parameter covariance matrix, whereas tune_lambda_ability_risk() minimizes the propagated ability-score risk g' Sigma g, which is the quantity that matters for downstream test scoring.
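The two criteria can rank the same covariance matrix differently. The toy example below uses a hypothetical 2 x 2 Sigma and a hypothetical vector g (read here, as an assumption, as the sensitivity of an ability score to the item parameters); neither comes from the package.

```r
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.02), nrow = 2)   # hypothetical item-parameter covariance
g <- c(1, 2)                               # hypothetical sensitivity vector

trace_criterion <- sum(diag(Sigma))            # sums the variances only
ability_risk    <- drop(t(g) %*% Sigma %*% g)  # weights variances and covariances by g
```

The trace ignores both g and the off-diagonal covariance, so a lambda that minimizes it need not minimize the ability-score risk.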

Examples

set.seed(1)
pars <- data.frame(a = c(1, 1.2, 0.9), d = c(0, -0.5, 0.3))
observed <- simulate_2pl(rnorm(40), pars)
predicted <- observed
generated <- simulate_2pl(rnorm(100), pars)
fit <- fit_mixed_subjects_iterative(
  observed, predicted, generated,
  lambda = 0.5, initial_pars = pars, n_quad = 7,
  control = list(maxit = 50), em_maxit = 5
)
#> Warning: fit_mixed_subjects_iterative() with lambda > 0 and no slope_upper can diverge to extreme discrimination values. Setting slope_upper (e.g. slope_upper = 6) is strongly recommended.
fit$item_pars
#>   item         a          d          b
#> 1    1 1.2132598 -0.2037054  0.1678992
#> 2    2 1.3220410 -0.9830630  0.7435949
#> 3    3 0.5810652  0.1202592 -0.2069634
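The printed columns are consistent with a slope-intercept parameterization in which the difficulty is b = -d / a. This relationship is inferred from the output above, not stated on this page, and can be checked on the printed values:

```r
# Values copied from the printed fit$item_pars above
item_pars <- data.frame(
  item = 1:3,
  a = c(1.2132598, 1.3220410, 0.5810652),
  d = c(-0.2037054, -0.9830630, 0.1202592),
  b = c(0.1678992, 0.7435949, -0.2069634)
)

# Difficulty implied by slope and intercept
b_from_ad <- -item_pars$d / item_pars$a
max_diff  <- max(abs(b_from_ad - item_pars$b))  # rounding-level difference only
```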