The Maths Garden update algorithm is a gradient-based parameter estimation method designed specifically for computer adaptive testing systems. It was developed for the Maths Garden platform, an educational game that adapts mathematical problems to individual student abilities in real time.
In this vignette, we will explore the mathematical foundations of the Maths Garden algorithm, its implementation in the meow
framework, and how to use and customize it for your research.
Mathematical Foundation
Core Update Equations
The Maths Garden algorithm implements simultaneous updates for both person abilities ($\theta$) and item difficulties ($\beta$) using gradient-based learning. The update equations are:
Person Ability Update:
$$\theta_j \leftarrow \theta_j + K_\theta \sum_{i \in I_j} \left( S_{ij} - E(S_{ij}) \right)$$
Item Difficulty Update:
$$\beta_i \leftarrow \beta_i + K_\beta \sum_{j \in J_i} \left( E(S_{ij}) - S_{ij} \right)$$
Where:
- $\theta_j$ is the ability of person $j$
- $\beta_i$ is the difficulty of item $i$
- $S_{ij}$ is the observed response (0 or 1) of person $j$ to item $i$
- $E(S_{ij})$ is the expected probability of a correct response
- $K_\theta$ and $K_\beta$ are the learning rates for abilities and difficulties, respectively
- $I_j$ is the set of items responded to by person $j$
- $J_i$ is the set of persons who responded to item $i$
Expected Response Probability
The expected probability $E(S_{ij})$ is calculated using the logistic function:
$$E(S_{ij}) = \frac{\exp(\theta_j - \beta_i)}{1 + \exp(\theta_j - \beta_i)}$$
This is the 1-parameter logistic (1PL) item response model, also known as the Rasch model.
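In R, this probability is computed with stats::plogis(), which is exactly what the implementation below uses. As a quick illustration (this snippet is not part of the package), the probabilities for a few values of $\theta_j - \beta_i$ are:
# Probability of a correct response for theta - beta = -1, 0, 1
stats::plogis(c(-1, 0, 1))
#> [1] 0.2689414 0.5000000 0.7310586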
Intuition Behind the Updates
Person Ability Updates: If a person performs better than expected (more correct responses than predicted), their ability estimate increases. If they perform worse than expected, their ability estimate decreases.
Item Difficulty Updates: If an item is answered correctly more often than expected, its difficulty decreases (becomes easier). If it’s answered correctly less often than expected, its difficulty increases (becomes harder).
Learning Rates: The parameters $K_\theta$ and $K_\beta$ control how quickly the estimates change. Larger values lead to faster adaptation but may cause instability.
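As a small numeric illustration (not part of the package), suppose a person with $\theta_j = 0$ answers an item with $\beta_i = 0$ correctly. The expected probability is 0.5, so with learning rates of 0.1 the ability estimate rises by 0.05 and the difficulty estimate falls by 0.05:
# One observed correct response (S_ij = 1) at theta = 0, beta = 0
E <- stats::plogis(0 - 0)          # expected probability: 0.5
theta_new <- 0 + 0.1 * (1 - E)     # ability update: 0 + 0.1 * 0.5 = 0.05
beta_new  <- 0 + 0.1 * (E - 1)     # difficulty update: 0 - 0.05 = -0.05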
Implementation in meow
Function Overview
The update_maths_garden()
function implements this algorithm:
update_maths_garden <- function(theta, diff, resp) {
  # Calculate expected probabilities using logistic function
  E_Sij <- stats::plogis(theta[resp$id] - diff[resp$item])
  # Learning rates (K values) - these could be tuned
  K_theta <- 0.1 # Learning rate for ability
  K_beta <- 0.1 # Learning rate for difficulty
  # Update theta (ability) for each person
  theta_updated <- theta
  for (j in unique(resp$id)) {
    # Get responses for person j
    resp_j <- resp[resp$id == j, ]
    # Calculate update term
    update_term <- K_theta * sum(resp_j$resp - E_Sij[resp$id == j])
    theta_updated[j] <- theta[j] + update_term
  }
  # Update beta (difficulty) for each item
  beta_updated <- diff
  for (i in unique(resp$item)) {
    # Get responses for item i
    resp_i <- resp[resp$item == i, ]
    # Calculate update term
    update_term <- K_beta * sum(E_Sij[resp$item == i] - resp_i$resp)
    beta_updated[i] <- diff[i] + update_term
  }
  out <- list(
    theta_est = theta_updated,
    diff_est = beta_updated,
    resp_cur = resp
  )
  return(out)
}
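The function can also be called directly, outside of a meow() run. The following toy example is only a sketch; it assumes, as the indexing in the function body implies, that resp is a data frame whose id and item columns are integer indices into theta and diff and whose resp column holds the 0/1 responses:
# Toy data: 2 persons, 2 items, all parameters starting at 0
theta <- c(0, 0)
diff  <- c(0, 0)
resp  <- data.frame(
  id   = c(1, 1, 2, 2),
  item = c(1, 2, 1, 2),
  resp = c(1, 0, 1, 1)
)
upd <- update_maths_garden(theta, diff, resp)
upd$theta_est  # updated ability estimates
upd$diff_est   # updated difficulty estimates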
Step-by-Step Breakdown
1. Expected Probability Calculation
E_Sij <- stats::plogis(theta[resp$id] - diff[resp$item])
This calculates the expected probability of a correct response for each person-item combination using the logistic function.
2. Person Ability Updates
for (j in unique(resp$id)) {
  resp_j <- resp[resp$id == j, ]
  update_term <- K_theta * sum(resp_j$resp - E_Sij[resp$id == j])
  theta_updated[j] <- theta[j] + update_term
}
For each person, this:
- Collects all responses from that person
- Calculates the sum of prediction errors (observed - expected)
- Multiplies by the learning rate
- Updates the ability estimate
3. Item Difficulty Updates
for (i in unique(resp$item)) {
  resp_i <- resp[resp$item == i, ]
  update_term <- K_beta * sum(E_Sij[resp$item == i] - resp_i$resp)
  beta_updated[i] <- diff[i] + update_term
}
For each item, this:
- Collects all responses to that item
- Calculates the sum of prediction errors (expected - observed)
- Multiplies by the learning rate
- Updates the difficulty estimate
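Because each update is just a learning rate times a sum of prediction errors, the two loops above can also be written in vectorized form. The sketch below is an equivalent reformulation rather than the package implementation, and it makes the same assumption as the loops, namely that id and item are integer indices into theta and diff:
# Vectorized equivalent of the two update loops (sketch)
K_theta <- 0.1
K_beta  <- 0.1
E_Sij <- stats::plogis(theta[resp$id] - diff[resp$item])
err   <- resp$resp - E_Sij                 # observed minus expected
theta_err <- tapply(err, resp$id, sum)     # summed errors per person
beta_err  <- tapply(err, resp$item, sum)   # summed errors per item
theta_updated <- theta
beta_updated  <- diff
idx_p <- as.integer(names(theta_err))
idx_i <- as.integer(names(beta_err))
theta_updated[idx_p] <- theta[idx_p] + K_theta * theta_err
beta_updated[idx_i]  <- diff[idx_i] - K_beta * beta_err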
Using the Maths Garden Algorithm
Basic Usage
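A minimal call passes update_maths_garden() as the update function to meow(), mirroring the arguments used in the complete workflow at the end of this vignette:
library(meow)
results <- meow(
  select_fun = select_max_info,
  update_fun = update_maths_garden,
  data_loader = data_simple_1pl,
  data_args = list(N_persons = 100, N_items = 50)
)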
Customizing Learning Rates
You can modify the learning rates by creating a wrapper function:
update_maths_garden_custom <- function(theta, diff, resp, K_theta = 0.05, K_beta = 0.05) {
  # Calculate expected probabilities using logistic function
  E_Sij <- stats::plogis(theta[resp$id] - diff[resp$item])
  # Update theta (ability) for each person
  theta_updated <- theta
  for (j in unique(resp$id)) {
    resp_j <- resp[resp$id == j, ]
    update_term <- K_theta * sum(resp_j$resp - E_Sij[resp$id == j])
    theta_updated[j] <- theta[j] + update_term
  }
  # Update beta (difficulty) for each item
  beta_updated <- diff
  for (i in unique(resp$item)) {
    resp_i <- resp[resp$item == i, ]
    update_term <- K_beta * sum(E_Sij[resp$item == i] - resp_i$resp)
    beta_updated[i] <- diff[i] + update_term
  }
  out <- list(
    theta_est = theta_updated,
    diff_est = beta_updated,
    resp_cur = resp
  )
  return(out)
}

# Use with custom learning rates
results <- meow(
  select_fun = select_max_info,
  update_fun = update_maths_garden_custom,
  data_loader = data_simple_1pl,
  update_args = list(K_theta = 0.05, K_beta = 0.05),
  data_args = list(N_persons = 100, N_items = 50)
)
Advantages and Limitations
Advantages
- Simplicity: The algorithm is straightforward to implement and understand
- Real-time Updates: Both person and item parameters are updated simultaneously
- No Iteration Required: Updates are computed directly without iterative optimization
- Computational Efficiency: Fast computation suitable for real-time applications
- Interpretable: The update rules have clear intuitive meaning
Limitations
- Fixed Learning Rates: The algorithm uses constant learning rates that don’t adapt to data
- No Uncertainty Quantification: Doesn’t provide confidence intervals or standard errors
- Potential Instability: Large learning rates can cause parameter estimates to oscillate
- Assumes 1PL Model: Based on the Rasch model, so it may not fit data requiring 2PL or 3PL models
- No Prior Information: Doesn’t incorporate prior knowledge about parameters
Extensions and Modifications
Adaptive Learning Rates
You can implement adaptive learning rates that change based on the amount of data:
update_maths_garden_adaptive <- function(theta, diff, resp) {
  E_Sij <- stats::plogis(theta[resp$id] - diff[resp$item])
  # Adaptive learning rates based on number of responses
  for (j in unique(resp$id)) {
    resp_j <- resp[resp$id == j, ]
    n_responses <- nrow(resp_j)
    # Decrease learning rate with more responses
    K_theta_adaptive <- 0.1 / (1 + n_responses * 0.1)
    update_term <- K_theta_adaptive * sum(resp_j$resp - E_Sij[resp$id == j])
    theta[j] <- theta[j] + update_term
  }
  for (i in unique(resp$item)) {
    resp_i <- resp[resp$item == i, ]
    n_responses <- nrow(resp_i)
    # Decrease learning rate with more responses
    K_beta_adaptive <- 0.1 / (1 + n_responses * 0.1)
    update_term <- K_beta_adaptive * sum(E_Sij[resp$item == i] - resp_i$resp)
    diff[i] <- diff[i] + update_term
  }
  out <- list(
    theta_est = theta,
    diff_est = diff,
    resp_cur = resp
  )
  return(out)
}
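As with the custom learning-rate version, the adaptive variant can be passed directly to meow():
results <- meow(
  select_fun = select_max_info,
  update_fun = update_maths_garden_adaptive,
  data_loader = data_simple_1pl,
  data_args = list(N_persons = 100, N_items = 50)
)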
Constrained Updates
You can add constraints to prevent extreme parameter values:
update_maths_garden_constrained <- function(theta, diff, resp,
                                            theta_bounds = c(-4, 4),
                                            diff_bounds = c(-4, 4)) {
  E_Sij <- stats::plogis(theta[resp$id] - diff[resp$item])
  # Update theta with constraints
  for (j in unique(resp$id)) {
    resp_j <- resp[resp$id == j, ]
    update_term <- 0.1 * sum(resp_j$resp - E_Sij[resp$id == j])
    theta[j] <- theta[j] + update_term
    # Apply constraints
    theta[j] <- max(theta_bounds[1], min(theta_bounds[2], theta[j]))
  }
  # Update diff with constraints
  for (i in unique(resp$item)) {
    resp_i <- resp[resp$item == i, ]
    update_term <- 0.1 * sum(E_Sij[resp$item == i] - resp_i$resp)
    diff[i] <- diff[i] + update_term
    # Apply constraints
    diff[i] <- max(diff_bounds[1], min(diff_bounds[2], diff[i]))
  }
  out <- list(
    theta_est = theta,
    diff_est = diff,
    resp_cur = resp
  )
  return(out)
}
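The bounds can be changed through update_args, in the same way the learning rates were passed in the earlier example; the values below are only illustrative:
results <- meow(
  select_fun = select_max_info,
  update_fun = update_maths_garden_constrained,
  data_loader = data_simple_1pl,
  update_args = list(theta_bounds = c(-3, 3), diff_bounds = c(-3, 3)),
  data_args = list(N_persons = 100, N_items = 50)
)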
Best Practices
- Start with Small Learning Rates: Begin with small values (e.g., $K_\theta = K_\beta = 0.1$ or lower) and adjust based on results
- Monitor Convergence: Check whether parameter estimates stabilize over iterations (see the sketch after this list)
- Use Constraints: Implement bounds on parameter values to prevent extreme estimates
- Consider Adaptive Rates: Use decreasing learning rates for more stable long-term estimates
- Validate Results: Compare with other estimation methods when possible
- Test with Different Data: Ensure the algorithm works well with your specific data characteristics
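One simple way to monitor convergence outside of a meow() run is to apply the update repeatedly to the same response set and stop once the estimates stop moving. This is only a sketch, reusing theta, diff, and resp from the toy example earlier in this vignette:
for (iter in 1:100) {
  upd <- update_maths_garden(theta, diff, resp)
  max_change <- max(abs(upd$theta_est - theta), abs(upd$diff_est - diff))
  theta <- upd$theta_est
  diff  <- upd$diff_est
  if (max_change < 1e-4) break  # estimates have stabilized
}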
Example: Complete Workflow
# Load required packages
library(meow)

# Define custom Maths Garden function with adaptive rates
update_maths_garden_improved <- function(theta, diff, resp) {
  E_Sij <- stats::plogis(theta[resp$id] - diff[resp$item])
  # Adaptive learning rates
  for (j in unique(resp$id)) {
    resp_j <- resp[resp$id == j, ]
    n_responses <- nrow(resp_j)
    K_theta_adaptive <- 0.1 / (1 + n_responses * 0.05)
    update_term <- K_theta_adaptive * sum(resp_j$resp - E_Sij[resp$id == j])
    theta[j] <- theta[j] + update_term
    theta[j] <- max(-4, min(4, theta[j])) # Constraints
  }
  for (i in unique(resp$item)) {
    resp_i <- resp[resp$item == i, ]
    n_responses <- nrow(resp_i)
    K_beta_adaptive <- 0.1 / (1 + n_responses * 0.05)
    update_term <- K_beta_adaptive * sum(E_Sij[resp$item == i] - resp_i$resp)
    diff[i] <- diff[i] + update_term
    diff[i] <- max(-4, min(4, diff[i])) # Constraints
  }
  out <- list(
    theta_est = theta,
    diff_est = diff,
    resp_cur = resp
  )
  return(out)
}

# Run simulation
results <- meow(
  select_fun = select_max_info,
  update_fun = update_maths_garden_improved,
  data_loader = data_simple_1pl,
  data_args = list(N_persons = 100, N_items = 50, data_seed = 123)
)

# Analyze results
print(head(results$results))