Virtual talk given to the Berkeley Evaluation and Assessment Research Seminar. Slides available here.
Abstract: High-stakes admissions testing is often carried out using items with pre-calibrated parameters. Though this approach often works well in practice, factors such as item order and time pressure can modify the testing context enough that pre-calibrated parameters are no longer valid. One example occurred during the 2014 administration of ENEM, the national Brazilian college entrance exam. Although all students took the same items, students exposed to one particular ordering performed the worst in math, implying that item order affects student performance and that the color of your test booklet could potentially determine whether or not you are able to attend college. Previous approaches that model position effects as variation in either item parameters or person abilities may make unreasonable homogeneity assumptions. To address this gap, we propose an item response model that treats position effects as both person-side and item-side by modeling heterogeneity in individual response processes over the course of the test. Here an individual’s encounter with an item is treated as a smoothly varying mixture of how the student would interact with the item if encountered early in the test and how the student would interact with the item were it encountered late in the test, weighted by the item’s actual position. This directly models a difference in response processes between students who are “fresh” and “fatigued,” estimating ability net of individual endurance. This presentation will focus on the estimation and properties of this model derived from simulation and look at applications to the 2014 ENEM Math administration.
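To make the mixture idea concrete, here is a minimal sketch of a position-weighted mixture of two 2PL item response curves. All names, the 2PL form, and the linear position weight are illustrative assumptions for this post; the model presented in the talk may parameterize the "early" and "late" response processes and the mixing weight differently.

```python
import numpy as np

def p_correct(theta_early, theta_late, a_early, b_early,
              a_late, b_late, position, n_items):
    """Probability of a correct response as a position-weighted mixture
    of an 'early-test' (fresh) and a 'late-test' (fatigued) 2PL curve.

    Illustrative sketch: assumes a linear mixing weight in item position,
    w = 0 for the first item (fully fresh), w = 1 for the last (fully fatigued).
    """
    w = (position - 1) / (n_items - 1)
    p_early = 1 / (1 + np.exp(-a_early * (theta_early - b_early)))
    p_late = 1 / (1 + np.exp(-a_late * (theta_late - b_late)))
    return (1 - w) * p_early + w * p_late
```

Under this sketch, a student with high "fresh" ability but low "fatigued" ability would show declining success probabilities as the same item is pushed later in the booklet, which is the kind of ordering effect observed in the 2014 ENEM math data.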