Name
Predicting and Evaluating Item Responses Using Machine Learning, Text Embeddings, and LLMs
Description

This study explores how artificial intelligence can support assessment development by simulating student responses to field-test items on a social-emotional learning assessment. We compare three methods for generating synthetic response data: (1) machine learning models trained on actual student data, (2) the same models augmented with semantic text embeddings of item content, and (3) large language models (LLMs) prompted with student profiles. Using DESSA responses from 3,982 high school students, we evaluate each method’s predictive accuracy and its ability to replicate item parameters estimated from real data. Preliminary results suggest that machine learning models produce the most accurate and reliable synthetic responses, while LLMs show promise but require further refinement. This work highlights the potential for AI to reduce field-testing burdens while supporting construct validity, and it offers insights into the tradeoffs between structured and generative modeling approaches in educational assessment.

Date & Time
Tuesday, March 3, 2026, 4:55 PM - 5:40 PM
Location Name
Strand 10 - 2nd Fl.