Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction

Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction [26.4]
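This paper presents a large-scale analysis of human-AI difficulty alignment across more than 20 models spanning diverse domains. The results reveal a systematic misalignment that scaling up model size does not reliably resolve. High performance often hinders accurate difficulty estimation, because models struggle to simulate the capability limits of students.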
Reference translation of the paper (metadata) (Sun, 21 Dec 2025 20:41:36 GMT)
A study on the task of having models predict the difficulty of questions. "This study demonstrates that Large Language Models currently struggle to align with human perception of difficulty despite their advanced problem-solving capabilities. We find that increasing model scale does not guarantee better alignment but rather fosters a machine consensus that systematically diverges from student reality." An interesting result. It looks like a major challenge for educational use, and also something to keep in mind in general use.
The repository is GitHub – MingLiiii/Difficulty_Alignment: Can LLMs Estimate Student Struggles? Human-LLM Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction
