Assessing Judging Bias in Large Reasoning Models: An Empirical Study

Assessing Judging Bias in Large Reasoning Models: An Empirical Study [99.9]
DeepSeek-R1やOpenAI-o1のような大きな推論モデル(LRM)は、顕著な推論能力を示している。本稿では、主観的嗜好アライメントデータセットと客観的事実ベースデータセットの両方において、LLMとLRMの偏りを判定するベンチマークを示す。
論文参考訳（メタデータ） (Mon, 14 Apr 2025 07:14:27 GMT)
LRMにおけるJudge時のバイアスに関する検証
基本的にLRMのJudgeに関する性能は高く「Through investigation of bandwagon, authority, position, and distraction biases, we uncover four key findings: (1) despite their advanced reasoning capabilities, LRMs remain susceptible to the above biases; (2) LRMs demonstrate better robustness than LLMs specifically on fact-related datasets; (3) LRMs exhibit notable position bias, preferring options in later positions; and (4) we identify a novel “superficial reflection bias” where phrases mimicking reasoning (e g , “wait, let me think…”) significantly influence model judgments.」とのこと。
「We identify a novel “superficial reflection bias” in LRMs, where phrases mimicking reasoning significantly influence judging outcomes, demonstrating how reasoning mechanisms can introduce new vulnerabilities in automated evaluation.」という点、おそらく学習過程によるものであろうということが興味深い。

コメントを残す

コメントを残す コメントをキャンセル