April 11, 2025 – Introducing the latest arXiv papers

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond [88.6]
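Large reasoning models (LRMs) have shown substantial performance gains by scaling up the length of chain-of-thought (CoT) reasoning during inference. A growing concern is their tendency to produce excessively long reasoning traces. This inefficiency poses significant challenges for training, inference, and real-world deployment.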
Reference translation of the paper (metadata) (Thu, 27 Mar 2025 15:36:30 GMT)
A survey that describes itself as follows: "In this survey, we provide a comprehensive overview of recent efforts aimed at improving reasoning efficiency in LRMs, with a particular focus on the unique challenges that arise in this new paradigm." As I also noted for Fugu-MT paper translation (abstract): Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models, the cycle of new method → new challenge → comprehensive survey is moving extremely fast.
The repository is GitHub – XiaoYee/Awesome_Efficient_LRM_Reasoning: A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Measurement of LLM’s Philosophies of Human Nature [113.5]
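The authors design a standardized psychological scale targeting large language models (LLMs). Current LLMs show a lack of trust in humans. The paper proposes a mental loop learning framework that lets LLMs continuously optimize their value systems.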
Reference translation of the paper (metadata) (Thu, 03 Apr 2025 06:22:19 GMT)
Proposes the "Machine-based Philosophies of Human Nature Scale (M-PHNS)", a tool for assessing how LLMs view human nature. The reported result is interesting: "Most models exhibit varying degrees of negative tendencies, such as perceiving humans as untrustworthy, selfish, and volatile. These tendencies intensify as the intelligence level of the model increases. This phenomenon is consistent regardless of the model’s developer or whether the model is open-source." The authors also propose a framework for correcting these tendencies, though whether that is a good thing is somewhat unclear.
The repository is kodenii/M-PHNS · GitHub