April 18, 2025 – Introducing the latest arXiv papers

Self-Steering Language Models [114.0]
DisCIPL is a method for "self-steering" language models. DisCIPL uses a Planner model to generate task-specific inference programs. Our work opens up a design space of highly parallelized Monte Carlo inference strategies.
Reference translation of the paper (metadata) (Wed, 09 Apr 2025 17:54:22 GMT)
The paper presents its approach as follows: "This paper introduces DISCIPL, a method for “self-steering” LMs where a Planner model generates a task-specific inference program that is executed by a population of Follower models."
The claim that "By decomposing reasoning into planning and execution, our architecture preserves flexibility while enabling orchestration of highly efficient, parallel search patterns." also matches my own experience of what tends to work. It is good to see that the approach is validated thoroughly.
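To make the planner/follower split more concrete, here is a minimal toy sketch in Python. It is not DisCIPL's actual implementation: `toy_planner`, `toy_follower`, and `run_particles` are hypothetical names, the "Planner" is a hand-written function rather than a language model, the "Followers" just sample words at random, and the inference loop is a plain weight-and-resample particle scheme chosen only to illustrate the Monte Carlo flavor mentioned in the abstract.

```python
import random
from typing import Callable, List

# In this sketch, an "inference program" is just a function that scores
# a partial output (assumption made for illustration only).
WeightFn = Callable[[List[str]], float]


def toy_planner(banned_letter: str) -> WeightFn:
    """Hypothetical Planner stand-in: turn a task into a weight function."""
    def weight(partial: List[str]) -> float:
        # Weight 0 rejects a partial output whose newest word breaks the rule.
        return 0.0 if banned_letter in partial[-1] else 1.0
    return weight


def toy_follower(rng: random.Random, vocab: List[str]) -> str:
    """Hypothetical Follower stand-in: propose the next word at random."""
    return rng.choice(vocab)


def run_particles(weight: WeightFn, vocab: List[str], steps: int,
                  n_particles: int, seed: int = 0) -> List[List[str]]:
    """Extend a population of partial outputs, then resample by weight."""
    rng = random.Random(seed)
    particles: List[List[str]] = [[] for _ in range(n_particles)]
    for _ in range(steps):
        # Re-propose until at least one particle satisfies the constraint.
        # Assumes the vocabulary contains at least one valid word.
        while True:
            extended = [p + [toy_follower(rng, vocab)] for p in particles]
            weights = [weight(p) for p in extended]
            if sum(weights) > 0:
                break
        # Resampling keeps only particles with non-zero weight.
        particles = rng.choices(extended, weights=weights, k=n_particles)
    return particles


if __name__ == "__main__":
    program = toy_planner("e")  # toy task: avoid words containing "e"
    vocab = ["cat", "tree", "dog", "bee", "sun"]
    for p in run_particles(program, vocab, steps=5, n_particles=4):
        print(" ".join(p))
```

The point of the sketch is only the division of labor: the constraint logic lives in the program returned by the planner, while the followers stay generic and their proposals can be produced independently for each particle.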

Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models [51.9]
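Recent advances in large language models (LLMs) have greatly enhanced their ability to perform complex reasoning tasks. System 1 reasoning is computationally efficient but yields suboptimal performance. System 2 reasoning often incurs substantial computational cost because of slow thinking and inefficient or unnecessary reasoning behaviors.
Reference translation of the paper (metadata) (Mon, 31 Mar 2025 17:58:07 GMT)
The authors describe the survey as follows: "In this survey, we provide a comprehensive analysis of reasoning economy in both the post-training and test-time inference stages of LLMs, encompassing"
The repository is GitHub – DevoAllen/Awesome-Reasoning-Economy-Papers: Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models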