November 27, 2025 – Latest arXiv paper highlights

Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning [84.7]
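Large language model (LLM) agents are constrained by their dependence on human-curated data. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without any external data. Agent0 substantially improves reasoning capabilities, boosting the Qwen3-8B-Base model by 18% on mathematical reasoning and by 24% on general reasoning benchmarks.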
Reference translation of the paper (metadata) (Thu, 20 Nov 2025 05:01:57 GMT)
A co-evolutionary framework built on multiple agents: "we initialize two functionally distinct agents: an executor agent and a curriculum agent. These agents co-evolve through a symbiotic competition: the curriculum agent is trained using RL (Shao et al., 2024) to propose frontier tasks that precisely challenge the executor’s current capabilities, using the executor’s uncertainty (i.e., self-consistency across multiple answers) and its frequency of tool use as reward signals. Concurrently, the executor agent is trained via RL to successfully solve these tasks, optimizing on a filtered set of challenging problems generated by the frozen curriculum agent and using pseudo-labels derived from its own majority voting. Equipping the executor with a tool enhances its problem-solving abilities, which in turn compels the tool-equipped curriculum agent to generate more complex, tool-based curricula." Similar approaches seem to be gaining popularity in agent construction as well.
The repository is GitHub – aiming-lab/Agent0: [arXiv’25] Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
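The two signals named in the quote, pseudo-labels from majority voting and self-consistency as a measure of executor uncertainty, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the reward shape in `uncertainty_reward`, and the filter thresholds in `keep_task` are all assumptions made here for illustration, and the tool-use reward is left out.

```python
from collections import Counter


def majority_vote(answers):
    """Return (pseudo_label, self_consistency) for a list of sampled answers.

    self_consistency is the fraction of samples that agree with the
    most common answer.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(answers)


def uncertainty_reward(self_consistency):
    """Hypothetical curriculum reward (an assumption, not from the source).

    It peaks at 1.0 when the executor agrees with itself half the time
    and falls to 0.0 when the task is trivially easy or hopeless.
    """
    return 1.0 - 2.0 * abs(self_consistency - 0.5)


def keep_task(self_consistency, low=0.3, high=0.8):
    """Hypothetical filter: keep tasks that are neither too easy nor too hard."""
    return low <= self_consistency <= high


# Four sampled executor answers to one generated task.
label, consistency = majority_vote(["4", "4", "5", "4"])
reward = uncertainty_reward(consistency)
```

Under these assumptions the curriculum agent is rewarded most for tasks the executor answers inconsistently, while the executor trains only on tasks that pass the filter, using `label` as its training target.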

TiDAR: Think in Diffusion, Talk in Autoregression

TiDAR: Think in Diffusion, Talk in Autoregression [59.9]
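TiDAR is a sequence-level hybrid architecture that drafts tokens (Thinking) in diffusion and samples the final outputs (Talking) autoregressively. TiDAR is the first architecture to close the quality gap with AR models while delivering 4.71× to 5.91× more tokens per second.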
Reference translation of the paper (metadata) (Thu, 13 Nov 2025 01:18:11 GMT)
A hybrid of diffusion and autoregressive models: "We introduce TiDAR, a sequence-level hybrid architecture that drafts tokens (Thinking) in Diffusion and samples final outputs (Talking) AutoRegressively – all within a single forward pass using specially designed structured attention masks."
"We extensively evaluate TiDAR against AR models, speculative decoding, and diffusion variants across generative and likelihood tasks at 1.5B and 8B scales. Thanks to the parallel drafting and sampling as well as exact KV cache support, TiDAR outperforms speculative decoding in measured throughput and surpasses diffusion models like Dream and Llada in both efficiency and quality. Most notably, TiDAR is the first architecture to close the quality gap with AR models while delivering 4.71× to 5.91× more tokens per second." It is impressive that the approach has been confirmed to hold up as it scales.
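The general idea of drafting several tokens cheaply and then letting an autoregressive model decide which ones to keep can be shown with a toy sketch. It is a generic greedy draft-and-verify loop, not TiDAR's actual algorithm: the quoted text only says that drafting and sampling happen in a single forward pass with structured attention masks, whereas this sketch calls a hypothetical `verify_next_token` function once per position for readability.

```python
def accept_draft(prefix, draft_tokens, verify_next_token):
    """Keep the longest run of drafted tokens the verifier agrees with.

    verify_next_token(context) is a hypothetical stand-in for an
    autoregressive model's greedy next-token choice. At the first
    disagreement the verifier's own token is emitted and the rest of
    the draft is discarded.
    """
    context = list(prefix)
    accepted = []
    for token in draft_tokens:
        expected = verify_next_token(context)
        if token != expected:
            accepted.append(expected)
            break
        accepted.append(token)
        context.append(token)
    return accepted


def toy_verifier(context):
    """Toy 'model' that always continues a counting sequence."""
    return context[-1] + 1


# The draft is right for two tokens, then goes wrong.
result = accept_draft([1, 2], [3, 4, 9, 10], toy_verifier)
```

Each call yields at least one token and up to the full draft length, which is where the throughput gain of draft-based decoding comes from when drafts are usually correct.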

Virtual Width Networks

Virtual Width Networks [130.5]
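Virtual Width Networks (VWN) is a framework that delivers the benefits of wider representations without the quadratic cost of increasing the hidden size. In large-scale experiments, an 8× expansion accelerates optimization by more than 2× for next-token prediction and more than 3× for next-2-token prediction.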
Reference translation of the paper (metadata) (Fri, 14 Nov 2025 12:41:57 GMT)
A proposed improvement that can be integrated into Transformers: "We introduced Virtual Width Networks (VWN) as a practical mechanism to decouple representational width from the quadratic compute typically associated with widening. With a modest 1.5× expansion, we observe consistent improvements. When scaling to 8× virtual width, optimization accelerates markedly: next-token prediction loss converges more than 2× faster and multi-token prediction loss more than 3× faster relative to the baseline width. Beyond these discrete points, the performance of VWN exhibits a clear scaling behavior." The authors note constraints on the communication and memory side, but the statement "In practice, virtual width expansions in the 1.5×–4× range are more feasible on today’s stacks," is encouraging.

10 Open Challenges Steering the Future of Vision-Language-Action Models

10 Open Challenges Steering the Future of Vision-Language-Action Models [57.8]
Vision-language-action (VLA) models are increasingly prevalent in the embodied AI arena. The paper discusses 10 milestones in the development of VLA models.
Reference translation of the paper (metadata) (Sat, 08 Nov 2025 09:02:13 GMT)
An organized overview of the open challenges for Vision-Language-Action models.
