October 23, 2025 – Introducing the Latest arXiv Papers

Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks

Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks [23.2]
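Large language models face challenges in long-horizon agentic tasks. Existing working-memory methods rely on external mechanisms that are decoupled from the agent's core policy. This paper proposes Memory-as-Action, a new framework in which the agent actively manages its working memory by executing explicit editing operations as part of a unified policy.
Reference translation of the paper (metadata) (Tue, 14 Oct 2025 15:29:57 GMT)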
The paper proposes a framework described as follows: "This work introduces Memory-as-Action, a framework that treats working memory management as an integral part of an agent’s decision-making process, rather than as an external module. By formalizing memory operations as explicit actions, a single policy can learn to interleave task reasoning with context curation." It argues that handling working-context management and reasoning together in one policy is advantageous.
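To make the idea of "memory operations as explicit actions" concrete, here is a minimal sketch. It is not the paper's implementation: the names `TaskAction`, `PruneAction`, `apply_action`, and `toy_policy` are hypothetical, and the hand-written rule in `toy_policy` only stands in for a learned policy. The point it illustrates is that a single action space can contain both ordinary task steps and edits to the agent's own context.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TaskAction:
    """An ordinary task step; its text is appended to the context."""
    text: str


@dataclass
class PruneAction:
    """A memory edit: drop the listed context entries and add a summary."""
    indices: List[int]
    summary: str


Action = Union[TaskAction, PruneAction]


def apply_action(context: List[str], action: Action) -> List[str]:
    """Return the new context after applying either kind of action."""
    if isinstance(action, TaskAction):
        return context + [action.text]
    drop = set(action.indices)
    kept = [entry for i, entry in enumerate(context) if i not in drop]
    return kept + [f"[summary] {action.summary}"]


def toy_policy(context: List[str], step: int, max_entries: int = 3) -> Action:
    """Hypothetical stand-in for a learned policy (assumption, not from the paper).

    When the context grows past max_entries, it chooses a memory edit that
    condenses everything except the latest entry; otherwise it takes a task step.
    """
    if len(context) > max_entries:
        old = list(range(len(context) - 1))
        return PruneAction(indices=old, summary=f"{len(old)} earlier steps condensed")
    return TaskAction(text=f"observation {step}")


def run(steps: int) -> List[str]:
    """Interleave task steps and memory edits under one policy."""
    context: List[str] = []
    for step in range(steps):
        context = apply_action(context, toy_policy(context, step))
    return context


if __name__ == "__main__":
    for entry in run(6):
        print(entry)
```

In this toy version the decision to prune is a fixed threshold. In the framing quoted above, that decision would instead be made by the same policy that chooses task actions.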

FastUMI-100K: Advancing Data-driven Robotic Manipulation with a Large-scale UMI-style Dataset [55.7]
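We present FastUMI-100K, a large-scale UMI-style multimodal demonstration dataset. FastUMI-100K offers a more scalable, flexible, and adaptable solution to meet the diverse requirements of real-world robot demonstration data. Our dataset integrates multimodal streams including end-effector states, multi-view wrist-mounted fisheye images, and textual annotations.
Reference translation of the paper (metadata) (Thu, 09 Oct 2025 09:57:25 GMT)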
The dataset is described as follows: "Utilizing the FastUMI data collection system [21], we integrated single-arm and dual-arm configurations with adaptable universal finger sleeves to conduct large-scale data collection. In this paper, we introduce the large-scale UMI-style multimodal dataset—FastUMI-100K, which incorporates the dataset of the pioneering work FastUMI and totally comprises over 100,000 demonstration trajectories, collected using both single-arm and dual-arm grippers on the FastUMI platform, equivalent to 600 hours of interactive data."
The repository is GitHub – MrKeee/FastUMI-100K

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization [103.7]
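Long-chain reflective reasoning is a prerequisite for solving complex real-world problems. We construct a benchmark consisting of 1,260 samples across 42 challenging synthetic tasks. We generate post-training data and explore learning paradigms that make use of such data.
Reference translation of the paper (metadata) (Thu, 09 Oct 2025 17:53:58 GMT)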
The paper proposes a benchmark that demands long chains of thought involving trial, failure, and correction: "MM-HELIX contains 42 meticulously curated challenging tasks from diverse online sources, categorized into four domains: Algorithm, Graph, Puzzle, and Game. Each task requires the model to perform careful visual observation, develop a deep understanding of complex rules, and generate an extended chain-of-thought that necessitates reflection and backtracking." GPT-5 performs strongly, and the gap between it and OSS models is large.
The project site is MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization