March 27, 2026 – Introducing the Latest arXiv Papers

MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution [52.3]
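Memory-augmented LLM agents maintain external memory banks to support long-horizon interaction. MemMA is a plug-and-play multi-agent framework that coordinates the memory cycle along both its forward and backward paths.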
Reference translation of the paper (metadata) (Thu, 19 Mar 2026 10:15:59 GMT)
The paper states: "We introduce MEMMA, a plug-and-play multi-agent framework that coordinates the memory cycle along its forward and backward paths. On the forward path, a Meta-Thinker separates strategic reasoning from low-level execution, addressing strategic blindness in construction and retrieval.
On the backward path, in-situ self-evolution converts probe QA failures into direct memory repair before the memory is committed." In other words, the approach improves memory from both directions.
Repository: ventr1c/memma on GitHub

ConflictBench: Evaluating Human-AI Conflict via Interactive and Visually Grounded Environments [43.1]
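The authors introduce ConflictBench, a benchmark that evaluates human-AI conflict through 150 multi-turn scenarios. ConflictBench integrates a text-based simulation engine with a visually grounded world model, enabling agents to perceive, plan, and act under dynamic conditions.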
Reference translation of the paper (metadata) (Mon, 09 Mar 2026 06:59:48 GMT)
The paper describes the benchmark as follows: "we introduce ConflictBench, a benchmark designed to evaluate human–AI conflict through interactive, multi-turn, and multi-modal protocols that better reflect the complex trade-offs agents may face when their goals conflict with human interests." GPT-5 and Qwen score well, which makes me wonder whether they have already been trained to handle this kind of situation...