World-in-World: World Models in a Closed-Loop World

World-in-World: World Models in a Closed-Loop World [123.9]
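We introduce World-in-World, the first open platform that benchmarks world models (WMs) in a closed-loop world that mirrors real agent-environment interactions. We curate four closed-loop environments that rigorously evaluate diverse WMs, prioritize task success as the primary metric, and move beyond the common focus on visual quality. The study uncovers three surprises: 1) visual quality alone does not guarantee task success, and controllability matters more; 2) scaling post-training with action-observation data is more effective than upgrading the pretrained video generator; 3) allocating more inference-time compute allows WMs to substantially improve closed-loop performance.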
Reference translation of the paper (metadata) (Mon, 20 Oct 2025 22:09:15 GMT)
A benchmark for visual generation models used as world models. It points out that there is a gap between visual quality and quality as a world model.
- We introduce World-in-World, the first comprehensive closed-loop benchmark that evaluates world models through the lens of embodied interaction, moving beyond the common focus on generation quality.
- We propose a unified closed-loop planning strategy with a unified action API, allowing diverse world models to be seamlessly integrated and evaluated within a single framework across four embodied tasks.
- We discover that high visual quality does not necessarily guarantee task success, and demonstrate how the performance of pretrained video generators can be substantially improved through training-time data scaling and inference-time scaling.
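To make the "closed-loop" idea in the bullets above concrete, here is a minimal sketch of planning with a world model: at each step the agent imagines the outcome of candidate action sequences, executes only the first action of the best one in the environment, observes the result, and replans. Everything in it is an assumption for illustration (the 1-D toy world, `toy_world_model`, `toy_env_step`, `plan_next_action`, `run_episode`, and the exhaustive search over plans); it is not the World-in-World action API or the paper's planning strategy.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

# Hypothetical toy setup: the state is an integer position on a line,
# and the agent can move left, stay, or move right.
ACTIONS = (-1, 0, 1)


def toy_world_model(state: int, action: int) -> int:
    """Stand-in for a learned world model: predicts the next state."""
    return state + action


def toy_env_step(state: int, action: int) -> int:
    """Stand-in for the real environment (here it happens to match the model)."""
    return state + action


def imagined_cost(state: int, plan: Sequence[int], goal: int,
                  world_model: Callable[[int, int], int]) -> int:
    """Roll a plan out inside the world model and sum the distance to the goal."""
    cost = 0
    for action in plan:
        state = world_model(state, action)
        cost += abs(goal - state)
    return cost


def plan_next_action(state: int, goal: int,
                     world_model: Callable[[int, int], int],
                     horizon: int) -> int:
    """Score every candidate plan in imagination; return the best plan's first action."""
    best_plan = min(
        product(ACTIONS, repeat=horizon),
        key=lambda plan: imagined_cost(state, plan, goal, world_model),
    )
    return best_plan[0]


def run_episode(start: int, goal: int, horizon: int = 3,
                max_steps: int = 20) -> Tuple[bool, int]:
    """Closed loop: plan, act in the environment, observe, replan.

    Returns (task_success, number_of_environment_steps_taken).
    """
    state = start
    for step in range(max_steps):
        if state == goal:
            return True, step
        action = plan_next_action(state, goal, toy_world_model, horizon)
        state = toy_env_step(state, action)  # only the real step changes the state
    return state == goal, max_steps


if __name__ == "__main__":
    print(run_episode(start=0, goal=5))
```

The episode is scored by whether the goal is reached, not by how good the imagined rollouts look, which mirrors the post's point that task success is the primary metric. In a real setting the candidate plans would be sampled or proposed by a policy rather than enumerated, and the world model would be a video generator conditioned on actions.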
The project site is World-in-World: World Models in a Closed-Loop World
