Self-Improving Embodied Foundation Models – Introducing the latest arXiv papers

Self-Improving Embodied Foundation Models [21.8]
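The paper proposes a two-stage post-training approach for robotics. The first stage, Supervised Fine-Tuning (SFT), fine-tunes pretrained foundation models using both a) behavioral cloning and b) steps-to-go prediction objectives. In the second stage, steps-to-go prediction makes it possible to extract a well-shaped reward function and a robust success detector.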
Reference translation of the paper (metadata) (Thu, 18 Sep 2025 17:00:08 GMT)
The paper proposes the following approach: "1) Supervised Fine-Tuning (SFT) wherein we fine-tune EFMs using behavioral cloning as well as “steps-to-go” prediction objectives, and 2) Self-Improvement (Online RL) wherein EFMs autonomously practice downstream tasks and rapidly improve via optimizing self-predicted rewards." (EFM = Embodied Foundation Models). The authors report that it was effective: "Finally, we demonstrated that this novel combination uniquely unlocks a capability not possible by current methods: autonomously acquiring new skills that generalize far beyond the tasks covered in the imitation learning datasets. These findings highlight the transformative potential of combining pretrained foundation models with online Self-Improvement to enable autonomous skill acquisition in robotics."
The project site is Anonymous Supplementary Videos for “On the Magic of Online Self-Improvement for Embodied Multimodal Foundation Models”
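
The abstract says that steps-to-go prediction yields a shaped reward function and a success detector, but this summary does not give the formulas. The Python sketch below shows one plausible reading, as an assumption rather than the authors' implementation: the reward for a transition is the decrease in predicted steps-to-go, and success is declared when the prediction falls to or below a threshold. The function names, the threshold value, and the example predictions are all hypothetical.

```python
from typing import List, Sequence


def shaped_rewards(steps_to_go: Sequence[float]) -> List[float]:
    """Assumed reward shaping: reward for transition t -> t+1 is the
    decrease in the model's predicted steps-to-go."""
    return [
        steps_to_go[t] - steps_to_go[t + 1]
        for t in range(len(steps_to_go) - 1)
    ]


def is_success(predicted_steps_to_go: float, threshold: float = 1.0) -> bool:
    """Assumed success detector: the goal counts as reached when the
    predicted steps-to-go is at or below a (hypothetical) threshold."""
    return predicted_steps_to_go <= threshold


if __name__ == "__main__":
    # Hypothetical per-timestep steps-to-go predictions for one episode.
    predictions = [10.0, 8.0, 9.0, 5.0, 0.5]
    print(shaped_rewards(predictions))   # positive when the robot gets closer
    print(is_success(predictions[-1]))   # thresholded final prediction
```

Under this reading the rewards telescope: their sum over an episode equals the first prediction minus the last, so an episode is rewarded for net progress toward the goal rather than for any single step.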
