End-to-End Test-Time Training for Long Context

End-to-End Test-Time Training for Long Context [98.4]
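We formulate long-context language modeling as a problem in continual learning rather than in architecture design. Our model continues learning at test time via next-token prediction on the given context, compressing the context it reads into its weights. Overall, our method, a form of Test-Time Training (TTT), is End-to-End (E2E) both at test time (via next-token prediction) and at training time (via meta-learning).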
Reference translation of the paper (metadata) (Mon, 29 Dec 2025 18:30:14 GMT)
A report on Test-Time Training. The abstract states: "our model continues learning at test time via next-token prediction on the given context, compressing the context it reads into its weights. In addition, we improve the model’s initialization for learning at test time via meta-learning at training time. Overall, our method, a form of Test-Time Training (TTT), is End-to-End (E2E) both at test time (via next-token prediction) and training time (via meta-learning), in contrast to previous forms."
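To make "continues learning at test time via next-token prediction" concrete, here is a minimal sketch of the test-time half of the idea only. It is not taken from the paper or its official implementation. The model is a toy bigram table of logits, and the names `ttt_read_context`, `lr`, `vocab_size`, and the repeating toy context are all assumptions made for illustration. The meta-learning step that the paper uses to improve the initialization is not shown; the initialization here is simply all zeros.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ttt_read_context(weights, context, lr=0.5):
    """Read `context` token by token, taking one SGD step per next-token
    prediction. Returns an adapted copy of the weights and the loss observed
    before each update. (Hypothetical helper, for illustration only.)"""
    weights = [row[:] for row in weights]  # leave the initialization untouched
    losses = []
    for prev_tok, next_tok in zip(context, context[1:]):
        probs = softmax(weights[prev_tok])
        losses.append(-math.log(probs[next_tok]))  # next-token cross-entropy
        # Gradient of cross-entropy w.r.t. the logits is (probs - one_hot).
        for j, p in enumerate(probs):
            grad = p - (1.0 if j == next_tok else 0.0)
            weights[prev_tok][j] -= lr * grad
    return weights, losses


vocab_size = 4  # assumed toy vocabulary
init_weights = [[0.0] * vocab_size for _ in range(vocab_size)]
context = [0, 1, 2, 3] * 8  # assumed toy context with a repeating pattern

adapted, losses = ttt_read_context(init_weights, context)
print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}")
```

In this toy setting the loss on later tokens is lower than on the first token, because the pattern seen earlier in the context has been written into the weights rather than kept in an attention cache. In the paper's framing, the starting weights would additionally be meta-learned at training time so that these test-time updates work well.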
The repository is GitHub – test-time-training/e2e: Official JAX implementation of End-to-End Test-Time Training for Long Context
