Diffusion Beats Autoregressive in Data-Constrained Settings

Diffusion Beats Autoregressive in Data-Constrained Settings [46.1]
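Autoregressive (AR) models have long dominated the landscape of large language models. Recently, diffusion-based language models have emerged as a promising alternative, though their advantages over AR models remain underexplored.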
Reference translation of the paper (metadata) (Mon, 21 Jul 2025 17:59:57 GMT)
The paper states: "In this paper, we systematically study masked diffusion models in data-constrained settings—where training involves repeated passes over limited data—and find that they significantly outperform AR models when compute is abundant but data is scarce. Diffusion models make better use of repeated data, achieving lower validation loss and superior downstream performance." Intuitively, that seems right to me.
The repository is Diffusion Beats Autoregressive in Data-Constrained Settings
