ThetaEvolve: Test-time Learning on Open Problems

ThetaEvolve: Test-time Learning on Open Problems [110.6]
We introduce ThetaEvolve, an open-source framework that simplifies and extends AlphaEvolve to efficiently scale both in-context learning and Reinforcement Learning (RL) at test time. ThetaEvolve with test-time RL consistently outperforms inference-only baselines.
Reference translation of the paper (metadata) (Fri, 28 Nov 2025 18:58:14 GMT)
"We introduce ThetaEvolve, an open-source framework that simplifies and extends AlphaEvolve to efficiently scale both in-context learning and Reinforcement Learning (RL) at test time, allowing models to continually learn from their experiences in improving open optimization problems. ThetaEvolve features a single LLM, a large program database for enhanced exploration, batch sampling for higher throughput, lazy penalties to discourage stagnant outputs, and optional reward shaping for stable training signals, etc." In short, this is research along the lines of an open-source AlphaEvolve. The authors report confirming its effectiveness: "(2) Surprisingly, we show that when scaling test-time compute with ThetaEvolve, a single open-source 8B model, DeepSeek-R1-0528-Qwen3-8B (DeepSeek-AI, 2025), can improve the best-known bounds of two open problems considered in AlphaEvolve".
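To make the quoted feature list a little more concrete, here is a toy sketch of a loop that combines a program database, batch sampling, and a lazy penalty. It is not ThetaEvolve's code: the names `evolve`, `toy_score`, and `toy_mutate` are hypothetical, a random mutation stands in for the LLM, candidates are plain floats rather than programs, and the penalty value is an arbitrary assumption. The collected rewards are only returned, whereas an RL setup would use them as the training signal.

```python
import random


def evolve(score, mutate, seed, steps=20, batch_size=4, lazy_penalty=1.0, rng=None):
    """Toy evolutionary loop (illustrative only, not the ThetaEvolve implementation)."""
    rng = rng or random.Random(0)
    # "Program database": every distinct candidate seen so far, with its score.
    database = [(score(seed), seed)]
    rewards = []
    for _ in range(steps):
        # Batch sampling: draw several parents from the database per step.
        parents = [rng.choice(database)[1] for _ in range(batch_size)]
        for parent in parents:
            child = mutate(parent, rng)  # stand-in for an LLM proposing an edit
            value = score(child)
            if child == parent:
                # Lazy penalty: an unchanged output earns a reduced reward
                # and is not added to the database again.
                rewards.append(value - lazy_penalty)
            else:
                rewards.append(value)
                database.append((value, child))
    best_score, best = max(database, key=lambda item: item[0])
    return best, best_score, rewards


def toy_score(x):
    # Hypothetical objective with its maximum at x = 3.
    return -(x - 3.0) ** 2


def toy_mutate(x, rng):
    # Hypothetical proposer: sometimes "lazy" (returns the parent unchanged).
    if rng.random() < 0.25:
        return x
    return x + rng.gauss(0.0, 0.5)


best, best_score, rewards = evolve(toy_score, toy_mutate, seed=0.0)
print(best, best_score, len(rewards))
```

Because the seed always stays in the database, the best score returned can never be worse than the seed's score; the lazy penalty affects only the reward list, not the stored scores.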
Repository: GitHub – ypwang61/ThetaEvolve: ThetaEvolve: Test-time Learning on Open Problems
