April 29, 2025 – Introducing the latest arXiv papers

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.1]
The paper introduces PaperCoder, a framework that converts machine learning papers into functional code repositories. PaperCoder operates in three stages: planning, analysis, and generation. It consistently shows strong results on the recently released PaperBench benchmark.
Reference translation of the paper (metadata) (Thu, 24 Apr 2025 01:57:01 GMT)
The paper proposes a three-stage framework: "(1) Planning, where a high-level implementation plan is constructed based on the paper’s content, including overall plan, architectural design, logic design, and configuration files; (2) Analyzing, where the plan is translated into detailed file-level specifications; and (3) Coding, where the final codes are generated to implement the paper’s methods and experiments."
The paper reports: "Results show that 77% of participants preferred PaperCoder’s implementation over alternatives, and 83% found the outputs practically useful for real-world usage." It is interesting that the output is not only preferred over other implementations but also appears to be reasonably useful in practice.
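The three stages quoted above can be pictured as a simple pipeline. The following is only a minimal sketch of that control flow, not PaperCoder's actual implementation: the `LLM` callable, the prompt prefixes (`PLAN`, `FILES`, `SPEC`, `CODE`), the `Plan` class, and the choice to pass the names of already generated files as context are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical LLM interface (an assumption): prompt text in, generated text out.
LLM = Callable[[str], str]


@dataclass
class Plan:
    overview: str     # high-level implementation plan
    files: List[str]  # ordered list of files to generate


def plan_stage(paper: str, llm: LLM) -> Plan:
    """Stage 1 (Planning): derive an overall plan and a file list from the paper."""
    overview = llm(f"PLAN\n{paper}")
    files = [ln.strip() for ln in llm(f"FILES\n{paper}").splitlines() if ln.strip()]
    return Plan(overview=overview, files=files)


def analyze_stage(plan: Plan, llm: LLM) -> Dict[str, str]:
    """Stage 2 (Analyzing): turn the plan into one specification per file."""
    return {name: llm(f"SPEC {name}\n{plan.overview}") for name in plan.files}


def code_stage(plan: Plan, specs: Dict[str, str], llm: LLM) -> Dict[str, str]:
    """Stage 3 (Coding): generate each file from its specification."""
    repo: Dict[str, str] = {}
    for name in plan.files:
        done = ", ".join(repo)  # names of files generated so far (assumed context)
        repo[name] = llm(f"CODE {name}\n{specs[name]}\nDONE: {done}")
    return repo


def paper_to_repo(paper: str, llm: LLM) -> Dict[str, str]:
    """Run the three stages in order and return a mapping of file name to code."""
    plan = plan_stage(paper, llm)
    specs = analyze_stage(plan, llm)
    return code_stage(plan, specs, llm)


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in so the sketch runs without any model."""
    if prompt.startswith("FILES"):
        return "config.yaml\nmodel.py\ntrain.py"
    return f"# generated for: {prompt.splitlines()[0]}"


if __name__ == "__main__":
    for name, code in paper_to_repo("paper text here", fake_llm).items():
        print(name, "->", code)
```

Swapping `fake_llm` for a real model client would be the main change needed to experiment with this structure; the stage boundaries themselves follow the quoted description.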

It’s All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization

It’s All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization [26.4]
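The authors reconceptualize neural networks as associative memory modules that learn a mapping between keys and values using an internal objective called attentional bias. They present three novel sequence models (Moneta, Yaad, and Memora) that go beyond the power of existing linear RNNs while maintaining a fast, parallelizable training process. For example, certain instances of Miras achieve exceptional performance on particular tasks such as language modeling, commonsense reasoning, and recall-intensive tasks, outperforming Transformers and other modern linear recurrent models.
Reference translation of the paper (metadata) (Thu, 17 Apr 2025 17:59:33 GMT)
An exploration of new architectures by Google, proposing the Miras framework: "Building upon our formulation of memory and forget gate, we present Miras, a fundamental framework to design novel sequence modeling architectures by four choice of: (1) Attentional bias (i.e., memory objective), (2) Retention gate, (3) Memory architecture, and (4) Memory learning algorithm (i.e., optimizer)."
The authors select Moneta, Yaad, and Memora as promising architectures and evaluate their performance. The scale is fairly small, at up to 1.3B parameters, but the results look very promising.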
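To make the four design choices more concrete, here is a minimal sketch of one simple point in that design space. It is an illustration under stated assumptions, not Moneta, Yaad, or Memora: the memory architecture is assumed to be a single matrix `M`, the attentional bias is assumed to be the squared error `0.5 * ||M k - v||^2`, the retention gate is assumed to be a scalar decay `retain`, and the learning algorithm is assumed to be one gradient-descent step per token. The function names are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(M: Matrix, x: Vector) -> Vector:
    """Multiply matrix M by vector x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def memory_step(M: Matrix, k: Vector, v: Vector,
                lr: float = 1.0, retain: float = 1.0) -> Matrix:
    """One online update of a linear associative memory.

    Assumed attentional bias: 0.5 * ||M k - v||^2, whose gradient with
    respect to M is the outer product (M k - v) k^T.
    Assumed retention gate: scale the old memory by `retain`.
    Assumed learning algorithm: a single gradient step of size `lr`.
    """
    err = [p - t for p, t in zip(matvec(M, k), v)]
    return [[retain * M[i][j] - lr * err[i] * k[j] for j in range(len(k))]
            for i in range(len(v))]


def read(M: Matrix, q: Vector) -> Vector:
    """Retrieve the value associated with query q."""
    return matvec(M, q)


if __name__ == "__main__":
    M = [[0.0, 0.0], [0.0, 0.0]]
    M = memory_step(M, k=[1.0, 0.0], v=[2.0, 3.0])
    M = memory_step(M, k=[0.0, 1.0], v=[5.0, 7.0])
    print(read(M, [1.0, 0.0]))  # first stored value is recalled
    print(read(M, [0.0, 1.0]))  # second stored value is recalled
```

With orthonormal keys, a step size of 1, and no decay, each stored value is recalled exactly; setting `retain` below 1 makes older associations fade, which is the role a retention gate plays in this simplified picture.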

DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning

DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning [95.3]
DeepMath-103K is a new large-scale dataset of roughly 103K mathematical problems. Each problem includes a verifiable final answer, which enables rule-based RL. The authors demonstrate that models trained on DeepMath-103K improve substantially on challenging mathematical benchmarks.
Reference translation of the paper (metadata) (Tue, 15 Apr 2025 17:59:51 GMT)
A mathematical benchmark dataset with the following characteristics: "Each problem includes a verifiable final answer, enabling rule-based RL, and three distinct R1-generated solutions suitable for diverse training paradigms like supervised fine-tuning or distillation."
The repository is GitHub – zwhe99/DeepMath: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
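As an illustration of why a verifiable final answer enables rule-based RL, the sketch below computes a binary reward by comparing a model's final answer with the reference answer. This is not the verifier used by DeepMath: the assumption that answers appear in a LaTeX `\boxed{...}` expression, the normalization rules, and all function names are hypothetical. The simple regular expression also does not handle nested braces such as `\boxed{\frac{1}{2}}`.

```python
import re
from fractions import Fraction
from typing import Optional


def extract_final_answer(text: str) -> Optional[str]:
    """Return the content of the last boxed expression in text, or None."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None


def normalize(answer: str) -> str:
    """Canonicalize simple numeric answers so that 0.5, 1/2 and 2/4 compare equal."""
    answer = answer.replace(" ", "")
    try:
        return str(Fraction(answer))
    except (ValueError, ZeroDivisionError):
        return answer  # non-numeric answers are compared as plain strings


def rule_based_reward(response: str, reference: str) -> float:
    """Return 1.0 if the extracted final answer matches the reference, else 0.0."""
    predicted = extract_final_answer(response)
    if predicted is None:
        return 0.0
    return 1.0 if normalize(predicted) == normalize(reference) else 0.0


if __name__ == "__main__":
    print(rule_based_reward(r"Hence the result is \boxed{0.5}.", "1/2"))
    print(rule_based_reward("I am not sure.", "1/2"))
```

Because the reward depends only on a deterministic check, no learned reward model is required; a production verifier would need far more robust answer parsing than this.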
