March 23, 2026 – Introducing the latest arXiv papers

AI Can Learn Scientific Taste

AI Can Learn Scientific Taste [123.0]
Great scientists possess strong judgement and foresight, qualities closely tied to what we call scientific taste. Here, we use the term to refer to the capacity to judge and propose research ideas with high potential impact. Our results show that AI can learn scientific taste, marking a key step toward human-level AI scientists.
Reference translation of the paper (metadata) (Sun, 15 Mar 2026 16:31:51 GMT)
The paper states: "Great scientists possess not only technical skill but also strong judgement and foresight, qualities closely tied to what we call scientific taste [1, 2]. We use the term to refer to the capacity to judge and propose research ideas with high potential impact." It is an effort to build an AI with a sense for good science. Under the definition "we use the term scientific taste to refer to the ability to judge and generate research ideas with high potential impact", the authors claim to have surpassed the frontier APIs in common use. Interesting, though a definition of this kind is hard to pin down.
Repository: GitHub – tongjingqi/AI-Can-Learn-Scientific-Taste: We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference modeling and alignment problem.
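The repository description frames scientific taste learning as a preference modeling problem. As a rough illustration of what pairwise preference modeling can look like, here is a minimal sketch of a Bradley-Terry style objective. This is an assumption for illustration only: the digest does not say which objective RLCF actually uses, and the function names and scalar scores below are hypothetical.

```python
import math


def preference_probability(score_preferred: float, score_other: float) -> float:
    """Bradley-Terry probability that the first idea is preferred over the second.

    The scores are hypothetical scalar outputs of some scoring model.
    """
    return 1.0 / (1.0 + math.exp(-(score_preferred - score_other)))


def pairwise_loss(score_preferred: float, score_other: float) -> float:
    """Negative log-likelihood of the observed preference.

    The loss shrinks as the preferred idea's score rises above the other's.
    """
    return -math.log(preference_probability(score_preferred, score_other))
```

With equal scores the model is indifferent (probability 0.5), and the loss falls as the idea favored by the community signal is scored higher than its counterpart.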

Mamba-3: Improved Sequence Modeling using State Space Principles

Mamba-3: Improved Sequence Modeling using State Space Principles [74.4]
We introduce three core methodological improvements inspired by the state space model (SSM) viewpoint of linear models. Together with architectural refinements, the Mamba-3 model achieves significant gains across retrieval, state-tracking, and downstream language modeling tasks.
Reference translation of the paper (metadata) (Mon, 16 Mar 2026 17:30:08 GMT)
The paper states: "We combine: (1) a more expressive recurrence derived from SSM discretization, (2) a complex-valued state update rule that enables richer state tracking, and (3) a multi-input, multi-output (MIMO) formulation for better model performance without increasing decode latency." and "At 1.5B scale, Mamba-3 (MIMO) improves downstream language modeling accuracy by +2.2 over Transformers, +1.9 points over Mamba-2, and +1.8 over GDN, while Mamba-3 (SISO) improves over the next best model, GDN, by +0.6 points." This is the latest version of Mamba. Many frontier models use a hybrid of Transformer and state space model layers, so expectations are high.
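To give some intuition for why a complex-valued state update might help with state tracking, here is a minimal sketch of a one-dimensional linear recurrence whose transition is an input-dependent rotation in the complex plane. It is not Mamba-3's update rule; the function name and the choice of a rotation by pi per 1-bit are assumptions made purely for illustration.

```python
import cmath


def parity_via_rotation(bits):
    """Track the parity of a bit sequence with a complex linear recurrence.

    Each 1-bit rotates the state by pi radians (multiplies it by -1);
    each 0-bit leaves it unchanged. This toy recurrence is illustrative
    and is not taken from the Mamba-3 paper.
    """
    state = complex(1.0, 0.0)
    for bit in bits:
        transition = cmath.exp(1j * cmath.pi * bit)  # input-dependent rotation
        state = transition * state
    # The real part is close to +1 for even parity and -1 for odd parity.
    return 0 if state.real > 0 else 1
```

For example, `parity_via_rotation([1, 0, 1, 1])` returns 1 because the sequence contains three 1-bits. A real-valued state that is only scaled by a non-negative decay factor can never change sign, so it cannot flip back and forth in this way.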

When AI Navigates the Fog of War

When AI Navigates the Fog of War [23.9]
We study the early stages of the 2026 Middle East conflict, which unfolded after the training cutoff of current frontier models. We construct 11 critical temporal nodes, 42 node-specific verifiable questions, and 5 general exploratory questions. This study serves as an archival snapshot of model reasoning during an unfolding geopolitical crisis.
Reference translation of the paper (metadata) (Tue, 17 Mar 2026 15:13:10 GMT)
The paper claims: "Our analysis suggests three main takeaways. First, model responses often show strong strategic reasoning, going beyond surface rhetoric to attend to structural incentives, particularly in settings involving military posture, deterrence, and material constraints. Second, this capability is uneven across domains: models are generally more reliable in economically and logistically structured settings than in politically ambiguous multi-actor environments. Third, their narratives evolve over time, shifting from early expectations of rapid containment toward more systemic accounts of escalation, exhaustion, and fragile de-escalation."
Because the events are still in progress, a later retrospective will no doubt be essential, but I think this is also a report worth keeping on record precisely because the situation is still unfolding.
