August 20, 2025 – Introducing the latest arXiv papers

TiMoE: Time-Aware Mixture of Language Experts

TiMoE: Time-Aware Mixture of Language Experts [30.8]
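Large language models (LLMs) are typically trained on a fixed snapshot of the web. We address the temporal issues this causes by pre-training a set of GPT-style experts from scratch on two-year slices of a 2013-2024 corpus and combining them with TiMoE. At inference time, TiMoE masks every expert whose training window ends after the query timestamp and merges the log-probabilities of the remaining experts in a shared space.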
Reference translation of the paper (metadata) (Tue, 12 Aug 2025 10:36:36 GMT)
This work handles temporal information with the following approach: "TiMoE demonstrates that partitioning pre-training data into strict time slices and blending the resulting GPT-2 experts through a causal, timestamp-aware router yields language models that stay chronologically grounded without a heavy accuracy penalty. By masking out any expert trained on data newer than the query year, TiMoE eliminates future-knowledge leakage while letting earlier specialists cooperate, cutting temporally inconsistent answers on the new 10k-question TSQA benchmark by roughly 15% and delivering steadier accuracy across years." It is an interesting framework built around time-specific experts. That said, I can't help wondering how it fares in terms of parameter efficiency.
The repository is reportedly at https://github.com/epfml/TiMoE.
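
To make the masking step described above concrete, here is a minimal sketch in Python. It is not the authors' implementation. The names `Expert`, `two_year_windows`, `eligible`, and `merge_log_probs` are hypothetical, each expert is a toy dictionary of token log-probabilities rather than a GPT-2 model, timestamps are reduced to years, and "shared space" is assumed to mean a shared vocabulary. The merge uses a plain uniform mixture by default, which is an assumption; the paper's router may weight the experts differently.

```python
import math
from dataclasses import dataclass


@dataclass
class Expert:
    """Toy stand-in for one time-slice expert (hypothetical structure)."""
    start_year: int   # first year of the training window (inclusive)
    end_year: int     # last year of the training window (inclusive)
    log_probs: dict   # token -> log-probability for one fixed context


def two_year_windows(first=2013, last=2024):
    """Split [first, last] into consecutive two-year windows."""
    return [(y, y + 1) for y in range(first, last + 1, 2)]


def eligible(experts, query_year):
    """Keep only experts whose training window does not end after the query year."""
    return [e for e in experts if e.end_year <= query_year]


def merge_log_probs(experts, query_year, weights=None):
    """Mix the unmasked experts in probability space, returning log-probabilities.

    Assumption: uniform weights unless given. Custom weights must be positive
    and are expected to sum to 1 over the unmasked experts.
    """
    active = eligible(experts, query_year)
    if not active:
        raise ValueError("no expert is available for this query year")
    if weights is None:
        weights = [1.0 / len(active)] * len(active)
    merged = {}
    for tok in active[0].log_probs:
        terms = [math.log(w) + e.log_probs[tok] for w, e in zip(weights, active)]
        m = max(terms)  # log-sum-exp with max subtraction for numerical stability
        merged[tok] = m + math.log(sum(math.exp(t - m) for t in terms))
    return merged


experts = [
    Expert(2013, 2014, {"a": math.log(0.5), "b": math.log(0.5)}),
    Expert(2015, 2016, {"a": math.log(0.9), "b": math.log(0.1)}),
    Expert(2023, 2024, {"a": math.log(0.1), "b": math.log(0.9)}),
]
merged_2016 = merge_log_probs(experts, query_year=2016)
```

For a query dated 2016, the 2023-2024 expert is masked and only the first two experts contribute, so nothing learned from later data can reach the answer. A query earlier than every training window raises an error in this sketch instead of silently falling back to a future expert.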

Web3 x AI Agents: Landscape, Integrations, and Foundational Challenges

Web3 x AI Agents: Landscape, Integrations, and Foundational Challenges [29.3]
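The convergence of Web3 technologies and AI agents represents a rapidly evolving frontier that is reshaping decentralized ecosystems. This paper presents the first and most comprehensive analysis of the intersection of Web3 and AI agents across five key dimensions: landscape, economics, governance, security, and trust mechanisms.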
Reference translation of the paper (metadata) (Mon, 04 Aug 2025 15:44:58 GMT)
A survey, described as follows: "This paper presents the first comprehensive systematic analysis of Web3-AI agent integration, examining 133 active projects with $6.9 billion collective market capitalization to reveal how AI agents fundamentally reshape decentralized ecosystems across the landscape, finance, governance, security, and trust dimensions."

Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation

Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation [117.5]
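Generalist robot policies trained on large-scale datasets such as Open X-Embodiment (OXE) show strong performance across a wide range of tasks. However, they often struggle to generalize beyond the distribution of their training data. We identify shortcut learning as a key obstacle to generalization.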
Reference translation of the paper (metadata) (Fri, 08 Aug 2025 16:14:01 GMT)
The authors report: "Our analysis reveals that large-scale robot datasets like OXE suffer from limited sub-dataset diversity and severe fragmentation, a problem that extends even within individual sub-datasets. This structure inherently promotes shortcut learning, meaning that simply adding more similarly-fragmented data can be detrimental to generalization." Building general-purpose models is hard.
The project site is Shortcut Learning in GRPs.
