Introducing the latest arXiv papers

DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life

DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life [46.1]
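We present DailyDilemmas, a dataset of 1,360 moral dilemmas encountered in everyday life. Each dilemma includes two possible actions, and for each action, the affected parties and the human values it invokes. We analyzed these values through the lens of five popular theories inspired by sociology, psychology, and philosophy.
Reference translation of the paper (metadata) (Thu, 03 Oct 2024 17:08:52 GMT)
A dataset of moral dilemmas.
The repository is at https://github.com/kellycyy/daily_dilemmas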

LLMs Are In-Context Reinforcement Learners

LLMs Are In-Context Reinforcement Learners [30.2]
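Large language models (LLMs) can learn new tasks through in-context supervised learning (ICL). This work studies whether this ability extends to in-context reinforcement learning (ICRL). The paper proposes an algorithm that addresses the deficiency of naive ICRL by increasing test-time compute, along with a compute-bound approximation.
Reference translation of the paper (metadata) (Mon, 07 Oct 2024 17:45:00 GMT)
A proposal for in-context reinforcement learning in this style: "ICRL is a natural combination of ICL and reinforcement learning (RL). Instead of constructing the LLM context from supervised input-output pairs, the LLM context is constructed using triplets consisting of input, model output prediction, and the corresponding rewards." It is interesting that the naive implementation does not work well. The stated reason: "Its poor performance is due to its incapacity to explore the output space."
The project site is LLMs Are In-Context Reinforcement Learners (lil-lab.github.io)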
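
To make the triplet idea concrete, here is a minimal sketch of how a context might be assembled from (input, prediction, reward) triplets instead of supervised input-output pairs. The `Episode` class, the `build_icrl_context` function, and the prompt layout are all hypothetical and are not taken from the paper. This corresponds only to the naive construction, which the paper reports works poorly; its exploration fix is not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    """One past interaction: the input, the model's own prediction, and its reward."""
    text: str
    prediction: str
    reward: float


def build_icrl_context(episodes: List[Episode], new_input: str) -> str:
    """Concatenate past triplets and end with the new input awaiting a prediction.

    The "Input / Prediction / Reward" layout is an assumption for illustration.
    """
    blocks = [
        f"Input: {ep.text}\nPrediction: {ep.prediction}\nReward: {ep.reward}"
        for ep in episodes
    ]
    # The final block has no prediction or reward yet: the model fills it in.
    blocks.append(f"Input: {new_input}\nPrediction:")
    return "\n\n".join(blocks)
```

In use, each model prediction and the reward it receives would be appended to `episodes` before building the context for the next input.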

A Survey on In-context Learning

A Survey on In-context Learning [77.8]
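In-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP). The survey first presents a formal definition of ICL and clarifies its relation to related work. It then organizes and discusses advanced techniques, including training strategies, prompt design strategies, and related analysis.
Reference translation of the paper (metadata) (Fri, 27 Sep 2024 02:55:06 GMT)
A survey of in-context learning.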

Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement

Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement [117.9]
Gödel Agent is a self-evolving framework inspired by the Gödel machine. Gödel Agent can achieve continuous self-improvement, surpassing manually crafted agents in performance, efficiency, and generalizability.
Reference translation of the paper (metadata) (Sun, 06 Oct 2024 10:49:40 GMT)
"we introduce Gödel Agent, a self-evolving framework inspired by the Gödel machine, enabling agents to recursively improve themselves without relying on predefined routines or fixed optimization algorithms." The paper proposes an agent that can improve itself and reports that the approach is effective. It is a framework for improving the agent's own behavior and does not appear to be an implementation that improves the LLM itself.
The authors note that "Currently, Gödel Agent is not sufficiently stable and may be prone to error accumulation, hindering its ability to continue self-optimization." Even so, seeing this kind of research advance feels like a glimpse of the future.
The repository is GitHub – Arvid-pku/Godel_Agent: Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement

DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory

DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory [96.4]
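The paper introduces DelTA, a document-level translation agent for large language models (LLMs). DelTA features a multi-level memory structure that stores information across various granularities and spans. Experimental results show that DelTA significantly outperforms strong baselines in translation consistency and quality.
Reference translation of the paper (metadata) (Thu, 10 Oct 2024 17:30:09 GMT)
A machine translation agent built on LLMs. It maintains Proper Noun Records, a Bilingual Summary, Long-Term Memory, and Short-Term Memory.
The repository is GitHub – YutongWang1216/DocMTAgent: Code and data releases for the paper -- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory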
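
As a rough illustration of what a four-part memory like this could look like as a data structure, here is a minimal sketch. The `TranslationMemory` class, its field names, the window size of 3, and the "first translation wins" rule for proper nouns are all assumptions made for this example; they are not taken from the DocMTAgent code.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional, Tuple

SentencePair = Tuple[str, str]  # (source sentence, translated sentence)


@dataclass
class TranslationMemory:
    """Hypothetical container for the four memory components named above."""
    proper_nouns: Dict[str, str] = field(default_factory=dict)  # Proper Noun Records
    source_summary: str = ""                                    # Bilingual Summary (source side)
    target_summary: str = ""                                    # Bilingual Summary (target side)
    long_term: List[SentencePair] = field(default_factory=list)
    # Short-term memory only keeps the most recent pairs (window size is assumed).
    short_term: Deque[SentencePair] = field(default_factory=lambda: deque(maxlen=3))

    def record(self, source: str, target: str,
               new_nouns: Optional[Dict[str, str]] = None) -> None:
        """Store one translated sentence and any proper nouns seen in it."""
        pair = (source, target)
        self.long_term.append(pair)
        self.short_term.append(pair)
        for src_name, tgt_name in (new_nouns or {}).items():
            # Assumption: keep the first translation so later sentences stay consistent.
            self.proper_nouns.setdefault(src_name, tgt_name)
```

An agent built this way would read from the memory when prompting the LLM for the next sentence and call `record` after each translation. The summaries would be refreshed periodically by a separate LLM call, which is not shown.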

Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models

Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models [105.0]
The authors propose a multi-lingual ability extraction and transfer approach named MAET. The key idea is to decompose and extract language-agnostic, ability-related weights from large language models. Experimental results show that MAET can effectively extract and transfer advanced abilities and outperforms training-based baseline methods.
Reference translation of the paper (metadata) (Thu, 10 Oct 2024 11:23:18 GMT)
"Our key idea is to decompose and extract language-agnostic ability-related weights from LLMs, and transfer them across different languages by simple addition and subtraction operations without training." The paper proposes MAET (Multi-lingual Ability Extraction and Transfer), a method for extracting multilingual abilities and merging them into a model. The authors report: "Our approach MAET achieves better performance than the competitive baseline methods (e.g., continual pre-training and model merging with task vector) in multi-lingual complex reasoning tasks, including mathematical reasoning tasks and scientific reasoning tasks."
The repository is at https://github.com/RUCAIBox/MAET
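
The "simple addition and subtraction operations" can be pictured with generic weight-delta arithmetic. The sketch below is not the MAET algorithm: MAET is compared against task-vector merging as a baseline, and its own decomposition step is not reproduced here. The function names, the toy weight dictionaries, and the coefficients `alpha` and `beta` are hypothetical.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # parameter name -> flattened values


def subtract(a: Weights, b: Weights) -> Weights:
    """Element-wise a - b for every parameter."""
    return {name: [x - y for x, y in zip(a[name], b[name])] for name in a}


def add_scaled(base: Weights, delta: Weights, scale: float) -> Weights:
    """Element-wise base + scale * delta for every parameter."""
    return {name: [x + scale * d for x, d in zip(base[name], delta[name])]
            for name in base}


def transfer_ability(base: Weights, ability_model: Weights,
                     language_model: Weights,
                     alpha: float = 1.0, beta: float = 1.0) -> Weights:
    """Add an ability delta and a language delta onto the base weights, without training."""
    ability_delta = subtract(ability_model, base)    # what ability tuning changed
    language_delta = subtract(language_model, base)  # what language tuning changed
    merged = add_scaled(base, ability_delta, alpha)
    return add_scaled(merged, language_delta, beta)
```

With real models the same arithmetic would run over tensors in each checkpoint's state dictionary. Plain lists are used here only to keep the example dependency-free.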

SELU: Self-Learning Embodied MLLMs in Unknown Environments

SELU: Self-Learning Embodied MLLMs in Unknown Environments [35.6]
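Multimodal large language models (MLLMs) have demonstrated strong visual understanding and decision-making capabilities. This paper proposes SELU, a new self-learning paradigm inspired by the actor-critic paradigm in reinforcement learning.
Reference translation of the paper (metadata) (Fri, 04 Oct 2024 10:40:11 GMT)
"We propose a self-learning paradigm for embodied MLLMs, SELU, inspired by the actor-critic paradigm in reinforcement learning, which enables MLLMs to self-adapt to unknown environments." Self-X combined with embodied agents: research that feels very much like the future.
An MLLM critic evaluates the actor acting in the environment, which is a framework that has become popular recently. The distinguishing feature is that the Actor MLLM and the Critic MLLM are each fine-tuned separately (reportedly better than SELU One, which uses the same MLLM for both).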

Biased AI can Influence Political Decision-Making

Biased AI can Influence Political Decision-Making [64.9]
This paper examines how partisan bias in AI language models affects political decision-making. Participants exposed to politically biased models were significantly more likely to adopt opinions and make decisions aligned with the AI's bias.
Reference translation of the paper (metadata) (Tue, 08 Oct 2024 22:56:00 GMT)
The paper makes two points: "We found that participants exposed to politically biased models were significantly more likely to adopt opinions and make decisions aligning with the AI’s bias, regardless of their personal political partisanship." and "However, we also discovered that prior knowledge about AI could lessen the impact of the bias, highlighting the possible importance of AI education for robust bias mitigation." Education does seem to help, but I suspect the problem will only grow from here.

Data Selection via Optimal Control for Language Models

Data Selection via Optimal Control for Language Models [134.7]
This work aims to improve the capabilities of LMs for downstream use by selecting high-quality pre-training data from massive corpora. It introduces PMP-based Data Selection (PDS), a framework that approximates optimal data selection by solving the PMP conditions. The benefits of PDS extend to 400B models trained on 10T tokens, as evidenced by extrapolating test loss curves according to scaling laws.
Reference translation of the paper (metadata) (Wed, 09 Oct 2024 17:06:57 GMT)
A proposal for a data selection method that applies control theory: "by treating data selection as the control variables (i.e., whether a data point is included in pre-training), the LM pre-training process as the dynamic system, and the LM’s downstream performance as the objective, we leverage Pontryagin’s Maximum Principle (PMP; 63) to derive the necessary conditions for optimal data selection in theory." Given that "The overhead of running PDS to select data is only about 1/9 of that of pre-training a 1.7B model.", it looks practical.
The project site is Advancing AI for Humanity (thegenerality.com), and the repository is LMOps/data_selection at main · microsoft/LMOps · GitHub

Agent S: An Open Agentic Framework that Uses Computers Like a Human

Agent S: An Open Agentic Framework that Uses Computers Like a Human [31.2]
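The authors present Agent S, an open agentic framework that enables autonomous interaction with computers through a Graphical User Interface (GUI). Agent S aims to address three key challenges in automating computer tasks: acquiring domain-specific knowledge, planning over long task horizons, and handling dynamic, non-uniform interfaces.
Reference translation of the paper (metadata) (Thu, 10 Oct 2024 17:43:51 GMT)
A proposal for an agent framework that operates a computer the way a person does.
The repository is GitHub – simular-ai/Agent-S: Official codebase for Agent S, a open agentic framework that uses computers like a human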
