Social Implementation – Page 2 – Introducing the Latest arXiv Papers

Measurement of LLM’s Philosophies of Human Nature

Measurement of LLM’s Philosophies of Human Nature [113.5]
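We design a standardized psychological scale targeting large language models (LLMs). Current LLMs exhibit a lack of trust in humans. This paper proposes a mental-loop learning framework that lets an LLM continually optimize its value system.
Reference translation of the paper (metadata) (Thu, 03 Apr 2025 06:22:19 GMT)
Proposes the "Machine-based Philosophies of Human Nature Scale (M-PHNS)", a tool for evaluating how LLMs view human nature. The finding "Most models exhibit varying degrees of negative tendencies, such as perceiving humans as untrustworthy, selfish, and volatile. These tendencies intensify as the intelligence level of the model increases. This phenomenon is consistent regardless of the model’s developer or whether the model is open-source." is interesting. The paper also proposes a framework for correcting these tendencies, though whether that is a good idea is somewhat unclear.
The repository is kodenii/M-PHNS · GitHub.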

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models [101.7]
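Multimodal foundation models (MMFMs) play an important role in a variety of applications, including autonomous driving, healthcare, and virtual assistants. Existing benchmarks for multimodal models mainly evaluate the helpfulness of these models or focus only on limited perspectives such as fairness and privacy. To comprehensively evaluate the safety and trustworthiness of MMFMs, we propose MMDT (Multimodal DecodingTrust), the first unified platform.
Reference translation of the paper (metadata) (Wed, 19 Mar 2025 01:59:44 GMT)
Proposes a trustworthiness evaluation framework for multimodal foundation models. The main perspectives covered are safety, hallucination, fairness, privacy, adversarial robustness, and out-of-distribution (OOD) robustness. Because it targets MMFMs, both text-to-image (T2I) and image-to-text (I2T) models are included.
The project site is MMDecodingTrust Benchmark, which also hosts a leaderboard. On average, commercial models appear to score higher than open models, but it is interesting that the picture differs greatly depending on the evaluation axis.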

Generative Models in Decision Making: A Survey

Generative Models in Decision Making: A Survey [63.7]
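Generative models can be incorporated into decision-making systems by generating trajectories that guide agents toward high-reward state-action regions or intermediate sub-goals. This paper surveys the application of generative models to decision-making tasks.
Reference translation of the paper (metadata) (Mon, 24 Feb 2025 12:31:28 GMT)
A survey of generative models (Energy Based Models (EBMs), Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Normalizing Flows (NFs), Diffusion Models (DMs), GFlowNets (GFNs), and Autoregressive Models (AMs)) in decision making. The envisioned applications are "robot control, autonomous driving, games, structural generation, and optimization."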

Shh, don’t say that! Domain Certification in LLMs

Shh, don’t say that! Domain Certification in LLMs [124.6]
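Large language models (LLMs) are often deployed to perform constrained tasks within narrow domains. Domain certification is a guarantee that accurately characterizes the out-of-domain behavior of a language model. The paper then proposes VALID, a simple yet effective approach that provides an adversarial bound as a certificate.
Reference translation of the paper (metadata) (Wed, 26 Feb 2025 17:13:19 GMT)
Proposes Verified Adversarial LLM Output via Iterative Dismissal (VALID), a method for keeping a model from giving answers outside its intended domain even when it receives arbitrary input.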

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path? [37.1]
Unchecked AI agency poses significant risks to public safety and security. We discuss how these risks arise from current AI training methods. We propose core building blocks for further advancing the development of non-agentic AI systems.
Reference translation of the paper (metadata) (Mon, 24 Feb 2025 18:14:15 GMT)
A paper on how to respond in a setting that combines AGI or ASI with agentic systems: "As we implement agentic AI systems, we should ask ourselves whether and how these less desirable traits will also arise in the artificial setting, especially in the case of anticipated future AI systems with intelligence comparable to humans (often called AGI, for artificial general intelligence) or superior to humans (ASI, for artificial superintelligence)." Yoshua Bengio is the lead author.

Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey

Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey [92.4]
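Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of AI-generated content (AIGC). RAG supplies reliable, up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks. Despite RAG's success and potential, recent studies show that the RAG paradigm also introduces new risks, including privacy concerns, adversarial attacks, and accountability issues.
Reference translation of the paper (metadata) (Sat, 08 Feb 2025 06:50:47 GMT)
A survey on RAG and trustworthiness. There are admittedly many practical considerations, but it is somewhat surprising that a survey from this angle is already needed.
The repository is GitHub – Arstanley/Awesome-Trustworthy-Retrieval-Augmented-Generation, where a list of papers is published.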

Generative AI and Creative Work: Narratives, Values, and Impacts

Generative AI and Creative Work: Narratives, Values, and Impacts [37.2]
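We review online media outlets and analyze the dominant narratives they convey about AI's impact on creative work. This discourse promotes a creativity freed from its material realization through human labor. It corresponds to the dominant techno-positivist vision and tends to assert power over the creative economy and culture.
Reference translation of the paper (metadata) (Thu, 06 Feb 2025 10:26:56 GMT)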
"In this article, we review online media outlets and analyze the dominant narratives around AI’s impact on creative work that they convey."
Whether lower barriers to entry are a good thing, and whether it is desirable for the idea to gain importance relative to execution, are questions on which people differ, but the spread of technology cannot be stopped. That said, I do wonder whether "For example, we believe that five years ago, narratives of generative AI in art emphasized the replacement of artists by technology, whereas current narratives focus more on augmentation and collaboration." is really true.

Human Decision-making is Susceptible to AI-driven Manipulation

Human Decision-making is Susceptible to AI-driven Manipulation [71.2]
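AI systems can exploit users' cognitive biases and emotional vulnerabilities to steer them toward harmful outcomes. This study examines human susceptibility to such manipulation in the contexts of economic and emotional decision-making.
Reference translation of the paper (metadata) (Tue, 11 Feb 2025 15:56:22 GMT)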
A striking result: "Our randomized control trial with 233 participants demonstrated that human decision-making is highly susceptible to AI-driven manipulation, with participants significantly shifting preferences toward harmful options and away from beneficial choices when interacting with manipulative AI agents." The "strategy-enhanced manipulative agent (SEMA) employing established psychological tactics to reach its hidden objectives." was not notably more effective; is the explanation that the AI was already powerful without such tactics?
Given that reliance on AI will grow and that AI capability itself will keep improving, this is a frightening result. The paper argues for regulation, but that alone does not seem sufficient.

International AI Safety Report

International AI Safety Report [229.3]
The report was commissioned by the nations attending the AI Safety Summit held at Bletchley, UK. Thirty countries, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory Panel. In total, 100 AI experts contributed, representing a range of perspectives and disciplines.
Reference translation of the paper (metadata) (Wed, 29 Jan 2025 17:47:36 GMT)
A report summarizing the risks of advanced AI; very informative.
The chair, Yoshua Bengio, gives an overview in the post Yoshua Bengio on X: "Today, we are publishing the first-ever International AI Safety Report, backed by 30 countries and the OECD, UN, and EU. It summarises the state of the science on AI capabilities and risks, and how to mitigate those risks. 🧵 Link to full Report: https://t.co/k9ggxL7i66 1/16 https://t.co/68Gcm4iYH5" / X.

Towards Best Practices for Open Datasets for LLM Training

Towards Best Practices for Open Datasets for LLM Training [21.4]
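Many AI companies train large language models (LLMs) on data without the permission of the copyright owners. This has prompted several high-profile copyright lawsuits from creative producers. The trend toward restricting information about training data causes harm by hindering transparency, accountability, and innovation.
Reference translation of the paper (metadata) (Tue, 14 Jan 2025 17:18:05 GMT)
Organizes best practices for choosing datasets for training and related uses. The paper notes that "The permissibility of doing so varies by jurisdiction: in countries like the EU and Japan, this is allowed under certain restrictions, while in the United States, the legal landscape is more ambiguous." Even so, the content is very important for Japan as well.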
