Survey – Page 8 – Introducing the Latest arXiv Papers

Benchmark Evaluations, Applications, and Challenges of Large Vision Language Models: A Survey

Benchmark Evaluations, Applications, and Challenges of Large Vision Language Models: A Survey [6.7]
Multimodal Vision Language Models (VLMs) have emerged as a transformative technology at the intersection of computer vision and natural language processing. VLMs show strong reasoning and understanding capabilities on visual and textual data, and they outperform classical single-modality vision models on zero-shot classification.
Reference translation of the paper (metadata) (Sat, 04 Jan 2025 04:59:33 GMT)
A survey of VLMs. In the authors' words: "we provide a systematic overview of VLMs in the following aspects: [1] model information of the major VLMs developed over the past five years (2019-2024); [2] the main architectures and training methods of these VLMs; [3] summary and categorization of the popular benchmarks and evaluation metrics of VLMs; [4] the applications of VLMs including embodied agents, robotics, and video generation; [5] the challenges and issues faced by current VLMs such as hallucination, fairness, and safety."
The repository is GitHub – zli12321/VLM-surveys: A most Frontend Collection and survey of vision-language model papers, and models GitHub repository

Open Problems in Machine Unlearning for AI Safety

Open Problems in Machine Unlearning for AI Safety [61.4]
Machine unlearning, which selectively forgets or suppresses specific kinds of knowledge, has shown promise for privacy and data-removal tasks. This paper identifies key limitations that prevent unlearning from serving as a comprehensive solution for AI safety.
Reference translation of the paper (metadata) (Thu, 09 Jan 2025 03:59:10 GMT)
A survey of machine unlearning, a technology that is important but does not yet seem to have reached practical use. It focuses mainly on the open challenges.
The conclusion sums up the current state succinctly: "Current approaches to neural-level interventions often produce unintended effects on broader model capabilities, adding practical challenges to selective capability control, while the difficulty of verifying unlearning success and robustness against relearning raises additional concerns. Furthermore, unlearning interventions can create tensions with existing safety mechanisms, potentially affecting their reliability."

LLM4SR: A Survey on Large Language Models for Scientific Research

LLM4SR: A Survey on Large Language Models for Scientific Research [15.5]
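Large language models (LLMs) offer unprecedented support across various stages of the research cycle. This paper presents the first systematic survey exploring how LLMs are revolutionizing the scientific research process.
Reference translation of the paper (metadata) (Wed, 08 Jan 2025 06:44:02 GMT)
A survey on using LLMs for research, an application that seems to have become practical since LLMs, and agentic behavior in particular, took off. It covers the whole process, from generating hypotheses to peer review.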

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey [93.7]
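Next Token Prediction (NTP) is a versatile training objective for machine learning tasks. This survey introduces a comprehensive taxonomy that unifies understanding and generation in multimodal learning. The proposed taxonomy covers five key aspects: multimodal tokenization, MMNTP model architectures, unified task representation, datasets and evaluation, and open challenges.
Reference translation of the paper (metadata) (Mon, 30 Dec 2024 03:00:30 GMT)
A survey of next token prediction, now a standard technique, focused on multimodal learning.
The repository is GitHub – LMM101/Awesome-Multimodal-Next-Token-Prediction: Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey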
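For readers unfamiliar with the objective, NTP is commonly written as an autoregressive negative log-likelihood. The form below is a minimal sketch in standard notation; the symbols (token sequence x_1..x_T, model parameters θ) are assumptions for illustration and are not taken from the survey itself.

```latex
\mathcal{L}_{\mathrm{NTP}}(\theta) = -\sum_{t=1}^{T} \log p_{\theta}\left(x_t \mid x_{<t}\right)
```

In a multimodal setting, the tokens x_t would come from whatever tokenizers map text, images, or audio into a shared sequence, which is presumably why multimodal tokenization is the first aspect of the taxonomy.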

Knowledge Boundary of Large Language Models: A Survey

Knowledge Boundary of Large Language Models: A Survey [75.7]
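Large language models (LLMs) store vast amounts of knowledge in their parameters, yet they are limited in how well they memorize and use certain knowledge. This highlights the critical need to understand the knowledge boundary of LLMs. This paper proposes a comprehensive definition of the LLM knowledge boundary and introduces a formalized taxonomy that classifies knowledge into four distinct types.
Reference translation of the paper (metadata) (Tue, 17 Dec 2024 02:14:02 GMT)
A survey on the knowledge boundary of LLMs.
An interesting perspective.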

GUI Agents: A Survey

GUI Agents: A Survey [129.9]
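Graphical user interface (GUI) agents have emerged as a transformative approach to automating human-computer interaction. Given the growing interest in GUI agents and their fundamental importance, the paper provides a comprehensive survey that categorizes their benchmarks, evaluation metrics, architectures, and training methods.
Reference translation of the paper (metadata) (Wed, 18 Dec 2024 04:48:28 GMT)
A survey of agents that operate GUIs.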

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods [21.6]
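"LLMs-as-judges" are evaluators that assess natural-language responses. This paper comprehensively surveys the "LLMs-as-judges" paradigm from five key perspectives. The authors aim to provide insights into the development and application of "LLMs-as-judges" in both research and practice.
Reference translation of the paper (metadata) (Sat, 07 Dec 2024 08:07:24 GMT)
A survey of LLMs-as-Judges, a topic with many recent papers. My impression is that approaches combining multiple judges are becoming more common.
The repository GitHub – CSHaitao/Awesome-LLMs-as-Judges: The official repo for paper, LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods. is also a useful reference.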
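As a rough illustration of what combining multiple judges can look like, here is a minimal Python sketch of majority voting over judge verdicts. It is not a method from the survey: the `Judge` interface, the `majority_verdict` helper, and the three stand-in judges are all hypothetical, and in practice each judge would wrap a call to an LLM rather than a hand-written rule.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical judge interface: (question, answer) -> verdict label such as "pass" or "fail".
Judge = Callable[[str, str], str]


def majority_verdict(judges: Sequence[Judge], question: str, answer: str) -> str:
    """Return the most common verdict, or "tie" when the top two labels have equal votes."""
    if not judges:
        raise ValueError("at least one judge is required")
    counts = Counter(judge(question, answer) for judge in judges)
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "tie"
    return ranked[0][0]


# Stand-in judges (assumptions for illustration only; real judges would call an LLM).
def length_judge(question: str, answer: str) -> str:
    return "pass" if len(answer.split()) >= 3 else "fail"


def keyword_judge(question: str, answer: str) -> str:
    return "pass" if "Paris" in answer else "fail"


def strict_judge(question: str, answer: str) -> str:
    return "fail"


verdict = majority_verdict(
    [length_judge, keyword_judge, strict_judge],
    "What is the capital of France?",
    "The capital is Paris.",
)
print(verdict)
```

Returning an explicit "tie" instead of picking an arbitrary winner leaves the tie-breaking policy (for example, escalating to a human or to a stronger judge) to the caller.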

A Survey on LLM Inference-Time Self-Improvement

A Survey on LLM Inference-Time Self-Improvement [15.0]
Techniques that enhance inference by increasing test-time computation have recently attracted attention. This paper comprehensively reviews recent work, contributes a detailed taxonomy, and discusses challenges and limitations.
Reference translation of the paper (metadata) (Wed, 18 Dec 2024 21:37:07 GMT)
A survey of inference-time self-improvement, which seems to deserve attention lately (?). It organizes the field along the following axis: "We classify these methods into three categories: Independent Self-Improvement, which operates independently; Context-Aware Self-Improvement, which leverages external support (i.e. context and datastore retrieval); and Model-Aided Self-Improvement, which relies on external models for collaboration."
The repository is GitHub – dongxiangjue/Awesome-LLM-Self-Improvement: A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced “itsy”) papers from our recent survey: A Survey on Large Language Model Inference-Time Self-Improvement.

Machine Unlearning Doesn’t Do What You Think: Lessons for Generative AI Policy, Research, and Practice

Machine Unlearning Doesn’t Do What You Think: Lessons for Generative AI Policy, Research, and Practice [186.1]
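Unlearning is often invoked as a solution for removing the influence of targeted information from generative AI models. It has also been proposed as a way to prevent a model from generating targeted types of information in its outputs. These two goals (the targeted removal of information from a model, and the targeted suppression of information from a model's outputs) present various technical and practical challenges.
Reference translation of the paper (metadata) (Mon, 09 Dec 2024 20:18:43 GMT)
Comprehensive coverage of machine unlearning. It puts real effort into issues beyond the technical side, for example: "despite the intuitive alignment of the meanings of the words “removal” and “deletion,” it is unclear if technical removal is indeed necessary to satisfy deletion requirements in law and policy." This makes it a very useful reference.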

A Survey on Large Language Model-Based Social Agents in Game-Theoretic Scenarios

A Survey on Large Language Model-Based Social Agents in Game-Theoretic Scenarios [44.0]
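Game-theoretic scenarios have become pivotal in evaluating the social intelligence of Large Language Model (LLM)-based social agents. This survey organizes the research findings into three core components: game framework, social agent, and evaluation protocol.
Reference translation of the paper (metadata) (Thu, 05 Dec 2024 06:46:46 GMT)
A survey of LLM-based agents in game-theoretic contexts.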
