March 11, 2025 – Introducing the Latest arXiv Papers

An Overview of Large Language Models for Statisticians

An Overview of Large Language Models for Statisticians [109.4]
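Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI). This paper explores how statisticians can make important contributions to the development of LLMs. We focus on issues such as uncertainty quantification, interpretability, fairness, privacy, watermarking, and model adaptation.
Paper, reference translation (metadata) (Tue, 25 Feb 2025 03:40:36 GMT)
A survey on LLMs and statistics. The content is textbook-like.
From a user's point of view, "LLM-Empowered Statistical Analysis" is the interesting part.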

Wikipedia in the Era of LLMs: Evolution and Risks [2.7]
This work analyzes the impact of Large Language Models (LLMs) on Wikipedia using existing data, and uses simulations to explore potential risks. The results show that Wikipedia articles have been influenced by LLMs, with an impact of roughly 1%-2% in certain categories.
Paper, reference translation (metadata) (Tue, 04 Mar 2025 18:58:13 GMT)
An investigation of the impact LLMs are having on Wikipedia. The authors state: "While the estimation results vary, the influence of LLMs on Wikipedia is likely to become more significant over time. In some categories, the impact has exceeded 2%."
Care is needed when using Wikipedia as evaluation data for translation or RAG. (The paper points out that "If the sentences in machine translation benchmarks are drawn from Wikipedia content shaped by LLMs, the scores of machine translation models are likely to be inflated, potentially reversing the outcomes of comparisons between different models." and that "Wikipedia content processed by LLMs could appear less effective for RAG compared to real Wikipedia content.")

DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking [96.9]
We introduce SolutionBench, a new benchmark for evaluating a system's ability to generate complete and feasible solutions to engineering problems. We also propose SolutionRAG, a new system that uses tree-based exploration and a bi-point thinking mechanism to generate reliable solutions.
Paper, reference translation (metadata) (Fri, 28 Feb 2025 05:23:10 GMT)
Proposes SolutionBench, a benchmark for generating solutions to engineering problems, and SolutionRAG, a method for solving it. Despite the "RAG" in the name, it is an agentic approach that explores while building a tree: "SolutionRAG employs a bi-point thinking approach, alternating between solution design and review, gradually enhancing the solution’s completeness and reliability."
The repository is GitHub – Li-Z-Q/DeepSolution: DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking
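
As a rough illustration of the "alternate between solution design and review while growing a tree" idea, here is a minimal sketch. It is not the SolutionRAG implementation: the names `Node`, `bi_point_search`, `design`, `review`, and `score`, as well as the `depth`, `width`, and `keep` parameters, are all hypothetical, and the pruning rule (keep the top-scoring nodes at each level) is an assumption made for the example. In a real system `design` and `review` would presumably call an LLM with retrieved context.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    kind: str                      # "problem", "solution", or "review"
    text: str
    score: float = 0.0
    children: List["Node"] = field(default_factory=list)


# Hypothetical signature: (problem, parent_text) -> list of candidate texts.
Expander = Callable[[str, str], List[str]]


def bi_point_search(
    problem: str,
    design: Expander,                     # proposes or refines solutions
    review: Expander,                     # critiques a solution
    score: Callable[[str, str], float],   # (problem, text) -> higher is better
    depth: int = 3,
    width: int = 2,
    keep: int = 1,
) -> Optional[Node]:
    """Grow a tree whose levels alternate solution / review / solution / ...

    Returns the best solution node of the deepest solution level reached,
    or None if no solution was generated.
    """
    frontier = [Node("problem", problem)]
    best_solution: Optional[Node] = None

    for level in range(depth):
        # Even levels design solutions, odd levels review them.
        kind = "solution" if level % 2 == 0 else "review"
        expand = design if kind == "solution" else review

        candidates: List[Node] = []
        for parent in frontier:
            for text in expand(problem, parent.text)[:width]:
                child = Node(kind, text, score(problem, text))
                parent.children.append(child)
                candidates.append(child)

        if not candidates:
            break

        # Assumed pruning rule: keep only the top-scoring nodes per level.
        candidates.sort(key=lambda n: n.score, reverse=True)
        frontier = candidates[:keep]
        if kind == "solution":
            best_solution = frontier[0]

    return best_solution
```

With `depth=3` the search runs design, review, then design again, so the returned node is a solution that was refined after one round of review.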