LLM – Page 9 – arXiv最新論文の紹介

NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks

NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks [30.2]
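NeuZip is a new weight-compression scheme based on the entropy of floating-point numbers in neural networks. It reduces the memory footprint of a Llama-3 8B model from 31GB to under 16GB. At inference time, it can halve memory usage while maintaining near-lossless performance.
Reference translation of the paper (metadata) (Mon, 28 Oct 2024 01:12:20 GMT)
A proposal for compressing neural networks to reduce memory use. Unlike quantization and similar techniques, the method is lossless and looks practical, which makes it interesting. A lossy variant is also covered: "The lossy NeuZip provides additional memory saving for inference, achieving superior memory–performance trade-off."
Repository: GitHub – BorealisAI/neuzip: Official repository for the paper “NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks”. This repository contains the code for the experiments in the paper.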
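To get a feel for why the entropy of floating-point numbers matters for weight compression, the sketch below measures the Shannon entropy of different bit fields of float32 values. It is an illustration only, not NeuZip's implementation: the Gaussian weight distribution, its standard deviation of 0.02, and the helper names `shannon_entropy_bits` and `float32_fields` are all assumptions made for this example.

```python
import math
import random
import struct
from collections import Counter


def shannon_entropy_bits(symbols):
    """Shannon entropy (bits per symbol) of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def float32_fields(x):
    """Round x to float32 and return (exponent field, lowest mantissa byte)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    exponent = (bits >> 23) & 0xFF  # the 8 exponent bits
    mantissa_low = bits & 0xFF      # lowest 8 of the 23 mantissa bits
    return exponent, mantissa_low


random.seed(0)
# Assumption: weights are zero-centred with a small standard deviation.
weights = [random.gauss(0.0, 0.02) for _ in range(50_000)]
fields = [float32_fields(w) for w in weights]

exponent_entropy = shannon_entropy_bits([e for e, _ in fields])
mantissa_entropy = shannon_entropy_bits([m for _, m in fields])
print(f"exponent field:    {exponent_entropy:.2f} bits of entropy in 8 stored bits")
print(f"mantissa low byte: {mantissa_entropy:.2f} bits of entropy in 8 stored bits")
```

Under these assumptions the exponent field carries far less information than the 8 bits it occupies, while the low mantissa bits are close to incompressible. That gap is the kind of redundancy an entropy coder can exploit without losing information.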

Two are better than one: Context window extension with multi-grained self-injection

Two are better than one: Context window extension with multi-grained self-injection [111.1]
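SharedLLM is a new approach built on the design philosophy of multi-grained context compression and query-aware information retrieval. The work proposes a tree-style data structure for efficiently encoding, storing, and retrieving multi-grained contextual information of text chunks.
Reference translation of the paper (metadata) (Fri, 25 Oct 2024 06:08:59 GMT)
A proposal for SharedLLM, which pairs two LLMs: the first compresses the context and the second acts as the decoder. Its distinguishing feature is a hierarchical structure rather than an ordinary encoder-decoder model.
Repository: GitHub – Clement25/SharedLLM: Official Implementation of the paper: “Two are better than one: Context window extension with multi-grained self-injection”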
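The idea of a tree over text chunks with query-aware retrieval can be pictured with a toy sketch. This is not SharedLLM's code: `ChunkNode`, `build_tree`, `overlap`, and `retrieve` are hypothetical names, the tree simply halves a token span at each level, and relevance is a plain token-overlap count rather than anything learned.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkNode:
    """One node of a hypothetical multi-grained tree over a text chunk."""
    tokens: list
    children: list = field(default_factory=list)


def build_tree(tokens, depth):
    """Recursively halve the token span so deeper levels are finer-grained."""
    node = ChunkNode(tokens)
    if depth > 0 and len(tokens) > 1:
        mid = len(tokens) // 2
        node.children = [
            build_tree(tokens[:mid], depth - 1),
            build_tree(tokens[mid:], depth - 1),
        ]
    return node


def overlap(node, query_tokens):
    """Toy relevance score: number of distinct query tokens found in the node."""
    return len(set(node.tokens) & set(query_tokens))


def retrieve(node, query_tokens):
    """Descend toward the best-matching child; return the root-to-leaf path."""
    path = [node]
    while node.children:
        node = max(node.children, key=lambda child: overlap(child, query_tokens))
        path.append(node)
    return path


text = "the cat sat on the mat while the dog slept by the door".split()
root = build_tree(text, depth=2)
path = retrieve(root, ["slept", "door"])
for level, node in enumerate(path):
    print(level, " ".join(node.tokens))
```

The path goes from the coarse whole-chunk view down to the fine-grained span most relevant to the query, so a consumer could keep coarse information for irrelevant regions and fine information only where the query points.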

Improving Causal Reasoning in Large Language Models: A Survey, LLM-based Optimization of Compound AI Systems: A Survey

LLMs are increasingly being put to use in causal reasoning and optimization as well.

Improving Causal Reasoning in Large Language Models: A Survey [16.6]
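Causal reasoning is a key aspect of intelligence, essential for problem solving, decision making, and understanding the world. Large language models (LLMs) can generate rationales for their outputs, but whether they can reliably perform causal reasoning remains unclear.
Reference translation of the paper (metadata) (Tue, 22 Oct 2024 04:18:19 GMT)
Repository: GitHub – chendl02/Awesome-LLM-Causal-Reasoning: Awesome LLM Causal Reasoning is a collection of LLM-based casual reasoning works, including papers, codes and datasets.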

LLM-based Optimization of Compound AI Systems: A Survey [64.4]
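In a compound AI system, components such as LLM calls, retrievers, code interpreters, and tools are interconnected. Recent advances have made it possible to optimize such a system's parameters end to end using an LLM. This paper surveys the principles and trends of LLM-based optimization of compound AI systems.
Reference translation of the paper (metadata) (Mon, 21 Oct 2024 18:06:25 GMT)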

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions [47.7]
AutoKaggle implements an iterative development process that combines code execution and unit testing to ensure code correctness and logical consistency. A general-purpose data science toolkit of validated functions for data cleaning, feature engineering, and modeling forms the foundation of the solution. AutoKaggle achieves a validation submission rate of 0.85 and a comprehensive score of 0.82 on typical data science pipelines.
Reference translation of the paper (metadata) (Sun, 27 Oct 2024 12:44:25 GMT)
Automation of Kaggle-style data analysis. The targeted tasks (analysis phases) are "background understanding, preliminary exploratory data analysis, data cleaning (DC), in-depth exploratory data analysis, feature engineering (FE), and model building, validation, and prediction (MBVP).", which is a wider scope than ordinary AutoML. The target data appears to be tabular.
The authors do take care about leakage: "As our analysis relies on GPT-4o, which is trained on data available until October 2023, it includes most of the Classic Kaggle competitions. To evaluate the generalization capabilities of AutoKaggle, we therefore focus on competitions initiated after 2024." Even so, the flat assertion that "Evaluation results demonstrate that AutoKaggle achieves a validation submission rate of 0.85 and a comprehensive score of 0.82 in typical data science pipelines, fully proving its effectiveness and practicality in handling complex data science tasks." is bold. That said, given the capability of current LLMs, my sense is that these are problems a well-designed pipeline could plausibly solve.
Repository: GitHub – multimodal-art-projection/AutoKaggle

Survey of User Interface Design and Interaction Techniques in Generative AI Applications

Survey of User Interface Design and Interaction Techniques in Generative AI Applications [79.6]
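We aim to create a compendium of user-interaction patterns that designers and developers can use as a reference. We also strive to lower the barrier to entry for those who want to learn more about designing generative AI applications.
Reference translation of the paper (metadata) (Mon, 28 Oct 2024 23:10:06 GMT)
A survey of UI design for applications that use generative AI. Surveys on this topic are rare.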

Foundation Models for Remote Sensing and Earth Observation: A Survey

Foundation Models for Remote Sensing and Earth Observation: A Survey [101.8]
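This survey systematically reviews the emerging field of remote sensing foundation models (RSFMs). It begins with an overview of motivation and background, then introduces the basic concepts. It then categorizes and reviews existing RSFM studies, including datasets and technical contributions.
Reference translation of the paper (metadata) (Tue, 22 Oct 2024 01:08:21 GMT)
A survey of Remote Sensing (RS) foundation models.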

ChatGPT search, Gemini Grounding with Google Search, GPT-4o System Card, Baichuan Alignment Technical Report

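The first, ChatGPT search, combines ChatGPT with web search. I see it as the official major update of a capability that has come and gone before and was at times available through plugins. It is undoubtedly useful, and I expect it to move forward while its relationship with copyright is worked out.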

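The second, Gemini Grounding with Google Search, provides a mechanism for fact checking through web search. Many such approaches exist in both research and OSS, and they are known to be effective. Having an easy-to-use mechanism available is welcome.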

In addition, the GPT-4o system card and Baichuan's technical report were posted to arXiv. These are also interesting.

GPT-4o System Card [211.9]
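GPT-4o is an autoregressive omni model that accepts any combination of text, audio, image, and video as input. It is trained end to end across text, vision, and audio, so all inputs and outputs are processed by the same neural network. GPT-4o matches GPT-4 Turbo performance on English text and code, and improves significantly on non-English text.
Reference translation of the paper (metadata) (Fri, 25 Oct 2024 17:43:01 GMT)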

Baichuan Alignment Technical Report [42.0]
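Baichuan Alignment is a detailed analysis of the alignment techniques used in the Baichuan series of models. The process spans three main stages: the Prompt Augmentation System (PAS), Supervised Fine-Tuning (SFT), and Preference Alignment. Baichuan-Instruct shows significant improvements in core capabilities, with user-experience gains ranging from 17% to 28%.
Reference translation of the paper (metadata) (Sat, 19 Oct 2024 02:07:33 GMT)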

A Survey on Automatic Credibility Assessment of Textual Credibility Signals in the Era of Large Language Models [6.5]
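Credibility assessment is fundamentally based on aggregating credibility signals. Credibility signals provide more granular, more easily explainable, and more widely usable information. The growing body of research on automatic credibility assessment and credibility-signal detection is characterized as highly fragmented and lacking mutual interconnections.
Reference translation of the paper (metadata) (Mon, 28 Oct 2024 17:51:08 GMT)
A survey on credibility assessment. Much of it relates to the first news item above, and research in this area is very active.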

ComPO: Community Preferences for Language Model Personalization

ComPO: Community Preferences for Language Model Personalization [122.5]
ComPO is a method for personalizing preference optimization in language models. ComPRed is a question-answering dataset with community-level preferences drawn from Reddit.
Reference translation of the paper (metadata) (Mon, 21 Oct 2024 14:02:40 GMT)
A proposal for ComPO, a method for personalizing language models. The approach is described as: "Our proposed community preference optimization incorporates subreddit-specific contexts into the model, tailoring outputs to align with the distinct norms and values of individual communities."
Repository: GitHub – allenai/compred: Reddit Community Preferences

Claude 3.5 Sonnet, Haiku, Computer use, Aya Expanse

The biggest topics last week were Anthropic's upgraded Claude 3.5 Sonnet and its announcement of an agent that operates a PC (GUI).

Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku \ Anthropic

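For the former, it is notable that the model was not named Opus; if an even more accurate model is being prepared, expectations are high. For the latter, opinion is divided on whether a GUI-driven approach, as in Agent S: An Open Agentic Framework that Uses Computers Like a Human – arXiv最新論文の紹介, or an API (code)-mediated approach, as in OS-COPILOT/FRIDAY (Fully Responsive Intelligence, Devoted to Assisting You) and UFO (UI-Focused) – arXiv最新論文の紹介, is better. Either way, progress of this kind deserves close attention.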

Aya, the multilingual model from Cohere, also deserves attention. Aya Expanse: Connecting Our World

The models claim to outperform Gemma, Llama, and Mistral, and are released under CC-BY-NC. CohereForAI/aya-expanse-8b · Hugging Face, CohereForAI/aya-expanse-32b · Hugging Face

Scaling Diffusion Language Models via Adaptation from Autoregressive Models

Scaling Diffusion Language Models via Adaptation from Autoregressive Models [105.7]
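Diffusion language models (DLMs) have emerged as a promising new paradigm for text generation models. The authors convert AR models ranging from 170M to 7B parameters into diffusion models named DiffuGPT and DiffuLLaMA, and show that this can be done with fewer than 200B training tokens. Experiments show that these models outperform earlier DLMs and are competitive with their AR counterparts.
Reference translation of the paper (metadata) (Wed, 23 Oct 2024 14:04:22 GMT)
"Building on existing DLMs, we present a recipe for scaling DLMs by continuing training on off-the shelf autoregressive LLMs." Whether diffusion language models are promising is open to debate, but the method is interesting. DiffuLLaMA is reported to be competitive with autoregressive models.
Repository: GitHub – HKUNLP/DiffuLLaMA: DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models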
