Model Editing – Introducing the Latest arXiv Papers

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning [90.0]
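Large language models (LLMs) tend to prioritize complying with the user's prompt over giving a sound response. Recent work has proposed using supervised fine-tuning (SFT) to mitigate this sycophancy problem. This study proposes a novel supervised pinpoint tuning (SPT), which tunes only the modules of interest for a specific objective.
Reference translation of the paper (metadata) (Tue, 03 Sep 2024 07:01:37 GMT)
The paper applies pinpoint tuning to address sycophancy, which it describes as "When challenged by users, LLMs tend to admit mistakes and provide inaccurate responses even if they initially provided the correct answer."
According to the paper, "The proposed pinpoint tuning consists of two steps: ➀: “diagnose” for where in the network attributes to the sycophancy; ➁: precisely optimize the pinpointed components to improve the performance." This looks like an approach that could be useful in many other places as well.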
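The two steps quoted above (diagnose, then tune only the pinpointed components) can be illustrated with a toy sketch. This is not the paper's implementation: the "model" is just a sum of named scalar weights, the ablation-based scoring is an assumed stand-in for the diagnosis step, and the names `diagnose`, `pinpoint_tune`, `head_a`, `head_b`, and `mlp_c` are all hypothetical.

```python
# Toy illustration only: a "model" made of named scalar components whose
# outputs are summed. All names and numbers here are hypothetical.

def predict(params, x):
    # Each component contributes weight * input; the output is their sum.
    return sum(w * x for w in params.values())

def loss(params, data):
    # Mean squared error over (input, target) pairs.
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def diagnose(params, data, top_k):
    # Step 1: score each component by how much the loss changes when it is
    # zeroed out (ablated), and keep the top_k most influential ones.
    base = loss(params, data)
    scores = {}
    for name in params:
        ablated = dict(params, **{name: 0.0})
        scores[name] = abs(loss(ablated, data) - base)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def pinpoint_tune(params, data, selected, lr=0.05, steps=200):
    # Step 2: gradient descent that updates only the selected components;
    # every other component stays frozen at its original value.
    params = dict(params)
    for _ in range(steps):
        for name in selected:
            grad = sum(2 * (predict(params, x) - y) * x for x, y in data) / len(data)
            params[name] -= lr * grad
    return params

params = {"head_a": 0.1, "head_b": 1.5, "mlp_c": -0.2}
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # target behaviour: y = 2x

selected = diagnose(params, data, top_k=1)
tuned = pinpoint_tune(params, data, selected)
```

In this toy setup only the component picked by `diagnose` changes, while the others keep their original values. In a real network, freezing everything outside the selected modules would play the same role.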

Is Bigger Edit Batch Size Always Better? — An Empirical Study on Model Editing with Llama-3 [2.6]
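This study presents a targeted model-editing analysis focused on Llama-3, the latest large language model. Through an evaluation covering up to 4096 edits, it identifies the most effective layers for editing.
Reference translation of the paper (metadata) (Wed, 01 May 2024 17:50:37 GMT)
Model editing targeting Llama-3 already; that came out fast...
"Contrary to previous belief, our experiments show that earlier layers may be more optimal intervention points, and that smaller, frequent sequential batch size edits have a superior performance in comparison to larger batch sizes." I wonder whether techniques like this will change every time the model is updated...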
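To make the distinction in the quote concrete, the sketch below contrasts one large batch with several smaller sequential batches over the same 4096 edits. It only shows the control flow: `apply_batch` is a hypothetical stand-in for a real editing method, the dict-based "model" is a toy, and nothing here reproduces the paper's measurements.

```python
def split_into_batches(edits, batch_size):
    # Yield consecutive chunks of at most batch_size edits, in order.
    for start in range(0, len(edits), batch_size):
        yield edits[start:start + batch_size]

def apply_sequentially(model_state, edits, batch_size, apply_batch):
    # Apply each batch to the state produced by the previous batch.
    # apply_batch is a hypothetical stand-in for a real editing method.
    num_updates = 0
    for batch in split_into_batches(edits, batch_size):
        model_state = apply_batch(model_state, batch)
        num_updates += 1
    return model_state, num_updates

def toy_apply_batch(state, batch):
    # Toy stand-in: the "model" is a dict of facts that each batch overwrites.
    new_state = dict(state)
    new_state.update(batch)
    return new_state

edits = [(f"subject_{i}", f"new_fact_{i}") for i in range(4096)]

# One update containing all 4096 edits versus four updates of 1024 edits each.
_, one_big_update = apply_sequentially({}, edits, 4096, toy_apply_batch)
state, many_small_updates = apply_sequentially({}, edits, 1024, toy_apply_batch)
```

With a real editor, each call to `apply_batch` would modify the weights left by the previous call, which is why the batch size and the number of sequential updates can affect the outcome.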

Model Editing Can Hurt General Abilities of Large Language Models [128.3]
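Recent advances in large language models (LLMs) have opened up a new paradigm for accessing the knowledge stored in their parameters. Because retraining LLMs on updated information is resource-intensive, interest in model editing has been growing.
Reference translation of the paper (metadata) (Tue, 9 Jan 2024 18:03:15 GMT)
An examination of the side effects of model editing: GPT2-XL and Llama-1-7B are edited with KN, MEND, ROME, and MEMIT, then evaluated on 8 tasks. The reported result is that performance degraded considerably.
The result is convincing, and using these techniques will likely require establishing test methods for the relevant domain.
The repository is JasonForJoy/Model-Editing-Hurt (github.com)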
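As a rough idea of what a before/after check for such side effects might look like, the sketch below flags tasks whose score drops by more than a tolerance after editing. The task names, scores, and the 5% tolerance are placeholders chosen for illustration, not results or settings from the paper, and `relative_drop` and `flag_regressions` are hypothetical helpers.

```python
def relative_drop(before, after):
    # Fractional drop in score for each task (positive means worse after editing).
    # Assumes every "before" score is non-zero.
    return {task: (before[task] - after[task]) / before[task] for task in before}

def flag_regressions(before, after, tolerance=0.05):
    # Tasks whose score fell by more than `tolerance` (a hypothetical threshold).
    drops = relative_drop(before, after)
    return sorted(task for task, drop in drops.items() if drop > tolerance)

# Placeholder scores, NOT results from the paper.
before = {"qa": 0.80, "summarization": 0.50, "nli": 0.60}
after = {"qa": 0.40, "summarization": 0.49, "nli": 0.60}

regressed = flag_regressions(before, after)
```

Running a check like this on a fixed set of general-ability tasks after every edit would be one simple way to notice degradation early.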