Is Bigger Edit Batch Size Always Better? — An Empirical Study on Model Editing with Llama-3

Is Bigger Edit Batch Size Always Better? — An Empirical Study on Model Editing with Llama-3 [2.6]
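This study presents a targeted model editing analysis focused on Llama-3, the latest large language model. Through an evaluation covering up to 4096 edits, it identifies the most effective layers for editing.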
Reference translation of the paper (metadata) (Wed, 01 May 2024 17:50:37 GMT)
Model editing targeting Llama-3 already; that came out fast...
"Contrary to previous belief, our experiments show that earlier layers may be more optimal intervention points, and that smaller, frequent sequential batch size edits have a superior performance in comparison to larger batch sizes." I wonder whether this kind of technique changes every time a model is updated...
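To make the batch-size trade-off in the quote concrete, here is a minimal sketch of how a fixed edit budget splits into sequential batches. It only counts the editing steps and performs no model editing. The helper `sequential_batches`, the placeholder edit IDs, and the batch sizes listed are assumptions for illustration, not the paper's code or settings.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def sequential_batches(edits: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most batch_size edits, preserving order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(edits), batch_size):
        yield list(edits[start:start + batch_size])


# Hypothetical edit budget: 4096 placeholder edit IDs (the maximum mentioned above).
edits = list(range(4096))

# Count the sequential editing steps needed for each illustrative batch size.
steps_per_batch_size = {
    b: sum(1 for _ in sequential_batches(edits, b))
    for b in (4096, 1024, 256, 64)
}
print(steps_per_batch_size)  # {4096: 1, 1024: 4, 256: 16, 64: 64}
```

With the same total budget, a smaller batch size means more editing steps applied one after another. That is the "smaller, frequent sequential" regime the quote reports as performing better than one large batch.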
