LoRA vs Full Fine-tuning: An Illusion of Equivalence

LoRA vs Full Fine-tuning: An Illusion of Equivalence [76.1]
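This study examines how different fine-tuning methods change pre-trained models by analyzing the models' weight matrices through the lens of their spectral properties. Full fine-tuning and LoRA yield weight matrices whose singular value decompositions have very different structure. The paper concludes by examining why intruder dimensions appear in LoRA fine-tuned models, why they are undesirable, and how their effects can be minimized.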
Reference translation of the paper (metadata) (Mon, 28 Oct 2024 17:14:01 GMT)
The paper analyzes the differences between weights obtained with LoRA and weights obtained with full fine-tuning. In the authors' words: "More specifically, we first show that the weight matrices trained with LoRA have new, high-ranking singular vectors, which we call intruder dimensions. Intruder dimensions do not appear during full fine-tuning. Second, we show that LoRA models with intruder dimensions, despite achieving similar performance to full fine-tuning on the target task, become worse models of the pre-training distribution and adapt less robustly to multiple tasks sequentially."
I find this an interesting property. It is also a little worrying: robustness is hard to evaluate, so this kind of problem seems easy to overlook.
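To make the idea of "new, high-ranking singular vectors" more concrete, here is a minimal NumPy sketch. The quoted passage does not spell out a detection procedure, so this assumes one plausible reading: a top singular vector of the fine-tuned weight matrix counts as an intruder when its cosine similarity to every singular vector of the pre-trained matrix stays below a threshold. The function name `count_intruder_dimensions`, the `top_k` and `threshold` parameters, and the toy matrices are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def count_intruder_dimensions(w_pre, w_tuned, top_k=10, threshold=0.5):
    """Count top-k left singular vectors of w_tuned with no close match in w_pre.

    Assumption: "no close match" means the absolute cosine similarity to every
    left singular vector of w_pre is below `threshold` (hypothetical default).
    """
    u_pre, _, _ = np.linalg.svd(w_pre, full_matrices=False)
    u_tuned, _, _ = np.linalg.svd(w_tuned, full_matrices=False)
    # Singular vectors are unit norm, so dot products are cosine similarities.
    # Rows: top-k tuned vectors, columns: all pre-trained vectors.
    sims = np.abs(u_tuned[:, :top_k].T @ u_pre)
    # A tuned vector is flagged when even its best match is below the threshold.
    return int(np.sum(sims.max(axis=1) < threshold))


# Toy "pre-trained" weights: a diagonal matrix, so its singular vectors are
# the standard basis vectors.
n = 16
w_pre = np.diag(np.arange(n, 0, -1).astype(float))

# Toy low-rank update: a large rank-1 term along a direction that overlaps
# only weakly (cosine 0.25) with each pre-trained singular vector.
b = np.ones((n, 1)) / np.sqrt(n)
w_tuned = w_pre + 200.0 * (b @ b.T)

print(count_intruder_dimensions(w_pre, w_pre))             # identical weights
print(count_intruder_dimensions(w_pre, w_tuned, top_k=1))  # rank-1 update
```

In this toy setup the rank-1 update dominates the spectrum, so the top singular vector of `w_tuned` points along `b` rather than along any pre-trained direction. With real models the same comparison would be run per weight matrix, and the threshold would need tuning.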
