Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models

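We propose a multi-lingual ability extraction and transfer approach, named MAET. Our key idea is to decompose and extract language-agnostic, ability-related weights from large language models. Experimental results show that MAET can effectively extract and transfer advanced abilities, and that it outperforms training-based baseline methods.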
Reference translation of the paper (metadata) (Thu, 10 Oct 2024 11:23:18 GMT)
The paper proposes MAET (Multi-lingual Ability Extraction and Transfer), an approach for extracting multi-lingual abilities and merging them into models: "Our key idea is to decompose and extract language-agnostic ability-related weights from LLMs, and transfer them across different languages by simple addition and subtraction operations without training." The authors report that "Our approach MAET achieves better performance than the competitive baseline methods (e.g., continual pre-training and model merging with task vector) in multi-lingual complex reasoning tasks, including mathematical reasoning tasks and scientific reasoning tasks."
The repository is at https://github.com/RUCAIBox/MAET
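
To make "simple addition and subtraction operations without training" concrete, here is a minimal sketch of weight arithmetic on toy parameters. It is not the MAET procedure: the paper's step of decomposing language-agnostic, ability-related weights is not reproduced, and what is shown is closer to plain task-vector merging, which the paper treats as a baseline. The checkpoint names (`base`, `ability_tuned`, `target_lang`), the helper functions, and the `scale` parameter are all hypothetical and exist only for illustration.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def subtract(a: Weights, b: Weights) -> Weights:
    """Element-wise a - b for every parameter (both must share keys and shapes)."""
    return {k: [x - y for x, y in zip(a[k], b[k])] for k in a}


def add_scaled(a: Weights, b: Weights, scale: float = 1.0) -> Weights:
    """Element-wise a + scale * b for every parameter."""
    return {k: [x + scale * y for x, y in zip(a[k], b[k])] for k in a}


# Hypothetical checkpoints (assumed values, chosen to be exact in floating point).
base = {"layer0": [0.0, 1.0, 2.0]}           # shared starting model
ability_tuned = {"layer0": [0.5, 1.0, 2.5]}  # base after training on an ability
target_lang = {"layer0": [0.0, 2.0, 2.0]}    # base after adapting to a target language

# Subtraction: isolate the weight change attributed to the ability.
ability_delta = subtract(ability_tuned, base)

# Addition: apply that change to the target-language model, with no further training.
transferred = add_scaled(target_lang, ability_delta, scale=1.0)

print(ability_delta)
print(transferred)
```

With real models, the same arithmetic would run over tensors in each checkpoint's parameter dictionary, and `scale` would be a knob to tune rather than a fixed value.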
