April 10, 2024 – arXiv最新論文の紹介

Gecko: Versatile Text Embeddings Distilled from Large Language Models [32.1]
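This paper introduces Gecko, a compact and versatile text embedding model. Gecko builds on one key idea: distilling knowledge from large language models (LLMs) into a retriever. On the Massive Text Embedding Benchmark (MTEB), Gecko with 256 embedding dimensions outperforms all existing entries with an embedding size of 768.
Reference translation of the paper (metadata) (Fri, 29 Mar 2024 17:56:40 GMT)
A compact yet powerful text embedding model. It outperforms text-embedding-ada-3. The LLM is used in the following way: "Gecko is trained on an LLM-generated synthetic dataset FRet that contains LLM-ranked positives and negatives."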
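The idea of "LLM-ranked positives and negatives" can be illustrated with a minimal sketch. It assumes that an LLM has already assigned a relevance score to each candidate passage for a query; the names `Triple` and `select_pos_neg`, and the rule of taking the top-scored passage as the positive and the bottom-scored one as the negative, are hypothetical simplifications for illustration, not the paper's actual procedure.

```python
from typing import NamedTuple, Sequence


class Triple(NamedTuple):
    """One training example for a contrastive retriever."""
    query: str
    positive: str
    negative: str


def select_pos_neg(query: str,
                   candidates: Sequence[str],
                   llm_scores: Sequence[float]) -> Triple:
    """Pick a positive and a negative passage from LLM relevance scores.

    Assumption (not from the paper): the highest-scored candidate becomes
    the positive and the lowest-scored candidate becomes the negative.
    """
    if len(candidates) != len(llm_scores):
        raise ValueError("candidates and llm_scores must have equal length")
    if len(candidates) < 2:
        raise ValueError("need at least two candidates")
    # Sort candidates from most to least relevant according to the LLM.
    ranked = sorted(zip(candidates, llm_scores),
                    key=lambda pair: pair[1], reverse=True)
    return Triple(query, ranked[0][0], ranked[-1][0])
```

Triples built this way could then feed a standard contrastive loss; the scoring step itself is deliberately left outside the sketch.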

On the Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons [37.3]
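This work analyzes the neuron-level internal behavior of multilingual decoder-based pre-trained language models (PLMs). It shows that language-specific neurons are unique, with only a slight overlap (5%) between languages. By tampering with less than 1% of all neurons in each model during inference, the authors demonstrate that tampering with a small number of language-specific neurons drastically changes the probability that the target language appears in generated text.
Reference translation of the paper (metadata) (Wed, 03 Apr 2024 03:37:22 GMT)
An analysis of multilinguality in PLMs. The finding that "The experimental results demonstrate that language-specific neurons mainly exist in the first and last few layers, regardless of the language, model size, and model variants." seems consistent with other results, such as Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models – arXiv最新論文の紹介 (devneko.jp). The remark in Controlling Language-specific Neurons, "In other words, the desired language could be generated by intentionally igniting target neurons.", is interesting.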
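Two operations mentioned above, measuring the overlap between language-specific neuron sets and tampering with selected neurons, can be sketched on toy data. This is a minimal illustration under stated assumptions: neurons are represented by integer indices, activations by a plain list of floats, and the helper names `overlap_ratio` and `tamper` are hypothetical. How the neurons are identified and which values are written into them are not covered here.

```python
from typing import Dict, List, Set


def overlap_ratio(neurons_a: Set[int], neurons_b: Set[int]) -> float:
    """Fraction of the neurons in `neurons_a` that also appear in `neurons_b`."""
    if not neurons_a:
        return 0.0
    return len(neurons_a & neurons_b) / len(neurons_a)


def tamper(activations: List[float],
           neuron_ids: Set[int],
           fixed_values: Dict[int, float]) -> List[float]:
    """Return a copy of `activations` with the chosen neurons overwritten.

    Assumption: `fixed_values` maps each chosen neuron index to the value
    that should replace its activation during inference.
    """
    out = list(activations)  # leave the input untouched
    for idx in neuron_ids:
        out[idx] = fixed_values[idx]
    return out
```

For example, `overlap_ratio({1, 2, 3, 4}, {4, 5})` gives 0.25, and `tamper([0.1, 0.2, 0.3], {1}, {1: 5.0})` overwrites only the second activation.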