Mixtral of Experts – Introducing the Latest arXiv Papers

Mixtral of Experts [57.4]
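Mixtral 8x7B is a Sparse Mixture of Experts (SMoE) language model. Mixtral vastly outperforms Llama 2 70B on mathematics, code generation, and multilingual benchmarks. The authors also provide Mixtral 8x7B – Instruct, a model fine-tuned to follow instructions, which surpasses GPT-3.5 Turbo, Claude-2.1, Gemini Pro, and Llama 2 70B on human benchmarks.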
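Reference translation of the paper (metadata) (Mon, 8 Jan 2024 18:47:34 GMT)
This is the paper for Mixtral, which drew attention for its strong performance. The remark "Surprisingly, we do not observe obvious patterns in the assignment of experts based on the topic." is unexpected.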
Mixtral of experts | Mistral AI | Open-weight models
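
As a rough illustration of the sparse mixture-of-experts idea behind the paper summarized above, the sketch below sends one token to the two experts with the highest router scores and mixes their outputs with softmax weights computed over those two scores (the paper describes 8 experts per layer, with 2 selected per token). The names `top2_route` and `moe_layer` and the toy scalar experts are hypothetical, chosen only for illustration. Real experts are feed-forward blocks acting on vectors, and this is not the authors' implementation.

```python
import math


def top2_route(router_logits):
    """Pick the two highest-scoring experts and softmax-normalize their scores.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    # Indices sorted by descending logit; ties keep their original order.
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:2]
    # Softmax over the selected logits only (subtract the max for stability).
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x, router_logits, experts):
    """Weighted sum of the outputs of the two selected experts."""
    return sum(weight * experts[i](x) for i, weight in top2_route(router_logits))


# Toy stand-ins for 8 experts (assumption): expert k multiplies its input by k + 1.
experts = [lambda x, k=k: (k + 1) * x for k in range(8)]

# Hypothetical router scores for a single token.
logits = [0.0, 2.0, -1.0, 2.0, 0.5, 0.0, 0.0, 0.0]
routes = top2_route(logits)
output = moe_layer(1.0, logits, experts)
```

Only two of the eight experts run for each token, so per-token compute stays well below what the total parameter count would suggest. Counting which expert indices appear in `routes` across many tokens is one simple way to look for the kind of topic-based assignment pattern that the quoted sentence says was not observed.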
