The Belebele Benchmark – Introducing the Latest arXiv Papers

The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants [82.6]
We present Belebele, a dataset spanning 122 language variants. The dataset enables the evaluation of text models in high-, medium-, and low-resource languages.
Reference translation of the paper (metadata) (Thu, 31 Aug 2023 17:43:08 GMT)
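The paper describes it as a "multiple-choice machine reading comprehension (MRC) dataset spanning 122 language variants," so this is a highly multilingual MRC dataset. It is a very valuable resource, occupying a position for MRC similar to that of FLORES-200 in machine translation.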
One interesting benchmark result: "GPT3.5-TURBO performs the best on the top 20 languages, but after 40-50, its performance falls far behind INFOXLM and XLM-V." Commercial systems appear to narrow their target languages to some extent.
The repository is GitHub – facebookresearch/belebele: Repo for the Belebele dataset, a massively multilingual reading comprehension dataset.
