Multimodal C4 – arXiv最新論文の紹介

Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text [104.0]
テキスト内ビジョンやFlamingoのような言語モデルは、任意のインターリーブされた画像とテキストのシーケンスを入力としてサポートする。このインターフェースをサポートするために、インターリーブされた画像+テキストを含むウェブコーパス上でプレトレーニングが行われる。我々はMultimodal C4 (mmc4) をリリースした。
論文参考訳（メタデータ） (Fri, 14 Apr 2023 06:17:46 GMT)
非常にありがたいマルチモーダルなデータセット。103Mドキュメント、585Mイメージ、43Btokenと大規模。
「mmc4 is constructed from public webpages contained in the cleaned English c4 corpus.」とのことで日本語はほぼ入っていなさそう・・・
プロジェクトサイトはGitHub – allenai/mmc4: MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.

コメントを残す

コメントを残す コメントをキャンセル