October 24, 2024 – Introducing the latest arXiv papers

Fundamental Limitations on Subquadratic Alternatives to Transformers

Fundamental Limitations on Subquadratic Alternatives to Transformers [3.5]
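The paper focuses on document similarity tasks, where many documents are given as input and the goal is to find the pair that is (approximately) the most similar. The authors prove that a Transformer can perform this task, and that no algorithm can perform it in truly subquadratic time.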
Reference translation of the paper (metadata) (Sat, 05 Oct 2024 19:21:13 GMT)
The paper's claim: "We focus on document similarity tasks, where one is given as input many documents and would like to find a pair which is (approximately) the most similar. We prove that Transformer is able to perform this task, and we prove that this task cannot be performed in truly subquadratic time by any algorithm."
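It seems plausible that tasks of this kind exist, and the analysis of the document similarity task is very interesting. In particular, "Theorem 3.1. Assuming SETH or OVC, for every ε > 0, there exists a constant c > 0 such that γ-LSD_{n,ℓ} cannot be solved in O(n^{2−ε}) time for any γ ≥ 1 when ℓ = c log n." is an interesting result. (My impression is that the picture often changes once practical use is considered, but) this kind of theoretical analysis is important.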
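To make the task concrete, here is a minimal sketch of the obvious quadratic baseline: compare every pair of documents and keep the best one. It is not taken from the paper. The function names, the representation of documents as equal-length numeric vectors, and the use of the inner product as the similarity score are all assumptions made for illustration.

```python
from itertools import combinations


def inner_product(u, v):
    """Similarity score assumed here: the plain inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def most_similar_pair(docs):
    """Return (i, j, score) for the pair of documents with the largest score.

    Brute force: every one of the n*(n-1)/2 pairs is examined, so the
    running time is quadratic in the number of documents.
    """
    if len(docs) < 2:
        raise ValueError("need at least two documents")
    best = None
    for i, j in combinations(range(len(docs)), 2):
        score = inner_product(docs[i], docs[j])
        # Strict comparison keeps the first pair found in case of ties.
        if best is None or score > best[2]:
            best = (i, j, score)
    return best


# Hypothetical toy documents encoded as 0/1 word-presence vectors.
docs = [
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
]
result = most_similar_pair(docs)
```

On this toy input the best pair is documents 0 and 2. The quoted theorem says that, under SETH or OVC, no algorithm can improve on this kind of pairwise search by a polynomial factor in n once the documents have length ℓ = c log n.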

How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs [69.6]
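This paper identifies numerical precision as a key factor affecting the effectiveness of Transformer-based large language models on mathematical tasks. The results show that Transformers with low numerical precision cannot handle arithmetic tasks such as iterated addition and integer multiplication. In contrast, Transformers with standard numerical precision can handle these tasks efficiently with significantly smaller model sizes.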
Reference translation of the paper (metadata) (Thu, 17 Oct 2024 17:59:35 GMT)
The paper points out: "Our results show that Transformers operating with low numerical precision fail to address arithmetic tasks, such as iterated addition and integer multiplication, unless the model size grows super-polynomially with respect to the input length."
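As a loose illustration of why precision matters for iterated addition, the sketch below sums the value 1.0 many times while rounding the running total to IEEE 754 half precision after every step. This is not the paper's construction or its definition of low precision. It is only a small demonstration of a running sum stalling under limited precision, and the helper names are hypothetical.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def iterated_add(values, round_fn=None):
    """Add the values one by one, optionally rounding the running total."""
    total = 0.0
    for v in values:
        total = total + v
        if round_fn is not None:
            total = round_fn(total)
    return total


ones = [1.0] * 4096
full = iterated_add(ones)                    # double precision throughout
half = iterated_add(ones, round_fn=to_half)  # total rounded to half precision
```

Half precision has an 11-bit significand, so integers above 2048 are no longer all representable. Once the running total reaches 2048, adding 1.0 rounds back to 2048 and the sum stops growing, while the double-precision sum reaches 4096 as expected.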

Mamba in Vision: A Comprehensive Survey of Techniques and Applications [3.5]
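Mamba has emerged as a new approach to overcoming the challenges that convolutional neural networks (CNNs) and Vision Transformers (ViTs) face in computer vision. Mamba addresses these limitations by leveraging Selective Structured State Space Models to capture long-range dependencies effectively with linear computational complexity.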
Reference translation of the paper (metadata) (Fri, 04 Oct 2024 02:58:49 GMT)
A survey of how Mamba is used for images.
The repository is GitHub – maklachur/Mamba-in-Computer-Vision: Mamba in Vision: A Comprehensive Survey of Techniques and Applications
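For readers unfamiliar with the "linear computational complexity" mentioned in the abstract, here is a toy sketch of a scalar recurrence whose decay depends on the current input, processed in a single pass over the sequence. This is not Mamba's actual parameterization or code. The function name, the sigmoid gate, and the parameters `w_gate` and `c` are assumptions chosen only to show the shape of an input-dependent (selective) state update.

```python
import math


def selective_scan(xs, w_gate=1.0, c=1.0):
    """Toy scalar recurrence with an input-dependent decay.

    a_t = sigmoid(w_gate * x_t)
    h_t = a_t * h_{t-1} + (1 - a_t) * x_t
    y_t = c * h_t

    One state update per input, so the cost is linear in len(xs).
    """
    h = 0.0
    ys = []
    for x in xs:
        a = 1.0 / (1.0 + math.exp(-w_gate * x))  # gate computed from the input
        h = a * h + (1.0 - a) * x                # keep old state vs. take new input
        ys.append(c * h)
    return ys


# Hypothetical input sequence; one output is produced per input element.
outputs = selective_scan([0.5, -1.0, 2.0, 0.0])
```

The point of the sketch is only the cost profile: the state `h` carries information from all earlier inputs, yet each step does a constant amount of work, in contrast to attention, which compares every position with every other position.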