KAN – Introduction to the Latest arXiv Papers

Beyond KAN: Introducing KarSein for Adaptive High-Order Feature Interaction Modeling in CTR Prediction

Beyond KAN: Introducing KarSein for Adaptive High-Order Feature Interaction Modeling in CTR Prediction [35.5]
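This paper introduces the Kolmogorov-Arnold Represented Sparse Interaction Network (KarSein). KarSein is designed to optimize both predictive accuracy and computational efficiency, and it achieves strong predictive accuracy with minimal computational overhead.
Reference translation of the paper (metadata) (Fri, 16 Aug 2024 12:51:52 GMT)
Extends KAN and applies it to CTR prediction.
The repository is GitHub – Ancientshi/KarSein: KarSein for CTR predict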

KAN or MLP: A Fairer Comparison [63.8]
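This paper offers a fairer and more comprehensive comparison of KAN and MLP models across a variety of tasks. It controls the number of parameters and FLOPs when comparing the performance and expressiveness of KAN and MLP. It also finds that KAN's forgetting problem is more severe than MLP's in a standard class-incremental learning setting.
Reference translation of the paper (metadata) (Tue, 23 Jul 2024 17:43:35 GMT)
A comparison between MLP and the previously discussed KAN (KAN: Kolmogorov-Arnold Networks – Introduction to the Latest arXiv Papers (devneko.jp)). The authors' assessment: "We found that KAN can be seen as a special type of MLP, with its uniqueness stemming from the use of learnable B-spline functions as activation functions." and "Our main observation is that, except for symbolic formula representation tasks, MLP generally outperforms KAN."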
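To make "controlling the number of parameters" concrete, here is a minimal sketch of how one might pick an MLP width that fits the parameter budget of an edge-function network. The helper names, the layer widths, and the `coeffs_per_edge` count are hypothetical and chosen for illustration only; they are not the counting formulas or settings used in the paper.

```python
def mlp_params(widths):
    """Weights plus biases of a fully connected MLP with the given layer widths."""
    return sum(a * b + b for a, b in zip(widths, widths[1:]))


def edge_function_params(widths, coeffs_per_edge):
    """Parameter count when every edge carries its own learnable 1-D function.

    Assumption: each edge function is described by `coeffs_per_edge`
    coefficients and there are no other parameters.
    """
    return sum(a * b * coeffs_per_edge for a, b in zip(widths, widths[1:]))


def match_mlp_hidden_width(d_in, d_out, budget):
    """Largest hidden width of a one-hidden-layer MLP that stays within `budget`.

    Returns 1 if even the smallest MLP exceeds the budget.
    """
    h = 1
    while mlp_params([d_in, h + 1, d_out]) <= budget:
        h += 1
    return h


# Hypothetical setting: a 4-8-1 edge-function network with 8 coefficients per edge.
budget = edge_function_params([4, 8, 1], coeffs_per_edge=8)
hidden = match_mlp_hidden_width(4, 1, budget)
print(budget, hidden, mlp_params([4, hidden, 1]))
```

A matched parameter count does not imply matched FLOPs, so a comparison of the kind the paper describes would need to control the two quantities separately.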

KAN: Kolmogorov-Arnold Networks [16.8]
This paper proposes Kolmogorov-Arnold Networks (KANs) as an alternative to Multi-Layer Perceptrons (MLPs). KANs have learnable activation functions on edges ("weights"). This seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability.
Reference translation of the paper (metadata) (Tue, 30 Apr 2024 17:58:29 GMT)
Proposes an architecture that the authors claim beats MLPs in both performance and interpretability. In their words: "KANs and MLPs are dual: KANs have activation functions on edges, while MLPs have activation functions on nodes. This simple change makes KANs better (sometimes much better!) than MLPs in terms of both model accuracy and interpretability." The paper also notes that "Currently, the biggest bottleneck of KANs lies in its slow training. KANs are usually 10x slower than MLPs, given the same number of parameters." Whether the claims hold up and the approach becomes widely accepted remains to be seen.
The repository is GitHub – KindXiaoming/pykan: Kolmogorov Arnold Networks
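The "activation functions on edges versus on nodes" distinction can be sketched in a few lines of plain Python. This is a toy illustration, not the pykan implementation: the class names are hypothetical, and the edge functions use a small Gaussian radial basis in place of the B-splines the paper describes, purely to keep the example short.

```python
import math
import random


class ToyMLPLayer:
    """Scalar weights on edges, one fixed activation (tanh) on each output node."""

    def __init__(self, d_in, d_out, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]
        self.biases = [0.0] * d_out

    def forward(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.weights, self.biases)]

    def num_parameters(self):
        return sum(len(row) for row in self.weights) + len(self.biases)


class ToyKANLayer:
    """A learnable 1-D function on every edge; each output node only sums its inputs."""

    def __init__(self, d_in, d_out, num_basis=5, width=0.5, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Fixed basis centers spread evenly over [-1, 1] (an assumption of this sketch).
        self.centers = [-1.0 + 2.0 * k / (num_basis - 1) for k in range(num_basis)]
        # coeffs[j][i] holds the learnable coefficients of the edge from input i to output j.
        self.coeffs = [[[rng.uniform(-1, 1) for _ in range(num_basis)]
                        for _ in range(d_in)] for _ in range(d_out)]

    def edge_function(self, coeffs, x):
        # phi(x) = sum_k c_k * exp(-((x - m_k) / width)^2)
        return sum(c * math.exp(-((x - m) / self.width) ** 2)
                   for c, m in zip(coeffs, self.centers))

    def forward(self, x):
        return [sum(self.edge_function(edge, xi) for edge, xi in zip(row, x))
                for row in self.coeffs]

    def num_parameters(self):
        return sum(len(edge) for row in self.coeffs for edge in row)


x = [0.1, -0.4, 0.9]
mlp = ToyMLPLayer(3, 2)
kan = ToyKANLayer(3, 2)
print(mlp.forward(x), mlp.num_parameters())
print(kan.forward(x), kan.num_parameters())
```

In this sketch each KAN edge carries `num_basis` coefficients instead of a single weight, which is one way to see why comparisons at an equal parameter count matter.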