September 24, 2024 – Introducing the latest arXiv papers

Autoregressive + Chain of Thought (CoT) ≃ Recurrent, To CoT or not to CoT

Two papers examining Chain of Thought have come out. The first examines it from the standpoint of how it operates; the second is a meta-analysis.

Autoregressive + Chain of Thought (CoT) $\simeq$ Recurrent: Recurrence’s Role in Language Models and a Revist of Recurrent Transformer [30.0]
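The paper examines how recurrent structure in language models affects their reasoning ability. It identifies key theoretical limitations in models such as linear Transformers and RWKV.
Reference translation of the paper (metadata) (Sat, 14 Sep 2024 00:30:57 GMT)
The authors state: "We explained that CoT approximates recurrence in Transformer-based autoregressive LLMs from a computational standpoint." A remark made along the way is also important: "Recurrent Neural Networks (RNNs) sacrifice parallel training for recurrent connections, while Transformers trade recurrence for parallelism."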
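As a toy illustration of the quoted claim (not the paper's construction), the sketch below computes parity in two ways. `rnn_parity` carries a hidden state from step to step. `cot_parity` uses a stateless `step` function that only reads the context, and recovers the same behavior by writing each intermediate result into the context as a "thought" token. All function names here are hypothetical.

```python
def rnn_parity(bits):
    """Recurrent version: a hidden state h is carried across time steps."""
    h = 0
    for b in bits:
        h ^= b
    return h


def step(inputs, thoughts):
    """Stateless step: reads only the visible context (inputs plus the
    thought tokens generated so far) and emits exactly one new token."""
    i = len(thoughts)                      # position of the next input to consume
    prev = thoughts[-1] if thoughts else 0 # last written token acts as the state
    return prev ^ inputs[i]


def cot_parity(bits):
    """Autoregressive + CoT version: the running parity is written out as
    tokens, so the generated context plays the role of the hidden state."""
    thoughts = []
    for _ in bits:
        thoughts.append(step(bits, thoughts))
    return thoughts[-1] if thoughts else 0


if __name__ == "__main__":
    example = [1, 0, 1, 1]
    print(rnn_parity(example), cot_parity(example))
```

The point of the toy is only that `step` itself keeps no memory between calls: the sequential dependency lives entirely in the tokens it has already emitted, which is the sense in which CoT can stand in for recurrence.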

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning [55.5]
Chain-of-Thought (CoT) is the de facto method for eliciting reasoning capabilities from large language models (LLMs). The authors show that CoT yields strong performance benefits mainly on tasks involving math or logic, and much smaller gains on other tasks.
Reference translation of the paper (metadata) (Wed, 18 Sep 2024 17:55:00 GMT)
"Finding 1: CoT only helps substantially on problems requiring mathematical, logical, or algorithmic reasoning." is fine as it stands, but does "Finding 2: CoT primarily helps with the execution step that performs computation and symbolic manipulation, but falls short of what LLMs with tool augmentation can do." mean that agentic approaches are the more promising direction?

P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task [94.1]
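Embodied Everyday Task is a popular task in the embodied AI community. Natural language instructions often lack explicit task planning. Incorporating knowledge about the task environment into a model requires extensive training.
Reference translation of the paper (metadata) (Tue, 17 Sep 2024 15:29:34 GMT)
The paper proposes using RAG for agent behavior (planning and so on) when the agent is given natural language instructions and environment information. The RAG database is updated dynamically, which gives the impression that this is simply LLM-based agents in action.
Intuitively, retrieval seems like the hard part, yet the paper says "When an agent interacts with the environment during a task, it first receives the environment’s goal instruction 𝐼𝑔 and observation 𝑂𝑡. Then it encodes with MiniLM [31] both of them". It is surprising that such a simple approach works.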
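The retrieval step described in the quote can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: `embed` is a toy bag-of-words stand-in for the MiniLM encoder, the `ExperienceStore` class is hypothetical, and summing the goal and observation similarities is an assumed scoring rule.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder such as MiniLM (assumption):
    a bag-of-words count vector over lowercase whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class ExperienceStore:
    """Hypothetical database that grows as the agent finishes episodes."""

    def __init__(self):
        self.entries = []  # list of (goal_vec, obs_vec, trajectory)

    def add(self, goal, observation, trajectory):
        """Store one finished episode so later tasks can retrieve it."""
        self.entries.append((embed(goal), embed(observation), trajectory))

    def retrieve(self, goal, observation, k=1):
        """Return the k stored trajectories most similar to the query.
        Scoring by the sum of goal and observation similarity is an
        assumption made for this sketch."""
        g, o = embed(goal), embed(observation)
        ranked = sorted(
            self.entries,
            key=lambda e: cosine(g, e[0]) + cosine(o, e[1]),
            reverse=True,
        )
        return [e[2] for e in ranked[:k]]


if __name__ == "__main__":
    store = ExperienceStore()
    store.add("put apple in fridge", "you are in kitchen", "plan A")
    store.add("turn on lamp", "you are in bedroom", "plan B")
    print(store.retrieve("put a cold apple in fridge", "you are in the kitchen"))
```

Calling `add` after each episode is what makes the database progressive in this sketch: later queries can draw on trajectories collected during earlier attempts.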

Recent Trends of Multimodal Affective Computing: A Survey from NLP Perspective [15.6]
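Multimodal affective computing (MAC) has attracted attention because it is widely applied to analyzing human behavior and intentions. This survey summarizes recent trends in multimodal affective computing from an NLP perspective, organized around four hot tasks. Its goal is to explore the current landscape of multimodal affective research, identify development trends, and highlight the similarities and differences across tasks.
Reference translation of the paper (metadata) (Wed, 11 Sep 2024 16:24:06 GMT)
A survey of multimodal affective computing. The main tasks are Multimodal Sentiment Analysis (MSA), Multimodal Emotion Recognition in Conversation (MERC), Multimodal Aspect Based Sentiment Analysis (MABSA), and Multimodal Multilabel Emotion Recognition (MMER).
The paper repository is GitHub – LeMei/Multimodal-Affective-Computing-Survey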