January 14, 2025 – Introducing the Latest arXiv Papers

Long Context vs. RAG for LLMs: An Evaluation and Revisits

Long Context vs. RAG for LLMs: An Evaluation and Revisits [41.3]
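This paper revisits recent studies on this topic, highlighting their key insights and discrepancies. LC outperformed RAG on question-answering benchmarks, especially for Wikipedia-based questions. It also provides an in-depth discussion that surveys the importance of context relevance in existing studies.
Reference translation of the paper (metadata) (Fri, 27 Dec 2024 14:34:37 GMT)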
Similar to "Revisiting In-Context Learning with Long Context Language Models – Introducing the Latest arXiv Papers", but this one evaluates Long Context vs. RAG. "The results indicate that LC generally outperforms RAG for tasks involving well-structured, dense contexts, such as Wikipedia articles and books, and is better at answering questions requiring specific information. By contrast, RAG demonstrates advantages in handling fragmented information, particularly in dialogue-based scenarios and for more general questions." So each approach has its strengths and weaknesses.
The results make it hard to declare either approach the definitive answer, but the broad evaluation is very informative.
Repository: GitHub – lixinze777/LC_VS_RAG: Offcial Page for Long Context vs. RAG for LLMs: An Evaluation and Revisits

Virgo: A Preliminary Exploration on Reproducing o1-like MLLM [89.5]
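Slow-thinking reasoning systems have attracted wide attention by scaling thinking time during inference. There is also growing interest in adapting this capability to multimodal large language models (MLLMs). This paper explores a straightforward approach: fine-tuning a capable MLLM with a small amount of textual long-form thought data. The authors find that these long-form reasoning processes, expressed in natural language, can be effectively transferred to MLLMs.
Reference translation of the paper (metadata) (Fri, 03 Jan 2025 17:14:16 GMT)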
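The paper reports that the o1-like approach of spending more time on reasoning is also effective for MLLMs. That is about what one would expect, but it does feel like the field is catching up fast.
Repository: GitHub – RUCAIBox/Virgo: Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*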

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking [15.4]
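This paper presents rStar-Math to demonstrate that small language models (SLMs) can rival or even surpass the math reasoning capability of OpenAI o1. It exercises "deep thinking" through Monte Carlo Tree Search (MCTS), performing test-time search guided by an SLM-based process reward model.
Reference translation of the paper (metadata) (Wed, 08 Jan 2025 14:12:57 GMT)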
"In this work, we present rStar-Math, a self-evolved System 2 deep thinking approach that significantly boosts the math reasoning capabilities of small LLMs, achieving state-of-the-art OpenAI o1-level performance." This is the currently popular approach. The phrase "self-evolved" feels like a glimpse of the future, and it is interesting that relatively small models achieve high scores.
Repository: https://github.com/microsoft/rStar. It seems to return a 404 at the moment?