December 3, 2024 – Introducing recent arXiv papers

DRS: Deep Question Reformulation With Structured Output

DRS: Deep Question Reformulation With Structured Output [114.1]
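Large language models can identify that a question is unanswerable, but they lack the ability to help reformulate it. The paper proposes DRS: Deep Question Reformulation with Structured Output. The proposed method raises the reformulation accuracy of GPT-3.5 from 23.03% to 70.42%, and raises the score of open-source large language models such as Gemma2-9B from 26.35% to 56.75%.
Reference translation of the paper (metadata) (Wed, 27 Nov 2024 02:20:44 GMT)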
A proposal for a method that reformulates questions. The paper also states: "More importantly, according to Faustini et al (2023), in a large-scale industrial experiment, rephrasing unanswerable questions posed to virtual assistants significantly enhances the user experience for millions, which highlights the importance of effectively leveraging LLMs to assist people in question reformulation." There are certainly practical settings where this is wanted. The paper proposes a solution that splits the problem into three steps: entity extraction, DFS combination search with question generation, and final candidate selection.
The repository is GitHub – Lizhecheng02/DRS: Repository for our paper “DRS: Deep Question Reformulation With Structured Output”.
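The three steps named in the commentary (entity extraction, DFS combination search with question generation, final candidate selection) can be illustrated with a minimal sketch. This is not the code from the DRS repository: the helper names `dfs_entity_combinations` and `reformulate` are hypothetical, and `generate_question` and `score_candidate` are placeholder callables standing in for whatever LLM calls the paper actually uses. The entities are assumed to have been extracted already.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def dfs_entity_combinations(
    entities: Sequence[str], max_size: int
) -> List[Tuple[str, ...]]:
    """Enumerate non-empty entity combinations of up to max_size, depth-first."""
    results: List[Tuple[str, ...]] = []

    def visit(start: int, chosen: List[str]) -> None:
        if chosen:
            results.append(tuple(chosen))
        if len(chosen) == max_size:
            return  # do not grow the combination any further
        for i in range(start, len(entities)):
            chosen.append(entities[i])
            visit(i + 1, chosen)
            chosen.pop()  # backtrack

    visit(0, [])
    return results


def reformulate(
    entities: Sequence[str],
    generate_question: Callable[[Tuple[str, ...]], str],  # hypothetical LLM call
    score_candidate: Callable[[str], float],              # hypothetical scorer
    max_size: int = 2,
) -> Optional[str]:
    """Generate one candidate question per combination and keep the best one."""
    best: Optional[str] = None
    best_score = float("-inf")
    for combo in dfs_entity_combinations(entities, max_size):
        candidate = generate_question(combo)
        score = score_candidate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best
```

In practice `generate_question` would prompt a model to write a question that uses only the chosen entities, and `score_candidate` would check whether the candidate is answerable from the given document; both are assumptions made here for illustration.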

Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision [120.4]
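This paper proposes a two-player paradigm that separates the roles of the reasoning (actor) model and the critique model. It first proposes AutoMathCritique, an automated and scalable framework for collecting critique data. The critique model is shown to consistently improve the actor's performance on difficult queries at test time.
Reference translation of the paper (metadata) (Mon, 25 Nov 2024 17:11:54 GMT)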
The authors build data with AutoMathCritique, a framework consisting of three stages ("flawed reasoning path construction, critique generation, and data filtering"), and fine-tune on it. They also apply the following and confirm its effect: "Motivated by the insights of test-time, we introduce the critique model into the actor model’s exploration and learning process, introducing a critique-in-the-loop self-improvement method". The results appear to show that the critique model is effective (though building such a model may not be easy).
The repository is AutoMathCritique
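The actor/critic split at test time can be pictured as a simple refinement loop. The sketch below is only an illustration under stated assumptions, not the procedure from the paper or the AutoMathCritique repository: `refine_with_critique`, `actor`, and `critic` are hypothetical names, and the convention that the critic returns `None` when it finds no flaw is an assumption made here.

```python
from typing import Callable, Optional

# Hypothetical interfaces: the actor answers a query, optionally using a
# critique of its previous attempt; the critic returns a critique string,
# or None when it finds no flaw (an assumption of this sketch).
Actor = Callable[[str, Optional[str]], str]
Critic = Callable[[str, str], Optional[str]]


def refine_with_critique(
    query: str, actor: Actor, critic: Critic, max_rounds: int = 3
) -> str:
    """Let the actor answer, then revise while the critic still objects."""
    response = actor(query, None)
    for _ in range(max_rounds):
        critique = critic(query, response)
        if critique is None:
            break  # the critic accepts the current response
        response = actor(query, critique)
    return response
```

Capping the loop with `max_rounds` keeps the cost bounded even if the critic never accepts an answer.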

Training and Evaluating Language Models with Template-based Data Generation [6.0]
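The authors create a dataset of more than 7 million synthesized grade-school math problems. The dataset serves as a valuable resource for pre-training, fine-tuning, and evaluating LLMs on mathematical reasoning.
Reference translation of the paper (metadata) (Wed, 27 Nov 2024 07:32:56 GMT)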
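Synthetic data construction that is delegated to an LLM, starting from the creation of the meta-templates. It is interesting, but could it also work in other domains?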
The repository is GitHub – iiis-ai/TemplateMath: Official implementation of “Training and Evaluating Language Models with Template-based Data Generation” (https://templatemath.github.io)
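To make the idea of template-based generation concrete, here is a minimal sketch of filling a single template with random values and computing the answer programmatically. It is not taken from the TemplateMath repository: the template text, the name and item lists, and the function `generate_problem` are hypothetical, and the template here is written by hand, whereas the paper has an LLM produce the meta-templates.

```python
import random
from typing import Dict, Union

# Hypothetical slot values for illustration only.
NAMES = ["Alice", "Bob", "Chen"]
ITEMS = ["apples", "pencils", "marbles"]


def generate_problem(rng: random.Random) -> Dict[str, Union[str, int]]:
    """Fill one hand-written subtraction template and compute its answer."""
    name = rng.choice(NAMES)
    item = rng.choice(ITEMS)
    start = rng.randint(5, 50)
    given = rng.randint(1, start)  # never gives away more than the starting amount
    question = (
        f"{name} has {start} {item} and gives away {given}. "
        f"How many {item} does {name} have left?"
    )
    return {"question": question, "answer": start - given}


def generate_dataset(n: int, seed: int = 0) -> list:
    """Generate n problems reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [generate_problem(rng) for _ in range(n)]
```

Because the answer is computed from the sampled values rather than written by a model, every generated problem comes with a label that is correct by construction, which is what makes this kind of data usable for evaluation as well as training.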