January 17, 2025 – Introducing the latest arXiv papers

The Tabular Foundation Model TabPFN Outperforms Specialized Time Series Forecasting Models Based on Simple Features [40.2]
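This paper proposes TabPFN-TS, a simple approach that combines TabPFN with simple feature engineering to achieve strong forecasting performance. Despite its simplicity and only 11 million parameters, TabPFN-TS outperforms Chronos-Mini, a model of similar size, and slightly outperforms Chronos-Large, which has 65 times as many parameters.
Reference translation of the paper (metadata) (Mon, 06 Jan 2025 11:38:19 GMT)
A proposal for a tabular foundation model, an area that feels fairly difficult. The authors state: "By using a simple set of timestamp-derived features, our approach matches or slightly outperforms Chronos-T5 (Large), which, to our knowledge, is one of the strongest time series foundation models." The model may be capturing the basic behavior of time series data, but I think anyone who uses it should still validate it in their own domain.
The repository is GitHub – PriorLabs/tabpfn-client: ⚡ Easy API access to the tabular foundation model TabPFN ⚡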
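
To make the idea of "timestamp-derived features" concrete, here is a minimal sketch of how a time series can be recast as a table that a tabular regressor could fit. The post does not list the paper's exact feature set, so the features below (a running index plus sine/cosine encodings of hour and day of week) and the helper names `timestamp_features` and `build_table` are assumptions for illustration only. No TabPFN API is called.

```python
import math
from datetime import datetime, timedelta


def timestamp_features(ts: datetime, start: datetime) -> dict[str, float]:
    """Turn one timestamp into a flat row of tabular features (assumed set)."""
    hours_since_start = (ts - start).total_seconds() / 3600.0
    hour_angle = 2 * math.pi * ts.hour / 24
    dow_angle = 2 * math.pi * ts.weekday() / 7
    return {
        # Position in the series, so a model can pick up a trend.
        "running_index": hours_since_start,
        # Cyclic encodings keep 23:00 close to 00:00 and Sunday close to Monday.
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "dow_sin": math.sin(dow_angle),
        "dow_cos": math.cos(dow_angle),
    }


def build_table(timestamps: list[datetime], start: datetime) -> list[dict[str, float]]:
    """Build one feature row per timestamp."""
    return [timestamp_features(ts, start) for ts in timestamps]


# Hypothetical hourly series: 48 observed hours, then a 24-hour forecast horizon.
start = datetime(2025, 1, 6, 0, 0)
history = [start + timedelta(hours=h) for h in range(48)]
future = [start + timedelta(hours=h) for h in range(48, 72)]

X_train = build_table(history, start)  # paired with the observed values as targets
X_test = build_table(future, start)    # rows whose targets are to be predicted
```

With the data in this shape, forecasting becomes ordinary tabular regression: fit on `X_train` with the observed values as targets, then predict for `X_test`. Which regressor to use, and whether these features suit a given domain, is exactly what domain-specific validation would need to check.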

Automated Self-Refinement and Self-Correction for LLM-based Product Attribute Value Extraction [51.5]
This paper applies two self-refinement techniques, error-based prompt rewriting and self-correction, to the product attribute value extraction task. The experiments show that both self-refinement techniques have only a marginal impact on model performance across different scenarios, while significantly increasing processing costs.
Reference translation of the paper (metadata) (Thu, 02 Jan 2025 12:55:27 GMT)
The paper reports that self-refinement and self-correction have little effect on "information extraction tasks such as extracting product attribute values from product descriptions", and concludes: "Overall, fine-tuning without self-refinement proves to be the most effective and cost-efficient approach for scenarios where attribute values need to be extracted from a large number of product descriptions." These techniques are effective in many cases, so my impression is that it depends on the task.
The repository is GitHub – wbsg-uni-mannheim/SelfRefinement4ExtractGPT: Automated Self-Refinement and Self-Correction for LLM-based Product Attribute Value Extraction
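
As a rough illustration of why self-correction raises processing cost, the sketch below runs an extraction call followed by a review call on the model's own output. It is not the paper's implementation: the prompts, the `extract_with_self_correction` helper, and the `fake_llm` stand-in are all hypothetical, and a real setup would pass in an actual LLM client instead.

```python
from typing import Callable


def extract_with_self_correction(
    description: str,
    attribute: str,
    llm: Callable[[str], str],
) -> tuple[str, int]:
    """Return (final value, number of LLM calls) for one product description."""
    # First pass: plain extraction.
    first = llm(f"Extract the value of '{attribute}' from: {description}")
    calls = 1

    # Second pass: the model reviews its own answer and may revise it.
    review_prompt = (
        f"Review this extracted value for '{attribute}': {first!r}. "
        f"Product description: {description}. "
        "Reply with the corrected value only."
    )
    revised = llm(review_prompt)
    calls += 1

    return revised.strip(), calls


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model, used only to make the sketch runnable."""
    if prompt.startswith("Review"):
        return "Acme"
    return "Acme Inc. Running Shoe"


value, calls = extract_with_self_correction(
    "Acme running shoe, size 10", "brand", fake_llm
)
```

Even in this minimal form, every description costs two model calls instead of one, and the review prompt is longer than the original. That is consistent with the reported trade-off: a marginal accuracy change bought at a significantly higher processing cost.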