March 7, 2025 – Introducing the latest arXiv papers

Toward Robust Non-Transferable Learning: A Survey and Benchmark

Toward Robust Non-Transferable Learning: A Survey and Benchmark [51.5]
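Non-transferable learning (NTL) is a task that aims to reshape the generalization abilities of deep learning models. We introduce NTLBench, the first benchmark for evaluating NTL performance and robustness. We also discuss practical applications of NTL, along with future directions and open challenges.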
Reference translation of the paper (metadata) (Wed, 19 Feb 2025 10:12:19 GMT)
A survey of Non-Transferable Learning, whose aim is stated as: "Its goal is to prevent the model’s generalization to specific target domains or tasks (such as harmful [Rosati et al., 2024; Huang et al., 2024b] or unauthorized domains [Wang et al., 2022b; Si et al., 2024]) while preserving its normal functionality on a source domain."
The authors say the benchmark will be released publicly. GitHub – tmllab/NTLBench

Shh, don’t say that! Domain Certification in LLMs [124.6]
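Large language models (LLMs) are often deployed to perform constrained tasks in narrow domains. Domain certification is a guarantee that accurately characterizes the out-of-domain behavior of language models. We then propose VALID, a simple yet effective approach that provides adversarial bounds as a certificate.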
Reference translation of the paper (metadata) (Wed, 26 Feb 2025 17:13:19 GMT)
Proposes Verified Adversarial LLM Output via Iterative Dismissal (VALID), a method for keeping a model from producing responses outside its intended domain, even when the input is arbitrary.