NVIDIA Nemotron Parse 1.1 / Nemotron-Flash – Introducing the Latest arXiv Papers

NVIDIA Nemotron Parse 1.1 [52.6]
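Nemotron-Parse-1.1 is a lightweight document parsing and OCR model. It offers improved capabilities, including general OCR, markdown formatting, structured table parsing, and text extraction from images, charts, and diagrams. We are releasing the model weights on Huggingface and an optimized NIM container, along with a subset of the training data as part of the broader Nemotron-VLM-v2 dataset.
Reference translation of the paper (metadata) (Tue, 25 Nov 2025 16:41:25 GMT)
An OCR-oriented model: "Nemotron-Parse-1.1 follows an encoder-decoder architecture with 885M parameters, including a compact 256M-parameter language decoder." Notably, it is not decoder-only (though that may partly be because an encoder-decoder design fits the task).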
The repository is nvidia/NVIDIA-Nemotron-Parse-v1.1-TC · Hugging Face

Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models [97.6]
The goal of this work is to identify the key determinants of the real-time latency of small language models (SLMs) and to provide generalizable principles and methodologies for SLM design and training. We introduce a new family of hybrid SLMs called Nemotron-Flash, which significantly advances the accuracy-efficiency frontier of state-of-the-art SLMs.
Reference translation of the paper (metadata) (Mon, 24 Nov 2025 08:46:36 GMT)
"Rather than merely offering a smaller LLM, this work re-imagines small models from the perspective of real-world latency and throughput, systematically exploring the key architectural and training factors essential for developing latency-optimal SLMs. By analyzing optimal depth–width ratios, strategically combining efficient attention operators through an evolutionary search framework, and enhancing training with weight normalization and meta tokens, we establish a comprehensive framework that significantly improves both real-device latency and accuracy, and deliver the Nemotron-Flash model family that advances the SOTA accuracy–latency frontier." An exploration of SLMs that goes all the way down to architecture design.
The repository is nvidia/Nemotron-Flash-3B · Hugging Face
