Sora 2, Claude Sonnet 4.5, GLM-4.6, DeepSeek v3.2-exp, HunyuanImage 3.0

Last week's big news was OpenAI's announcement of Sora 2 (Sora 2 is here | OpenAI). Video generation models have been noted for their potential to solve a wide range of tasks (Video models are zero-shot learners and reasoners – arXiv最新論文の紹介) and for their potential as world models (V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning – arXiv最新論文の紹介, SimVS: Simulating World Inconsistencies for Robust View Synthesis – arXiv最新論文の紹介, How Far is Video Generation from World Model: A Physical Law Perspective – arXiv最新論文の紹介, among others), and the news release touches on these points as well.

Anthropic also announced Claude Sonnet 4.5 (Introducing Claude Sonnet 4.5 \ Anthropic). The reported results look like steady progress.

Updates to open models also deserve attention, including GLM-4.6: Advanced Agentic, Reasoning and Coding Capabilities and deepseek-ai/DeepSeek-V3.2-Exp · Hugging Face. For GitHub – Tencent-Hunyuan/HunyuanImage-3.0: HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation, a paper has been published on arXiv.

HunyuanImage 3.0 Technical Report [108.4]
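HunyuanImage 3.0 is a native multimodal model that unifies multimodal understanding and generation within an autoregressive framework. HunyuanImage 3.0 is the largest and most powerful open-source image generation model to date.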
Reference translation of the paper (metadata) (Sun, 28 Sep 2025 16:14:10 GMT)
A very powerful open image model.
The model is available at tencent/HunyuanImage-3.0 · Hugging Face
