StepFunによるディープリサーチエージェントと評価ベンチマークの提案。「Experimental results demonstrate that Step-DeepResearch, with only 32B parameters, achieves a high score of 61.4% on the Scale AI Research Rubrics. In expert human evaluations on ADR-Bench, its Elo score significantly outperforms comparable models and rivals state-of-the-art closed-source models such as OpenAI DeepResearch and Gemini DeepResearch.」と高性能を主張。実行にはAPI接続が必要でこれもclosedでは?と思わなくもない。。