A proposed improvement over Hyper-Connections (HC) from DeepSeek. The authors confirm its effectiveness, reporting that "mHC yields comprehensive improvements, consistently outperforming the baseline and surpassing HC on the majority of tasks. Notably, compared to HC, mHC further enhances the model’s reasoning capabilities, delivering performance gains of 2.1% on BBH (Suzgun et al., 2022) and 2.3% on DROP (Dua et al., 2019)." It is also impressive, as one would expect from DeepSeek, that the experiments are run at a reasonably large scale of 27B.