{"id":7965,"date":"2025-12-30T06:29:00","date_gmt":"2025-12-29T21:29:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=7965"},"modified":"2025-12-21T21:32:19","modified_gmt":"2025-12-21T12:32:19","slug":"frontiercs-evolving-challenges-for-evolving-intelligence","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=7965","title":{"rendered":"FrontierCS: Evolving Challenges for Evolving Intelligence"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>FrontierCS: Evolving Challenges for Evolving Intelligence\u00a0<\/strong>[174.8]<br>We introduce FrontierCS, a benchmark of 156 open-ended problems spanning diverse areas of computer science. For each problem, we provide an expert reference solution and an automatic evaluator. We find that frontier reasoning models lag far behind human experts on both the algorithmic and research tracks.<br><a href=\"http:\/\/arxiv.org\/abs\/2512.15699v1\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2512.15699v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Wed, 17 Dec 2025 18:52:45 GMT)<\/li>\n\n\n\n<li>The authors describe the benchmark as follows: \u201cwe introduce FrontierCS, a coding benchmark that evaluates LLMs on solving open-ended computer science problems, where no known closed-form or deterministic optimal solution exists in practice.\u201d They further report: \u201cEmpirically, we find that even the strongest frontier reasoning models remain far behind human experts on both the algorithmic and research tracks of FrontierCS. Simply scaling up context length or reasoning budgets yields diminishing returns on the hardest problems, and models frequently converge to locally workable but clearly suboptimal algorithms.\u201d<\/li>\n\n\n\n<li>The project site is <a href=\"https:\/\/frontier-cs.org\/\">FrontierCS<\/a>.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[517],"class_list":["post-7965","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-517"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7965","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7965"}],"version-history":[{"count":1,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7965\/revisions"}],"predecessor-version":[{"id":7966,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/7965\/revisions\/7966"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7965"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http
s:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7965"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7965"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}