{"id":5206,"date":"2024-07-29T06:32:00","date_gmt":"2024-07-28T21:32:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=5206"},"modified":"2024-07-29T06:32:00","modified_gmt":"2024-07-28T21:32:00","slug":"llama-3-1-mistral-large2-ai-models-collapse-when-trained-on-recursively-generated-data","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=5206","title":{"rendered":"Llama 3.1, Mistral Large2, AI models collapse when trained on recursively generated data"},"content":{"rendered":"\n<p>Llama 3.1 has been released. It is an open model that has caught up with commercial models, which makes it highly significant. Mistral has also released a strong model, Mistral Large2, although it is limited to non-commercial use.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/ai.meta.com\/blog\/meta-llama-3-1\/\">Introducing Llama 3.1: Our most capable models to date (meta.com)<\/a>\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/huggingface.co\/collections\/meta-llama\/llama-31-669fc079a0c406a149a5738f\">Llama 3.1 &#8211; a meta-llama Collection (huggingface.co)<\/a><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><a href=\"https:\/\/mistral.ai\/news\/mistral-large-2407\/\">Large Enough | Mistral AI | Frontier AI in your hands<\/a>\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/huggingface.co\/mistralai\/Mistral-Large-Instruct-2407\">mistralai\/Mistral-Large-Instruct-2407 \u00b7 Hugging Face<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p>Llama 3.1's training appears to make effective use of synthetic data, particularly as SFT data. Meta writes: \u201cFor example, to ensure Llama 3 is not accidentally overfitted on commonly used 
benchmarks, our pre-training data was procured and processed by a separate team that was strongly incentivized to prevent contamination of that pre-training data with external benchmarks.\u201d This point was also striking.<\/p>\n\n\n\n<p>The focus is somewhat different from the above, but <a href=\"https:\/\/www.nature.com\/articles\/s41586-024-07566-y\">AI models collapse when trained on recursively generated data | Nature<\/a> reports that indiscriminate use of model-generated content in training causes irreversible defects in the resulting models. The authors call this effect \u201cmodel collapse\u201d and show that it can occur in LLMs and variational autoencoders. They also demonstrate that the problem must be taken seriously in order to sustain the benefits of training on large-scale data collected from the web. It is an interesting observation on model collapse as a harmful effect of synthetic data.<\/p>\n\n\n\n<p>As noted below, it has also been pointed out that model collapse can be avoided by mixing ordinary data with synthetic data. Data 
augmentation has its limits, as does back translation in machine translation, so it is intuitive that performance cannot keep improving beyond a certain point; how far this approach can go is the interesting question.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data&nbsp;<\/strong>[49.7]<br>The authors show that replacing the original real data with each generation's synthetic data tends toward model collapse. They demonstrate that accumulating successive generations of synthetic data alongside the original real data avoids model collapse.<br><a href=\"http:\/\/arxiv.org\/abs\/2404.01413v2\">Paper<\/a>&nbsp;&nbsp;<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2404.01413v2\">Reference translation (metadata)<\/a>&nbsp; &nbsp;(Mon, 29 Apr 2024 23:13:42 GMT)<\/li>\n\n\n\n<li>Empirically, and theoretically for linear regression, \u201cOur findings extend these prior works to show that if data accumulates and models train on a mixture of \u2018real\u2019 and synthetic data, model collapse no longer occurs.\u201d and \u201cTogether, these results strongly suggest that the \u2018curse of recursion\u2019 may not be as dire as had been portrayed \u2013 provided we accumulate synthetic data alongside real data, rather than replacing real data by synthetic data 
only.\u201d are what the authors point out.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Llama 3.1 has been released. It is an open model that has caught up with commercial models, which makes it highly significant. Mistral has also released a strong model, Mistral Large2, although it is limited to non-commercial use. In Llama 3.1's training, especially S &hellip; <a href=\"https:\/\/devneko.jp\/wordpress\/?p=5206\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Llama 3.1, Mistral Large2, AI models collapse when trained on recursively generated data&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[223,293,390],"class_list":["post-5206","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-llm","tag-oss","tag-synthetic-data"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/5206","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5206"}],"version-history":[{"count":0,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/5206\/revisions"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5206"
}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5206"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5206"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}