{"id":5358,"date":"2024-08-28T04:02:00","date_gmt":"2024-08-27T19:02:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=5358"},"modified":"2024-08-28T04:02:00","modified_gmt":"2024-08-27T19:02:00","slug":"jpeg-lm-llms-as-image-generators-with-canonical-codec-representations","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=5358","title":{"rendered":"JPEG-LM: LLMs as Image Generators with Canonical Codec Representations"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>JPEG-LM: LLMs as Image Generators with Canonical Codec Representations\u00a0<\/strong>[51.1]<br>Discretization represents continuous data such as images and video as discrete tokens. A common way to discretize images and video is to model raw pixel values. This work proposes modeling images and video directly as the compressed files stored on computers via canonical codecs (JPEG, AVC\/H.264).<br><a href=\"http:\/\/arxiv.org\/abs\/2408.08459v2\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2408.08459v2\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Wed, 21 Aug 2024 00:24:53 GMT)<\/li>\n\n\n\n<li>A proposal for an L(?)M that handles JPEG directly. The paper states: \"For generality, our models also do not use any vision-specific modules like convolutions or 2D positional embeddings, potentially making the task more challenging.\" and \"However, we observe that conventional, vanilla language modeling surprisingly conquers these challenges without special designs as training goes (e.g., JPEG-LM generates realistic images barely with any corrupted JPEG patches).\" The architecture is a 7B Llama-2 model, and it is genuinely powerful.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[565,615],"class_list":["post-5358","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-565","tag-615"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/5358","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5358"}],"version-history":[{"count":0,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/5358\/revisions"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5358"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5358"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5358"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}