{"id":8288,"date":"2026-03-11T03:17:00","date_gmt":"2026-03-10T18:17:00","guid":{"rendered":"https:\/\/devneko.jp\/wordpress\/?p=8288"},"modified":"2026-02-28T15:20:31","modified_gmt":"2026-02-28T06:20:31","slug":"sumtablets-a-transliteration-dataset-of-sumerian-tablets","status":"publish","type":"post","link":"https:\/\/devneko.jp\/wordpress\/?p=8288","title":{"rendered":"SumTablets: A Transliteration Dataset of Sumerian Tablets\u00a0"},"content":{"rendered":"\n<ul class=\"wp-block-list\">\n<li><strong>SumTablets: A Transliteration Dataset of Sumerian Tablets\u00a0<\/strong>[28.7]<br>SumTablets is a dataset pairing Unicode representations of 91,606 Sumerian cuneiform tablets with their transliterations. We release SumTablets as a Hugging Face dataset and make the data preparation code open source via GitHub. Our fine-tuned language model achieves an average character-level F-score (chrF) of 97.55.<br><a href=\"http:\/\/arxiv.org\/abs\/2602.22200v1\">Paper<\/a>\u00a0\u00a0<a href=\"https:\/\/fugumt.com\/fugumt\/paper_check\/2602.22200v1\">Reference translation (metadata)<\/a>\u00a0 \u00a0(Wed, 25 Feb 2026 18:50:42 GMT)<\/li>\n\n\n\n<li>The paper describes the dataset as follows: \u201cthe absence of a comprehensive, accessible dataset pairing transliterations with a digital representation of the tablet\u2019s cuneiform glyphs has prevented the application of modern Natural Language Processing (NLP) methods to the task of Sumerian transliteration.  
To address this gap, we present SumTablets, a dataset pairing Unicode representations of 91,606 Sumerian cuneiform tablets (totaling 6,970,407 glyphs) with the associated transliterations published by Oracc.\u201d<\/li>\n\n\n\n<li>The repository is <a href=\"https:\/\/github.com\/colesimmons\/SumTablets\">GitHub &#8211; colesimmons\/SumTablets: SumTablets is a dataset designed for training Sumerian transliteration models.<\/a>, and the dataset is <a href=\"https:\/\/huggingface.co\/datasets\/colesimmons\/SumTablets\">colesimmons\/SumTablets \u00b7 Datasets at Hugging Face<\/a>.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[491],"class_list":["post-8288","post","type-post","status-publish","format-standard","hentry","category-arxiv","tag-491"],"_links":{"self":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/8288","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=8288"}],"version-history":[{"count":1,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/8288\/revisions"}],"predecessor-version":[{"id":8289,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/8288\/revisions\/8289"}],"wp:attachment":[{"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&
parent=8288"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=8288"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devneko.jp\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=8288"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}