Myriad: multi-modal model by applying vision experts for industrial anomaly detection

Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection [82.2]
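We propose Myriad, a novel large multimodal model that applies vision experts to industrial anomaly detection. Specifically, we adopt MiniGPT-4 as the base LMM and design an Expert Perception module that embeds the prior knowledge of vision experts as tokens intelligible to Large Language Models (LLMs). To compensate for the errors and confusion of the vision experts, we introduce a domain adapter that bridges the visual representation gap between generic and industrial images.
Reference translation of the paper (metadata) (Sun, 29 Oct 2023 16:49:45 GMT)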
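One of those reports where, as I think every so often, the acronym feels like a stretch: it apparently stands for "multi-modal model by applying vision experts for industrial anomaly detection"...
The reported result is "Experiments show that our proposed Myriad not only achieves superior performance than both vision experts and state-of-the-art methods, but also provide detailed description for industrial anomaly detection." Getting an explanation together with the anomaly detection result is important.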
The repository is GitHub – tzjtatata/Myriad: Open-sourced codes, IAD vision-language datasets and pre-trained checkpoints for Myriad.
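
To make the "vision expert prior as LLM-readable tokens" idea from the abstract a little more concrete, here is a toy sketch in plain Python. It is not the authors' implementation. It assumes, purely for illustration, that the vision expert outputs a square anomaly map, that the map is average-pooled into patches, and that each patch score is linearly projected into an embedding that is placed alongside image and prompt tokens. All function names, shapes, and weights below are hypothetical.

```python
# Toy illustration only: every name and number here is an assumption,
# not taken from the Myriad paper or repository.

def pool_anomaly_map(anomaly_map, grid):
    """Average-pool a square anomaly map into grid x grid patch scores."""
    size = len(anomaly_map)
    step = size // grid
    pooled = []
    for gy in range(grid):
        for gx in range(grid):
            patch = [
                anomaly_map[y][x]
                for y in range(gy * step, (gy + 1) * step)
                for x in range(gx * step, (gx + 1) * step)
            ]
            pooled.append(sum(patch) / len(patch))
    return pooled


def project_to_tokens(patch_scores, weights, bias):
    """Map each scalar patch score to an embedding: token = score * w + b."""
    return [[s * w + b for w, b in zip(weights, bias)] for s in patch_scores]


def build_llm_input(image_tokens, expert_tokens, prompt_tokens):
    """Concatenate the token groups into one sequence for a (hypothetical) LLM."""
    return image_tokens + expert_tokens + prompt_tokens


# A 4x4 anomaly map with a defect in the bottom-right corner.
anomaly_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
patch_scores = pool_anomaly_map(anomaly_map, grid=2)

# Fixed toy projection into a 3-dimensional embedding space.
expert_tokens = project_to_tokens(
    patch_scores, weights=[0.5, -1.0, 2.0], bias=[0.1, 0.0, 0.0]
)

# Placeholder embeddings standing in for image and text-prompt tokens.
image_tokens = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
prompt_tokens = [[1.0, 1.0, 1.0]]
sequence = build_llm_input(image_tokens, expert_tokens, prompt_tokens)
```

In this toy setup only the last expert token differs from the bias vector, so the location of the defect is preserved in the token sequence. A real system would learn the projection and use far richer features.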
