Steel Surface Defect Detection Algorithm Based on Dual Attention Mechanism Fusion with YOLOv12

Authors

  • Yajie Ouyang, Jiangxi University of Finance and Economics, Nanchang, 330013, China
  • Neng Wen, Jiangxi University of Finance and Economics, Nanchang, 330013, China
  • Miaoling Miao, Jiangxi University of Finance and Economics, Nanchang, 330013, China

DOI:

https://doi.org/10.54097/jab1g959

Keywords:

Steel defect detection, YOLOv12, Computer vision, Coordinate attention

Abstract

To reduce the reliance on manual visual inspection and the frequent missed detection of subtle defects against complex backgrounds in steel surface inspection, this paper proposes CA-ViT-YOLOv12, an improved detection algorithm based on dual attention mechanism fusion. Built on the YOLOv12 framework, the algorithm integrates a Coordinate Attention (CA) module and a Vision Transformer (ViT) module. The CA module captures critical features across the spatial and channel dimensions, which improves the localization of minute flaws. The ViT module uses self-attention to establish global semantic dependencies, which mitigates the limited receptive field of conventional convolutional neural networks and improves robustness under multi-texture interference and illumination variation. In experiments on a dataset of 5,000 real-world steel surface images from diverse scenarios, the CA-ViT-YOLOv12 fusion model achieves a mean Average Precision (mAP@0.5) of 0.809 across all categories, which the authors report as significantly outperforming baseline methods. The model provides algorithmic support for automated, high-precision quality inspection on steel production lines.
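The mAP@0.5 figure quoted above counts a predicted box as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5, then averages the per-class average precision. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function names, the greedy matching rule, the all-point interpolation, the class names, and the toy boxes are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds: List[Tuple[float, Box]],
                      truths: List[Box],
                      thr: float = 0.5) -> float:
    """AP for one class: area under the interpolated precision-recall curve.

    preds holds (confidence, box) pairs. Each prediction is matched greedily
    to the best still-unmatched ground-truth box (an assumed matching rule).
    """
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    hits: List[bool] = []
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            v = iou(box, t)
            if not matched[j] and v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= thr:
            matched[best_j] = True   # true positive
            hits.append(True)
        else:
            hits.append(False)       # false positive
    # Running precision and recall after each ranked prediction.
    tp = 0
    precisions: List[float] = []
    recalls: List[float] = []
    for k, hit in enumerate(hits, start=1):
        tp += int(hit)
        precisions.append(tp / k)
        recalls.append(tp / len(truths))
    # Make precision non-increasing from right to left (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_ap(per_class: Dict[str, Tuple[List[Tuple[float, Box]], List[Box]]]) -> float:
    """Mean of per-class AP values at the default IoU threshold of 0.5."""
    aps = [average_precision(p, t) for p, t in per_class.values()]
    return sum(aps) / len(aps)


# Toy data: two ground-truth boxes, three ranked predictions.
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 28))]
ap_toy = average_precision(preds, truths)
map_toy = mean_ap({"scratch": (preds, truths), "pit": ([], [(0, 0, 1, 1)])})
```

On the toy data the first and third predictions match a ground-truth box and the second does not, so the single-class AP is 5/6. A reported value such as 0.809 would be the same kind of average taken over every defect category in the test set.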

Published

26-06-2026

How to Cite

Ouyang, Y., Wen, N., & Miao, M. (2026). Steel Surface Defect Detection Algorithm Based on Dual Attention Mechanism Fusion with YOLOv12. Highlights in Science, Engineering and Technology, 163, 84-96. https://doi.org/10.54097/jab1g959