1. College of Mathematics and Statistics, Chongqing University, Chongqing 401331, China
2. Changchun Changguang Chenpu Technology Co., Ltd., Changchun, Jilin 130000, China
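NIU Xiaoming (1983-), male, Ph.D. candidate, senior engineer. He received his master's degree from Chongqing University in 2008 and is currently a department chief technologist at the Automation Research Institute of China South Industries Group Corporation. His research covers machine-vision object detection, defect detection, OCR, electro-optical target recognition and tracking, speech recognition, voiceprint recognition, semantic understanding, and robotics. E-mail: niuxiaoming1@163.com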
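ZENG Li (1959-), male, born in Pixian, Sichuan, Ph.D., professor, doctoral supervisor. He received his Ph.D. from Chongqing University in 1997. His research covers industrial CT reconstruction and image processing. E-mail: drlizeng@cqu.edu.cn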
NIU Xiaoming, ZENG Li, YANG Fei, et al. Lightweight deep learning network for accurate localization of optical image components[J]. Optics and Precision Engineering, 2023, 31(17): 2611-2625. DOI: 10.37188/OPE.20233117.2611.
Precise localization in optical images is crucial for improving the efficiency and quality of industrial production. Traditional image-processing localization methods are affected by environmental factors such as lighting and noise, so in complex scenes their accuracy is low and they are easily disturbed. Classical deep learning networks have been widely applied to natural-scene object detection, industrial security inspection, grasping, and defect detection, but they cannot be applied directly to pixel-level precise localization of industrial components, because they require massive training data, rely on large and complex models, and produce redundant, imprecise detection boxes. To address these issues, this paper proposes a lightweight deep learning network for pixel-level precise localization of components in optical images. The network adopts an Encoder-Decoder architecture. The Encoder cascades three bottleneck stages, which reduces the number of feature-extraction parameters while increasing the network's nonlinearity. Corresponding feature layers of the Encoder and Decoder are fused by concatenation, so that the Decoder obtains more high-resolution information after upsampling convolution and can reconstruct the details of the original image more completely. Finally, a weighted Hausdorff distance relates the Decoder's output layer to the localization coordinates. Experimental results show that the model's parameters occupy 57.4 kB and that the recognition rate at a localization error of no more than 5 pixels is at least 99.5%. The approach thus largely meets the industrial requirements of high localization precision, high accuracy, and strong resistance to interference.
Keywords: machine vision; optical image; deep learning; precise localization; lightweight
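The abstract states that a weighted Hausdorff distance links the Decoder's output layer to the localization coordinates, but it does not give the formula. The sketch below is only an illustration. It assumes the formulation introduced by Ribera et al. for locating objects without bounding boxes, in which the output is a per-pixel probability map and the hard minimum over pixels is replaced by a generalized mean. The function name `weighted_hausdorff` and the parameters `alpha` and `eps` are hypothetical choices, not taken from the paper, and pure Python is used for readability rather than speed.

```python
import math


def weighted_hausdorff(prob_map, targets, alpha=-1.0, eps=1e-6):
    """Weighted Hausdorff distance between a probability map and target points.

    prob_map: 2D list of values in [0, 1], indexed [row][col] (assumed layout).
    targets:  non-empty list of (row, col) ground-truth coordinates.
    alpha:    exponent of the generalized mean; negative values approximate a minimum.
    """
    rows, cols = len(prob_map), len(prob_map[0])
    d_max = math.hypot(rows, cols)  # upper bound on any pixel-to-pixel distance
    pixels = [(r, c) for r in range(rows) for c in range(cols)]

    # Term 1: distance from each pixel to its nearest target, weighted by p_x.
    total_p = sum(prob_map[r][c] for r, c in pixels)
    term1 = sum(
        prob_map[r][c] * min(math.dist((r, c), t) for t in targets)
        for r, c in pixels
    ) / (total_p + eps)

    # Term 2: for each target, a soft minimum over all pixels of
    # p_x * d(x, y) + (1 - p_x) * d_max, so low-probability pixels look far away.
    term2 = 0.0
    for t in targets:
        costs = [
            prob_map[r][c] * math.dist((r, c), t)
            + (1.0 - prob_map[r][c]) * d_max
            + eps  # keeps the cost positive so a negative power is defined
            for r, c in pixels
        ]
        term2 += (sum(v ** alpha for v in costs) / len(costs)) ** (1.0 / alpha)
    term2 /= len(targets)

    return term1 + term2


# Usage: a map that peaks on the target should score lower than one that peaks elsewhere.
on_target = [[0.0] * 5 for _ in range(5)]
on_target[2][2] = 1.0
off_target = [[0.0] * 5 for _ in range(5)]
off_target[0][0] = 1.0
print(weighted_hausdorff(on_target, [(2, 2)]))
print(weighted_hausdorff(off_target, [(2, 2)]))
```

Because both terms are built from sums and powers of the probabilities, a quantity of this form stays differentiable with respect to the map, which is what makes it usable as a training loss; whether the paper uses exactly these terms and weights is not stated in the abstract.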
