1. College of Mathematics and Statistics, Chongqing University, Chongqing 401331, China
2. Changchun Changguang Chenpu Technology Co., Ltd., Changchun 130000, Jilin, China
NIU Xiaoming (1983-), male, Ph.D. candidate, senior engineer. He received his M.S. degree from Chongqing University in 2008 and is now a department chief technical engineer at the Automation Research Institute of China South Industries Group Corporation. His research interests include machine-vision object detection, defect detection, OCR, electro-optical target recognition and tracking, speech recognition, voiceprint recognition, semantic understanding, and robotics. E-mail: niuxiaoming1@163.com
ZENG Li (1959-), male, born in Pixian, Sichuan, Ph.D., professor, doctoral supervisor. He received his Ph.D. from Chongqing University in 1997. His research interests include industrial CT reconstruction and image processing. E-mail: drlizeng@cqu.edu.cn
NIU Xiaoming, ZENG Li, YANG Fei, et al. Lightweight deep learning network for accurate localization of optical image components[J]. Optics and Precision Engineering, 2023, 31(17): 2611-2625. DOI: 10.37188/OPE.20233117.2611.
Precise optical image localization is crucial for improving industrial production efficiency and quality. Traditional image-processing localization methods suffer from low accuracy in complex scenes and are easily disturbed by environmental factors such as illumination and noise. Although classical deep learning networks are widely used in natural-scene object detection, industrial security inspection, grasping, and defect detection, their need for massive training data, their large and complex models, and the redundancy and imprecision of their detection boxes prevent them from being applied directly to pixel-level localization of industrial components. To address these problems, this paper constructs a lightweight deep learning network for pixel-level localization of component optical images. The network adopts an Encoder-Decoder architecture. The Encoder cascades three bottleneck stages, which reduces the number of feature-extraction parameters while substantially increasing the network's nonlinearity. Corresponding feature layers of the Encoder and Decoder are fused by concatenation, so that the Decoder obtains more high-resolution information after upsampling convolution and reconstructs the detail of the original image more completely. Finally, a weighted Hausdorff distance establishes the relationship between the Decoder's output layer and the localization coordinate points. Experimental results show that the lightweight localization network occupies only 57.4 kB of parameters, and the recognition rate at a localization error of at most 5 pixels is at least 99.5%, essentially meeting the industrial requirements of high localization precision, high accuracy, and strong resistance to interference.
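The weighted Hausdorff distance that relates a Decoder's probability-map output to a set of ground-truth coordinate points can be sketched as follows. This is a minimal NumPy illustration only, assuming the common simplified form in which the generalized mean over pixels is replaced by a minimum; the function and variable names are illustrative and do not reproduce the authors' implementation.

```python
import numpy as np

def weighted_hausdorff_distance(prob_map, points, d_max=None, eps=1e-6):
    """Weighted Hausdorff-style distance between a [0, 1] probability map
    of shape (h, w) and ground-truth (row, col) points (illustrative sketch)."""
    h, w = prob_map.shape
    if d_max is None:
        d_max = float(np.hypot(h, w))        # image diagonal as saturation distance
    # Coordinates of every pixel, shape (h*w, 2).
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    p = prob_map.ravel()                     # per-pixel "is a keypoint" probability
    pts = np.asarray(points, dtype=float)    # (n, 2) ground-truth points
    # Pairwise Euclidean distances, shape (h*w, n).
    d = np.linalg.norm(grid[:, None, :] - pts[None, :, :], axis=2)
    # Map -> points: pixels with high probability should lie near some point.
    term1 = (p * d.min(axis=1)).sum() / (p.sum() + eps)
    # Points -> map: every point should be near some high-probability pixel;
    # low-probability pixels are penalized toward d_max.
    term2 = np.min(p[:, None] * d + (1.0 - p[:, None]) * d_max, axis=0).mean()
    return term1 + term2
```

With a probability map that puts all its mass exactly on the ground-truth pixel, both terms vanish; mass placed away from the point inflates both terms, which is what makes the measure usable as a training loss for point localization.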
Key words: machine vision; optical image; deep learning; precise localization; lightweight