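School of Electronic and Information Engineering, Shenyang Aerospace University, Shenyang 110000, Liaoning, China
ZHANG Lili (1979-), female, born in Nehe, Heilongjiang Province; Ph.D., associate professor, master's supervisor. She received her B.S., M.S., and Ph.D. degrees from Jilin University in 2002, 2005, and 2012, respectively. Her research focuses on FPGA system design and deep learning algorithms. E-mail: 20052727@sau.edu.cn
CHEN Zhen (1996-), male, born in Shangqiu, Henan Province; master's student. He received his B.S. degree from Shenyang Aerospace University in 2020. His research focuses on deep learning and FPGA algorithms. E-mail: chenzhen_1996@qq.com
Received: 2022-06-02
Revised: 2022-07-14
Published in print: 2023-02-25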
ZHANG Lili, CHEN Zhen, LIU Yuxuan, et al. Yolo v3-SPP real-time target detection system based on ZYNQ[J]. Optics and Precision Engineering, 2023, 31(04): 543-551. DOI: 10.37188/OPE.20233104.0543.
Target detection algorithms based on convolutional neural networks are developing rapidly, and their growing computational complexity places increasing demands on device performance and power consumption. To enable deployment on embedded devices, this study adopts a hardware/software co-design approach, uses an FPGA to accelerate the algorithm in hardware, and proposes a Yolo v3-SPP target detection system for the ZYNQ platform. The system is deployed on the XCZU15EG chip, and its power consumption, hardware resource usage, and performance are analyzed. The network model is first optimized and trained on the Pascal VOC 2007 dataset; the trained model is then quantized and compiled with the Vitis AI tool so that it can be deployed on the ZYNQ device. To select the best configuration, the effect of each configuration on hardware resources and system performance is examined, and the system is evaluated in terms of power consumption (W), detection speed (FPS), mean average precision (mAP) across categories, and output error. The results show that, at a clock frequency of 300 MHz and an input image size of (416, 416), the Yolo v3-SPP and Yolo v3-Tiny networks reach detection speeds of 38.44 FPS and 177 FPS with mAPs of 80.35% and 68.55%, respectively; the on-chip power consumption is 21.583 W and the whole-board power consumption is 23.02 W. The system therefore meets the requirements of low power consumption, real-time operation, and high detection accuracy for deploying neural network models on embedded devices.
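The reported throughput figures can be converted into per-frame latency, and, under an assumption, into energy per frame. The following is a minimal sketch rather than anything taken from the paper: the function names are hypothetical, and the energy figure assumes that the 23.02 W whole-board power was drawn while each network was running, which the abstract does not state per network.

```python
# Figures reported in the abstract (300 MHz clock, 416x416 input).
REPORTED_FPS = {"Yolo v3-SPP": 38.44, "Yolo v3-Tiny": 177.0}
BOARD_POWER_W = 23.02  # whole-board power reported in the abstract


def frame_latency_ms(fps: float) -> float:
    """Average time per frame in milliseconds at a given throughput."""
    return 1000.0 / fps


def energy_per_frame_j(power_w: float, fps: float) -> float:
    """Energy per frame in joules.

    ASSUMPTION: power_w is the power actually drawn while running at `fps`.
    The abstract gives a single power figure, so applying it to both
    networks is an illustration only.
    """
    return power_w / fps


if __name__ == "__main__":
    for name, fps in REPORTED_FPS.items():
        latency = frame_latency_ms(fps)
        energy = energy_per_frame_j(BOARD_POWER_W, fps)
        print(f"{name}: {latency:.2f} ms/frame, {energy:.3f} J/frame (assumed power)")
```

Both latencies fall below the 40 ms frame period of a 25 FPS video stream, which is one common way to read the real-time claim; the energy values should be treated as rough estimates until per-network power measurements are available.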