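1. Department of Electronic Engineering, Shantou University, Shantou, Guangdong 515063
2. Shenzhen Institute of Artificial Intelligence and Robotics, Shenzhen, Guangdong 518054
3. School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Shenzhen, Guangdong 518172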
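WEI Chu-liang (1979-), male, born in Jieyang, Guangdong; Ph.D., associate professor. He received his Ph.D. from the University of Liverpool, UK, in 2006. His research interests include artificial intelligence, robotics, sensing technology, intelligent industrial control, transportation safety inspection, and FPGA design. E-mail: clwei@stu.edu.cn
CHEN Ru-lin (1998-), male, born in Maoming, Guangdong. His research interests include machine learning and FPGA design. E-mail: 16rlchen@stu.edu.cn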
Received: 10 December 2019
Revised: 03 February 2020
Accepted: 03 February 2020
Published: 25 May 2020
Chu-liang WEI, Ru-lin CHEN, Qian GAO, et al. FPGA-based hardware acceleration for CNNs developed using high-level synthesis[J]. Optics and Precision Engineering, 2020, 28(5): 1212-1219. DOI: 10.3788/OPE.20202805.1212.
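To address hardware acceleration of neural-network forward propagation, a forward-propagation hardware-acceleration system for the AlexNet neural network was designed using Vivado HLS, a programming tool for FPGAs. The system meets the requirements of the target applications while effectively saving development time and reducing development cost. The FPGA circuits were simulated and developed in the high-level language C++, and the PIPELINE and ARRAY_PARTITION directives of Vivado HLS, which are convenient and reliable, were used to optimize the system. Experimental results show that AlexNet runs in 21.95 ms on the proposed FPGA acceleration system, compared with 70 ms on a conventional GPU platform, making it more than three times faster. In addition, each network layer is encapsulated separately, so the system can be readily ported to other mature convolutional neural networks, which speeds the adoption of deep learning in artificial-intelligence systems and gives the design broad application value in intelligent industries.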
To accelerate the forward-propagation process of deep-learning networks, a field-programmable gate array (FPGA) hardware-acceleration system for AlexNet was developed using Vivado High-Level Synthesis (HLS), which can greatly reduce the FPGA development cost. Using Vivado HLS, developers can design hardware architectures on an FPGA platform using C/C++ code instead of a hardware-description language. We implemented AlexNet on an FPGA platform using the HLS tool and then used the PIPELINE and ARRAY_PARTITION directives to optimize the proposed system. An evaluation shows that the proposed system runs more than three times faster than a conventional graphics processing unit (GPU) platform. In the future, owing to the high-level encapsulation, the developed system can be easily ported to other convolutional neural networks for practical operation, which demonstrates its portability and practical application value.
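As a rough illustration of how the PIPELINE and ARRAY_PARTITION directives mentioned above are typically applied, the following is a minimal sketch of a single-channel convolution written in HLS-style C++. It is not the authors' implementation: the function name `conv2d`, the array sizes, and the pragma placement are assumptions chosen for illustration and do not reflect the AlexNet layer dimensions. A standard C++ compiler ignores the `#pragma HLS` lines (possibly with an unknown-pragma warning), so the function can be checked in software before synthesis.

```cpp
// Hypothetical sizes for illustration only; not taken from the paper.
constexpr int IN_H = 6, IN_W = 6, K = 3;
constexpr int OUT_H = IN_H - K + 1, OUT_W = IN_W - K + 1;

// Single-channel "valid" convolution (stride 1, no padding) plus a bias term.
void conv2d(const float in[IN_H][IN_W], const float w[K][K], float bias,
            float out[OUT_H][OUT_W]) {
    // Assumed directive placement: split the small kernel array into
    // individual registers so all K*K weights can be read in parallel.
#pragma HLS ARRAY_PARTITION variable=w complete dim=0
    for (int r = 0; r < OUT_H; ++r) {
        for (int c = 0; c < OUT_W; ++c) {
            // Assumed directive placement: request one output pixel per
            // clock cycle; the tool unrolls the two inner loops to do so.
#pragma HLS PIPELINE II=1
            float acc = bias;
            for (int i = 0; i < K; ++i) {
                for (int j = 0; j < K; ++j) {
                    acc += in[r + i][c + j] * w[i][j];
                }
            }
            out[r][c] = acc;
        }
    }
}
```

Whether an initiation interval of 1 is actually achieved depends on the memory ports available for `in` and on the target device, so the reported interval in the synthesis log should be checked rather than assumed.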