FPGA-based hardware acceleration for CNNs developed using high-level synthesis

Chu-liang WEI; Ru-lin CHEN; Qian GAO; Zheng-long SUN

doi:10.3788/OPE.20202805.1212

Information Sciences | Updated: 2020-07-09

- Optics and Precision Engineering Vol. 28, Issue 5, Pages: 1212-1219(2020)
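- Author affiliations:
  
  1. Department of Electronic Engineering, Shantou University, Shantou, Guangdong 515063, China
  2. Shenzhen Institute of Artificial Intelligence and Robotics, Shenzhen, Guangdong 518054, China
  3. School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Shenzhen, Guangdong 518172, China
- Author biographies:
  
  WEI Chu-liang (b. 1979), male, from Jieyang, Guangdong; Ph.D., associate professor. He received his Ph.D. from the University of Liverpool, UK, in 2006. His research covers artificial intelligence, robotics, sensing technology, intelligent industrial control, transportation safety inspection, and FPGA design. E-mail: clwei@stu.edu.cn
  CHEN Ru-lin (b. 1998), male, from Maoming, Guangdong. His research covers machine learning and FPGA design. E-mail: 16rlchen@stu.edu.cn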
- DOI: 10.3788/OPE.20202805.1212
  CLC: TP18; TP391.4
- Received: 10 December 2019
  Revised: 03 February 2020
  Accepted: 03 February 2020
  Published: 25 May 2020

Chu-liang WEI, Ru-lin CHEN, Qian GAO, et al. FPGA-based hardware acceleration for CNNs developed using high-level synthesis[J]. Optics and Precision Engineering, 2020, 28(5): 1212-1219. DOI: 10.3788/OPE.20202805.1212.

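Abstract (translated from the Chinese)

To address hardware acceleration of neural-network forward propagation, a hardware-acceleration system for the forward pass of AlexNet was designed using the FPGA programming tool Vivado HLS. The system meets the requirements of the target applications while saving development time and reducing development cost. The FPGA circuits are simulated and developed in the high-level language C++, and the PIPELINE and ARRAY_PARTITION directives of Vivado HLS, which are convenient and reliable, are used to optimize the system. Experimental results show that AlexNet runs in 21.95 ms on the FPGA acceleration system built in this work, compared with 70 ms on a conventional GPU platform, a speedup of more than three times. In addition, each network layer is encapsulated separately, so the system can be readily ported to other mature convolutional neural networks. This accelerates the adoption of deep learning in a range of artificial-intelligence systems and gives the design broad application value in intelligent industry.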

Abstract

To accelerate the forward-propagation process of deep-learning networks, a field-programmable gate array (FPGA) hardware-acceleration system for AlexNet was developed using Vivado High-Level Synthesis (HLS), which can greatly reduce the FPGA development cost. Using Vivado HLS, developers can design hardware architectures on an FPGA platform using C/C++ code instead of a hardware-description language. We implemented AlexNet on an FPGA platform using the HLS tool and then used the PIPELINE and ARRAY_PARTITION directives to optimize the proposed system. An evaluation of the proposed system shows that it runs three times as fast as a conventional graphics processing unit (GPU) platform. Owing to the high-level encapsulation, the developed system can easily be adapted to other convolutional neural networks for practical use, which demonstrates its portability and practical application value.
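The two directives named in the abstract can be illustrated with a minimal sketch of a single convolution layer written in HLS-style C++. This is not the authors' code: the function name `conv_layer`, the buffer `w_buf`, and the sizes `IN`, `K`, and `OUT` are hypothetical and far smaller than any AlexNet layer. The sketch assumes the common pattern of fully partitioning a small weight buffer so that all weights can be read in the same cycle, and pipelining the per-pixel loop. A standard C++ compiler ignores the `#pragma HLS` lines, so the function can be checked in software before synthesis.

```cpp
// Hypothetical sizes for illustration only; not the paper's AlexNet dimensions.
constexpr int K = 3;             // kernel height and width
constexpr int IN = 8;            // input feature-map height and width
constexpr int OUT = IN - K + 1;  // output size of a "valid" convolution

// One layer wrapped in its own function, in the spirit of the per-layer
// encapsulation described in the abstract (assumed structure).
void conv_layer(const float in[IN][IN], const float w[K][K], float bias,
                float out[OUT][OUT]) {
    // Local copy of the weights, split into individual registers so that
    // all K*K values are readable in one clock cycle.
    float w_buf[K][K];
#pragma HLS ARRAY_PARTITION variable=w_buf complete dim=0
    for (int i = 0; i < K; ++i) {
        for (int j = 0; j < K; ++j) {
            w_buf[i][j] = w[i][j];
        }
    }

    for (int r = 0; r < OUT; ++r) {
        for (int c = 0; c < OUT; ++c) {
            // Request one new output pixel per cycle; HLS unrolls the
            // two inner kernel loops when this loop is pipelined.
#pragma HLS PIPELINE II=1
            float acc = bias;
            for (int i = 0; i < K; ++i) {
                for (int j = 0; j < K; ++j) {
                    acc += in[r + i][c + j] * w_buf[i][j];
                }
            }
            out[r][c] = acc > 0.0f ? acc : 0.0f;  // ReLU activation
        }
    }
}
```

Whether the requested initiation interval is actually met depends on the target device and on how the input array is stored; in this sketch the unpartitioned `in` array would likely limit throughput, which is the kind of bottleneck ARRAY_PARTITION is meant to relieve.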
