Monocular vehicle pose estimation based on 3D model

Ling-zhi XU; Qin-wei FU; Wei TAO; Hui ZHAO

doi:10.37188/OPE.20212906.1346

Modern Applied Optics | Updated: 2021-07-03

- Optics and Precision Engineering Vol. 29, Issue 6, Pages: 1346-1355 (2021)
- Author affiliation: Department of Instrument Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240
- Author information: E-mail: huizhao@sjtu.edu.cn
- DOI: 10.37188/OPE.20212906.1346
  CLC: TP183; TP277
- Received: 10 August 2020, Revised: 01 September 2020, Published: 15 June 2021
XU Ling-zhi, FU Qin-wei, TAO Wei, et al. Monocular vehicle pose estimation based on 3D model[J]. Optics and Precision Engineering, 2021, 29(06): 1346-1355. DOI: 10.37188/OPE.20212906.1346.

Abstract

Vehicle pose estimation is an important component of intelligent transportation systems. However, complex motion scenes and the inability of a monocular camera to capture depth information make the task challenging. This paper proposes a method that combines a monocular camera with a 3D vehicle model to estimate vehicle pose. First, multi-scale vehicle targets are normalized to a common scale; the coordinates of the vehicle key points are then regressed in the form of a vector field, which improves pose estimation accuracy for occluded or truncated vehicles. Within this process, a distance-weighted loss function for the vector field and a voting method based on key point error minimization are introduced to further improve the accuracy of the pose estimation algorithm. In addition, a synthetic vehicle pose estimation dataset with rich annotation information is constructed. Validation on this dataset shows that the average position and angle errors of the proposed algorithm are 0.162 m and 4.692°, respectively. The method provides significant improvements over existing methods and has considerable practical application value.
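
The abstract does not give the details of the vector-field representation or the voting rule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each pixel predicts a 2D unit vector pointing toward a key point, and it recovers the key point as the weighted least-squares point closest to all of the resulting rays. The function names `unit_vector` and `vote_keypoint` and the optional `weights` argument are hypothetical and introduced purely for illustration.

```python
import math


def unit_vector(pixel, keypoint):
    """Unit vector pointing from a pixel toward a key point (assumed regression target)."""
    dx, dy = keypoint[0] - pixel[0], keypoint[1] - pixel[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return (0.0, 0.0)  # pixel coincides with the key point
    return (dx / norm, dy / norm)


def vote_keypoint(pixels, directions, weights=None):
    """Weighted least-squares point closest to all rays (pixel, direction).

    Minimizes sum_i w_i * ||(I - d_i d_i^T)(k - p_i)||^2 over the key point k.
    The weights are a hypothetical per-pixel confidence, not taken from the paper.
    """
    if weights is None:
        weights = [1.0] * len(pixels)
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), (dx, dy), w in zip(pixels, directions, weights):
        # Projector onto the direction perpendicular to the ray: I - d d^T
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += w * m11
        a12 += w * m12
        a22 += w * m22
        b1 += w * (m11 * px + m12 * py)
        b2 += w * (m12 * px + m22 * py)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("rays are (nearly) parallel; key point is not determined")
    # Solve the symmetric 2x2 system A k = b by Cramer's rule
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


if __name__ == "__main__":
    # The key point lies outside the visible pixels, as for a truncated vehicle.
    visible = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (7.0, 3.0)]
    field = [unit_vector(p, (20.0, -5.0)) for p in visible]
    print(vote_keypoint(visible, field))
```

Because every visible pixel contributes a direction, the key point can be estimated even when it lies outside the visible part of the vehicle, which is consistent with the abstract's claim of improved accuracy under occlusion or truncation. In a complete pipeline, the recovered 2D key points would then be matched to the corresponding points on the 3D vehicle model to solve for the pose; that step is not shown here.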
