Department of Instrument Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
E-mail: huizhao@sjtu.edu.cn
Received: 10 August 2020
Revised: 01 September 2020
Published: 15 June 2021
XU Ling-zhi, FU Qin-wei, TAO Wei, et al. Monocular vehicle pose estimation based on 3D model[J]. Optics and Precision Engineering, 2021, 29(06): 1346-1355. DOI: 10.37188/OPE.20212906.1346.
Vehicle pose estimation is an important component of intelligent transportation systems, but complex motion scenes and the inability of a monocular camera to capture depth information make it challenging. This paper proposes a method that combines a monocular camera with a 3D vehicle model to estimate vehicle pose. First, vehicle targets of different scales are normalized to a common scale. The coordinates of the vehicle key points are then regressed in the form of a vector field, which improves pose estimation accuracy for occluded or truncated vehicles. Within this process, a distance-weighted loss function for the vector field and a voting method that minimizes key point error are introduced to further improve the accuracy of the pose estimation algorithm. In addition, a synthetic vehicle pose estimation dataset with rich annotation information is constructed. Validation on this dataset shows that the average position and angle errors of the proposed algorithm are 0.162 m and 4.692°, respectively. The method provides significant improvements over existing methods and has considerable practical application value.
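The vector-field idea in the abstract can be illustrated with a small NumPy sketch. It is not the paper's implementation: the abstract does not give the form of the distance weighting or of the voting rule, so the exponential weight, the `sigma` parameter, the least-squares voting, and all function names below are assumptions chosen for illustration. Each pixel stores a unit vector pointing toward a key point, and the key point is recovered as the point closest (in the least-squares sense) to all the rays those vectors define, which still works when the key point lies outside the visible crop.

```python
import numpy as np

def unit_vector_field(pixels, keypoint):
    """Unit vectors from each pixel (N, 2) toward a 2D key point, plus distances."""
    diff = keypoint[None, :] - pixels
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    return diff / np.maximum(dist, 1e-8), dist[:, 0]

def distance_weighted_l1(pred, target, dist, sigma=50.0):
    """Weighted L1 loss between two vector fields.

    Assumption: pixels nearer the key point get larger weight, exp(-dist/sigma).
    The weighting used in the paper is not stated in the abstract.
    """
    w = np.exp(-dist / sigma)
    return float(np.sum(w[:, None] * np.abs(pred - target)) / np.sum(w))

def vote_keypoint(pixels, directions):
    """Point minimizing the summed squared perpendicular distance to all rays.

    Assumption: this least-squares rule stands in for the paper's
    "key point error minimization" voting, whose details are not given here.
    """
    d = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    # For each ray, I - d d^T projects onto the direction perpendicular to it.
    proj = np.eye(2)[None] - d[:, :, None] * d[:, None, :]
    A = proj.sum(axis=0)
    b = np.einsum("nij,nj->i", proj, pixels)
    return np.linalg.solve(A, b)

# Synthetic example: pixels inside a 128 x 128 crop, key point outside it,
# mimicking a truncated vehicle.
rng = np.random.default_rng(0)
pixels = rng.uniform(0, 128, size=(200, 2))
keypoint = np.array([150.0, 40.0])

field, dist = unit_vector_field(pixels, keypoint)
noisy_field = field + rng.normal(scale=0.02, size=field.shape)

estimate = vote_keypoint(pixels, noisy_field)
loss = distance_weighted_l1(noisy_field, field, dist)
print("estimated key point:", estimate)
print("weighted L1 loss:", loss)
```

With 2D key points recovered this way and the matching points on the 3D vehicle model, a PnP solver could then yield the pose; that step is omitted here.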