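1. College of Applied Technology, Changchun University of Technology, Changchun, Jilin 130000
2. Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, Jilin 130031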
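LIN Jin-hua (1980-), female, from Changchun, Jilin; Ph.D., lecturer. She received her B.S. and M.S. degrees from Xi'an Jiaotong University in 2004 and 2008, respectively, and her Ph.D. from the Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, in 2017. Her research focuses on digital image processing and object recognition. E-mail: ljh3832@163.com
WANG Yan-jie (1963-), male, from Changchun, Jilin; research fellow and doctoral supervisor. He received his B.S. degree from Jilin University of Technology in 1988 and his M.S. degree from the Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, in 1998. His research focuses on digital image processing, information processing, and automatic target recognition. E-mail: wangyj@ciomp.ac.cn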
Received: 2017-10-10
Accepted: 2017-11-06
Print publication date: 2018-05-25
Jin-hua LIN, Yan-jie WANG. Three-dimensional reconstruction of semantic scene based on RGB-D map (Chinese title: 三维语义场景复原网络, "3D semantic scene recovery network")[J]. Optics and Precision Engineering, 2018, 26(5): 1231-1241. DOI: 10.3788/OPE.20182605.1231.
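Inferring the 3D geometry of objects from incomplete visual information is an important capability for a machine vision system, and recognizing the semantics of the objects in a scene is its core function. Traditional methods usually handle these two tasks separately. This paper tightly couples scene recovery with object semantics and proposes a 3D semantic scene recovery network that takes only a single depth map as input and performs both semantic classification and scene recovery for a 3D scene. First, an end-to-end 3D convolutional neural network is built; its input is a depth map, and a 3D context module learns over the region inside the camera view frustum to output 3D voxels with semantic labels. Second, a synthetic 3D scene dataset with dense volumetric labels is built to train the deep learning network. Finally, experiments show that, compared with existing semantic classification and scene recovery methods, the recovery receptive region for semantic scenes increased by 2.0%. The results indicate that the 3D learning network recovers scenes well and labels semantics with high accuracy.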
Reconstruction of 3D objects is an important part of a machine vision system, and semantic understanding of 3D objects is a core function of such a system. In this paper, 3D restoration was combined with semantic understanding of 3D objects, and a 3D semantic scene recovery network was proposed. Semantic classification and scene restoration of a 3D scene were achieved using only a single RGB-D map as input. Firstly, an end-to-end 3D convolutional neural network was established. The input of the network was a depth map. A 3D context module was used to learn the region within the camera view, and 3D voxels with semantic labels were then generated. Secondly, a synthetic dataset with dense volumetric labels was established to train the deep learning network. Finally, the experimental results showed that the recovery performance was improved by 2.0% compared with the state of the art. It can be seen that the 3D learning network performs well in 3D scene restoration, and it achieves high accuracy in semantic annotation of the objects in the scene.
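To make the "depth map in, voxels out" idea concrete, the following is a minimal sketch of how a depth map might be back-projected into an occupancy voxel grid before a 3D network labels each voxel. It is not the authors' network or preprocessing: the function name `depth_to_voxels`, the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the voxel size, and the grid layout are all assumptions chosen for illustration.

```python
import math


def depth_to_voxels(depth, fx, fy, cx, cy, voxel_size, grid_origin, grid_dims):
    """Back-project a depth map into a set of occupied voxel indices.

    depth       : list of rows, each a list of depths in metres (0 = missing)
    fx, fy      : focal lengths in pixels (assumed pinhole camera)
    cx, cy      : principal point in pixels
    voxel_size  : edge length of one cubic voxel in metres
    grid_origin : (x, y, z) of the grid's minimum corner in camera coordinates
    grid_dims   : (nx, ny, nz) number of voxels along each axis
    """
    ox, oy, oz = grid_origin
    nx, ny, nz = grid_dims
    occupied = set()
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels with no valid depth reading
            # Pinhole back-projection from pixel (u, v) to camera-space (x, y, z).
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            # Quantize the 3D point to integer voxel indices.
            i = math.floor((x - ox) / voxel_size)
            j = math.floor((y - oy) / voxel_size)
            k = math.floor((z - oz) / voxel_size)
            # Keep only points that fall inside the grid.
            if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz:
                occupied.add((i, j, k))
    return occupied


# Toy 2x2 depth map with one missing pixel (values are made up).
toy_depth = [[1.0, 1.0],
             [0.0, 2.0]]
voxels = depth_to_voxels(toy_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5,
                         voxel_size=0.5, grid_origin=(-1.0, -1.0, 0.0),
                         grid_dims=(4, 4, 8))
print(sorted(voxels))
```

Such a grid marks only the visible surface. In a semantic scene recovery setting, a learned model would then have to fill in the occluded voxels and assign a class label to each one.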