Exploring aligned latent representations for cross-domain face recognition

Yue MING; Shao-Ying WANG; Chun-Xiao FAN; Jiang-Wan ZHOU

doi:10.37188/OPE.20202810.2311

Information Science | Updated: 2020-11-16

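- 对齐特征表示的跨模态人脸识别 (Cross-modal face recognition with aligned feature representations)
- Exploring aligned latent representations for cross-domain face recognition
- Optics and Precision Engineering, 2020, 28(10): 2311-2322
- Affiliation:
  
  School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876
- Author biographies:
  
  MING Yue (1984-), female, from Beijing, associate professor and doctoral supervisor. She received her bachelor's degree in 2006, her master's degree in 2008, and her doctorate in 2013, all from Beijing Jiaotong University. Her research focuses on pattern recognition and machine learning. E-mail: myname35875235@126.com
  
  WANG Shao-Ying (1994-), female, from Shandong, master's student. She received her bachelor's degree from the Communication University of China in 2017 and her master's degree from Beijing University of Posts and Telecommunications in 2020. Her research focuses on face recognition. E-mail: 13693378978@163.com
- Funding:
  
  National Natural Science Foundation of China (62076030); Beijing Natural Science Foundation (L182033); Fundamental Research Funds for the Central Universities (2019PTB-001)
- DOI: 10.37188/OPE.20202810.2311
  Chinese Library Classification (CLC) number: TP394.1; TH691.9
- Received: 2020-07-09
  
  Revised: 2020-07-30
  
  Accepted: 2020-07-30
  
  Published in print: 2020-10-25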
Yue MING, Shao-Ying WANG, Chun-Xiao FAN, et al. Exploring aligned latent representations for cross-domain face recognition[J]. Optics and Precision Engineering, 2020, 28(10): 2311-2322. DOI: 10.37188/OPE.20202810.2311.

Abstract

Cross-domain face recognition (FR) has long been an active research topic in face recognition, with high practical value in real-world settings such as security and criminal investigation. Existing cross-domain FR methods usually relate faces from different domains in either the image space or the latent space, but ignore the intrinsic connection between the two spaces, which easily leads to a loss of cross-domain information. To address this problem, this paper proposes a cross-domain FR algorithm based on aligned feature representations, called Cross-Domain Representation Alignment (CDRA). CDRA explores the correlation between face data from different domains in both the image space and the latent space, and both within and across domains. First, to reduce information loss, CDRA learns intra-domain latent feature representations that carry discriminative information by reconstructing faces within a single domain. Then, in the image space, CDRA reconstructs images across domains from the latent feature representations of the different domains, which indirectly aligns those representations. In the latent space, CDRA aligns the latent feature representations directly by aligning the latent Gaussian distributions of the data from different domains, so that the representations capture cross-domain information at multiple levels in both spaces. Experimental results show that CDRA achieves an average face recognition accuracy of 97.2% on the Multi-Pie dataset and 99.4% ± 0.2% on the CASIA NIR-VIS 2.0 dataset, while also enabling efficient mutual generation of faces across domains. CDRA learns more discriminative cross-domain correlation information in the image space and the latent subspace, which effectively improves cross-domain face recognition accuracy.
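The three ingredients named in the abstract (intra-domain reconstruction, cross-domain reconstruction, and alignment of latent Gaussian distributions) can be sketched as a single objective. The abstract gives no formulas, so everything below is an assumption made for illustration: the function names, the mean-squared-error reconstruction terms, the weights `w_cross` and `w_align`, and the use of the squared 2-Wasserstein distance between diagonal Gaussians as the alignment measure. It is not the authors' implementation.

```python
def mse(x, y):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def w2_squared_diag_gaussians(mu_a, sigma_a, mu_b, sigma_b):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    For N(mu_a, diag(sigma_a^2)) and N(mu_b, diag(sigma_b^2)) this is
    ||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2.
    Assumption: the paper may use a different alignment measure.
    """
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mu_a, mu_b))
    std_term = sum((sa - sb) ** 2 for sa, sb in zip(sigma_a, sigma_b))
    return mean_term + std_term


def alignment_objective(x_a, x_b,
                        recon_aa, recon_bb,   # decoded from the same domain's latent code
                        recon_ba, recon_ab,   # decoded from the other domain's latent code
                        mu_a, sigma_a, mu_b, sigma_b,
                        w_cross=1.0, w_align=1.0):
    """Hypothetical combined loss for a paired sample (same identity, two domains).

    recon_ba is domain-A's image rebuilt from domain-B's latent code, so it is
    compared with x_a; recon_ab is compared with x_b.
    """
    intra = mse(recon_aa, x_a) + mse(recon_bb, x_b)      # intra-domain reconstruction
    cross = mse(recon_ba, x_a) + mse(recon_ab, x_b)      # indirect alignment in image space
    align = w2_squared_diag_gaussians(mu_a, sigma_a, mu_b, sigma_b)  # direct latent alignment
    return intra + w_cross * cross + w_align * align


# Toy check of the distance term: (3^2 + 4^2) + (0^2 + 1^2) = 26
print(w2_squared_diag_gaussians([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 2.0]))  # 26.0
```

In this sketch the cross-domain term can only be small if either domain's latent code suffices to rebuild the other domain's image, which is the sense in which image-space reconstruction aligns the representations indirectly; the distance term pulls the two latent distributions together directly.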
