3D Convolutional Neural Networks Based Speaker Identification and Authentication

Abstract

Research shows that human lips can serve as a new kind of biometric for personal identification and authentication. In this letter, we propose a novel end-to-end method based on a 3D convolutional neural network (3DCNN) that extracts discriminative spatiotemporal features from raw lip video streams. In our approach, the lip video is first divided into a series of overlapping clips. For each clip, a lip-characteristics network characterizes the minutiae of the lip region and its movement. Finally, the entire lip video is represented by the set of sub-features corresponding to its clips. Experiments on a dataset of 200 speakers show that the proposed method achieves a high identification accuracy of 99.18% and a very low authentication error (half total error rate, HTER, of 0.15%). Compared with several state-of-the-art methods, our approach achieves better performance and higher robustness against variations in speaker pose and position.
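
The clip division and the HTER metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_into_clips` and `hter`, the clip length of 16 frames, the stride of 8 frames, and the frame size are all assumptions chosen for illustration, since the abstract does not state them. HTER is computed using its standard definition, the mean of the false acceptance rate (FAR) and the false rejection rate (FRR).

```python
import numpy as np


def split_into_clips(video, clip_len=16, stride=8):
    """Split a (T, H, W) frame array into overlapping clips.

    clip_len and stride are illustrative assumptions, not values from the paper.
    A stride smaller than clip_len makes consecutive clips overlap.
    """
    num_frames = video.shape[0]
    if num_frames < clip_len:
        raise ValueError("video is shorter than one clip")
    starts = range(0, num_frames - clip_len + 1, stride)
    # Result has shape (num_clips, clip_len, H, W).
    return np.stack([video[s:s + clip_len] for s in starts])


def hter(far, frr):
    """Half total error rate: the mean of FAR and FRR."""
    return 0.5 * (far + frr)


# Hypothetical 40-frame grayscale lip video with 32x48 frames.
video = np.zeros((40, 32, 48), dtype=np.float32)
clips = split_into_clips(video)
print(clips.shape)  # (4, 16, 32, 48)
```

In a pipeline of the kind the abstract describes, each clip would then be passed through the network to obtain one sub-feature, and the set of sub-features would represent the whole video.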

In Proceedings of IEEE International Conference on Image Processing 2018
廖建国
Graduate
王士林
Professor
张兴璇
Graduate