Visual Speaker Authentication by a CNN-Based Scheme with Discriminative Segment Analysis

Abstract

Recent research shows that the static and dynamic features of a lip utterance contain abundant identity-related information. In this paper, a new deep convolutional neural network scheme is proposed. The entire lip utterance is first divided into a series of overlapping segments; an adaptive scheme then automatically examines the discriminative power of each segment and assigns it a corresponding weight within the utterance. The final authentication result for the entire utterance is determined by weighted voting over the results for all segments. In addition, to account for the varying lighting conditions found in natural environments, an illumination normalization procedure is proposed. Experimental results show that different segments of the same utterance have different discriminative power for user authentication, and that focusing on the discriminative details is more effective. The proposed method outperforms the two state-of-the-art lip authentication approaches investigated.
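
The segment-and-vote idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, segment length, stride, score range, and decision threshold are all assumptions made for illustration. In the paper, the per-segment scores would come from the CNN and the weights from the adaptive scheme, neither of which is reproduced here.

```python
import numpy as np

def split_into_segments(num_frames, seg_len, stride):
    """Return (start, end) frame indices of overlapping segments.

    Segments overlap whenever stride < seg_len (assumed layout).
    """
    if num_frames < seg_len:
        return [(0, num_frames)]
    return [(s, s + seg_len)
            for s in range(0, num_frames - seg_len + 1, stride)]

def weighted_vote(segment_scores, segment_weights, threshold=0.5):
    """Fuse per-segment accept scores in [0, 1] by a weighted average.

    The weights stand in for each segment's discriminative power;
    they are normalized to sum to 1 before fusion.
    """
    scores = np.asarray(segment_scores, dtype=float)
    weights = np.asarray(segment_weights, dtype=float)
    weights = weights / weights.sum()
    fused = float(np.dot(weights, scores))
    return fused, fused >= threshold

# Hypothetical example: a 10-frame utterance, 4-frame segments, stride 2.
segments = split_into_segments(10, 4, 2)
fused_score, accepted = weighted_vote([0.9, 0.2, 0.8, 0.7], [2, 1, 1, 1])
```

A segment with a larger weight pulls the fused score toward its own result, which is how a highly discriminative segment can dominate the final decision.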

In Proceedings of International Conference on Neural Information Processing 2019
孙佳慧
Graduate
王士林
Professor