conference

A Transformer-based Model for Sentence-Level Chinese Mandarin Lipreading

Lipreading is a task that converts silent speaker video into its speech content, which has practical value in many scenarios. However, …

马诗慧, 王士林

Feature Extraction for Visual Speaker Authentication Against Computer-Generated Video Attacks

Recent research shows that the lip feature can achieve reliable authentication performance with a good liveness detection ability. …

马骏, 王士林

Speaker-Independent Lipreading with Limited Data

Recent research has demonstrated that with a huge annotated training dataset, some sophisticated automatic lipreading methods …

杨晨照, 张兴璇, 王士林

Visual Speech Recognition in Natural Scenes Based on Spatial Transformer Networks

In this paper, we improve the performance of visual speech recognition in natural scenes based on spatial transformer networks. Visual …

余金, 王士林

Lip Image Segmentation in Mobile Devices Based on Alternative Knowledge Distillation

Lip image segmentation, as the first step in many lip-related tasks (e.g. automatic lipreading), is of vital significance for the …

管成, 王士林

Spatio-Temporal Fusion based Convolutional Sequence Learning for Lip Reading

Current state-of-the-art approaches for lip reading are based on sequence-to-sequence architectures that are designed for natural …

张兴璇, 王士林

Visual Speaker Authentication by a CNN-Based Scheme with Discriminative Segment Analysis

Recent research shows that the static and dynamic features of a lip utterance contain abundant identity-related information. In this …

孙佳慧, 王士林

3D Convolutional Neural Networks Based Speaker Identification and Authentication

Research shows that human lips can be used as a new kind of biometrics in personal identification and authentication. In this letter, a …

廖建国, 王士林, 张兴璇