Main paper: Automatic facial expression recognition based on MobileNetV2 in Real-time
The original paper proposes a facial expression recognition (FER) system using MobileNetV2. Key techniques include:
- Face Extraction: The FaceBoxes algorithm is used to detect and crop faces, removing background noise.
- Two-Stage Fine-Tuning: MobileNetV2 is fine-tuned on FER datasets (FER2013, CK+, and JAFFE) to improve performance.
- Island Loss: A joint supervision method combining softmax and island loss enhances the model's ability to distinguish between emotions.
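
The joint supervision in the last item can be sketched in plain Python. This is a minimal illustration, assuming the commonly stated island loss formulation (center loss plus a pairwise cosine penalty between class centers); the function names and the weights `lam` and `lambda1` are hypothetical and are not taken from the paper.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample, computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def center_loss(features, labels, centers):
    """Half the summed squared distance of each feature to its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total

def cosine(u, v, eps=1e-12):
    """Cosine similarity; eps guards against zero-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def island_loss(features, labels, centers, lambda1=10.0):
    """Center loss plus a penalty that pushes different class centers apart.

    Each ordered pair of distinct centers contributes cos(c_j, c_k) + 1,
    which is 0 when the centers point in opposite directions and 2 when aligned.
    lambda1 is an assumed weight, not a value from the paper.
    """
    pairwise = 0.0
    for j, cj in enumerate(centers):
        for k, ck in enumerate(centers):
            if j != k:
                pairwise += cosine(cj, ck) + 1.0
    return center_loss(features, labels, centers) + lambda1 * pairwise

def joint_loss(logits_batch, features, labels, centers, lam=0.01, lambda1=10.0):
    """Mean softmax cross-entropy plus lam times the island loss (lam is assumed)."""
    ce = sum(softmax_cross_entropy(z, y) for z, y in zip(logits_batch, labels))
    ce /= len(labels)
    return ce + lam * island_loss(features, labels, centers, lambda1)
```

In a real training loop the class centers would be learnable and updated alongside the network weights, with `features` taken from the layer before the classifier; here they are passed in as plain lists to keep the sketch self-contained.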
The paper reports accuracies of 97.98% on FER2013, 91.44% on CK+, and 95.24% on JAFFE, outperforming several existing methods. At 3.87 ms per frame (about 258 frames per second), inference is fast enough for real-time applications. Our project follows this methodology for the facial module; the speech and fusion modules are developed independently.