MHAN: Multi-head hybrid attention network for facial expression recognition

  • Xiaofeng Wang*
  • Tianbo Han
  • Songliang Liu
  • Muhammad Shahroz Ajmal
  • Lu Chen
  • Yongqin Zhang
  • Yonghuai Liu
  • *Corresponding author for this work

Research output: Contribution to journal › Article (journal) › peer-review

Abstract

Integrating Facial Expression Recognition (FER) with deep learning techniques has
significantly enhanced emotion analysis performance in the past decade. Convolutional
neural networks (CNNs) and attention mechanisms facilitate the automatic extraction
of complex features from facial expressions. However, current methods often face
challenges in accurately capturing subtle variations in expressions, tend to be computationally
intensive, and are susceptible to overfitting. To address these challenges,
this paper proposes a lightweight FER model based on multi-head hybrid attention
networks (MHAN). It designs two innovative modules: efficient local attention mixed
feature network (ELA-MFN) and multi-head hybrid attention mechanism (MHAtt).
The former integrates multi-scale convolutional kernels with the ELA attention mechanism
to enhance feature representation while ensuring precise localization of critical
areas, all within a lightweight framework. The latter utilizes multiple attention
heads to generate attention maps and capture subtle distinctions in expressions. With
only 4.27M parameters (94% reduction from POSTER’s 71.8M), MHAN effectively
reduces computational resource requirements, and can be efficiently implemented for
both fully supervised and semi-supervised learning tasks. It also employs a smooth label loss function to mitigate overfitting. We have validated the effectiveness of
MHAN on three public datasets, RAF-DB, AffectNet, and FERPlus, including cross-dataset
tests. The results show that MHAN outperforms state-of-the-art models in
terms of accuracy and computational complexity, demonstrating improved robustness.
MHAN can also recognize the expressions of non-traditional datasets like sculptures,
validating its cross-domain generalization capabilities. The source code is available at
https://github.com/hanyao666/MHAN.
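The abstract attributes MHAN's resistance to overfitting to a smooth label loss function. The paper's exact formulation is not reproduced here, but the name suggests a label-smoothing cross-entropy, which can be sketched as follows (the smoothing factor `epsilon` and the uniform redistribution over non-target classes are assumptions, not details from the paper):

```python
import numpy as np

def smooth_label_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy against a smoothed one-hot target.

    With smoothing factor epsilon, the true class receives probability
    1 - epsilon and the remaining mass is spread uniformly over the
    other classes, discouraging over-confident predictions.
    """
    num_classes = logits.shape[-1]
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Smoothed target distribution.
    smooth = np.full(num_classes, epsilon / (num_classes - 1))
    smooth[target] = 1.0 - epsilon
    return float(-(smooth * log_probs).sum())
```

Setting `epsilon = 0` recovers the standard cross-entropy, so the smoothing strength can be tuned as a single hyperparameter.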
Original language: English
Article number: 112015
Pages (from-to): 1-12
Number of pages: 12
Journal: Pattern Recognition
Volume: 170
Issue number: 2026
Early online date: 3 Jul 2025
Publication status: Published - 28 Feb 2026

Keywords

  • Facial expression recognition
  • Efficient local attention
  • Multi-head hybrid attention
  • Smooth label loss function
  • Attention map
