Histogram of Fuzzy Local Spatio-Temporal Descriptors for Video Action Recognition

Zheming Zuo, Longzhi Yang, Yonghuai Liu, Fei Chao, Ran Song, Yanpeng Qu

Research output: Contribution to journal, Article (journal), peer-reviewed

15 Citations (Scopus)
96 Downloads (Pure)


Feature extraction plays a vital role in visual action
recognition. Many existing gradient-based feature extractors,
including histogram of oriented gradients (HOG), histogram of
optical flow (HOF), motion boundary histograms (MBH), and
histogram of motion gradients (HMG), build histograms for
representing different actions over the spatio-temporal domain
in a video. However, these methods require the number of
bins for information aggregation to be set in advance. Varying numbers
of bins usually lead to inherent uncertainty within the process
of pixel voting with regard to the bins in the histogram. This
paper proposes a novel method to handle such uncertainty by
fuzzifying these feature extractors. The proposed approach has
two advantages: i) it better represents the ambiguous boundaries
between the bins and thus the fuzziness of the spatio-temporal
visual information entailed in videos, and ii) the contribution
of each pixel is flexibly controlled by a fuzziness parameter for
various scenarios. The proposed family of fuzzy descriptors and
a combination of them were evaluated on two publicly available
datasets, demonstrating that the proposed approach outperforms
the original counterparts and other state-of-the-art methods.
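
The idea of fuzzified pixel voting can be illustrated with a minimal sketch. The abstract does not specify the membership function, so this example assumes a fuzzy c-means style membership over bin centres; the function name `fuzzy_orientation_histogram` and the parameters `n_bins` and `m` are hypothetical and are not taken from the paper.

```python
import numpy as np

def fuzzy_orientation_histogram(angles, magnitudes, n_bins=8, m=2.0):
    """Soft-vote orientations into n_bins bins (illustrative sketch only).

    angles: orientations in radians; magnitudes: non-negative vote weights.
    m > 1 is an assumed fuzziness parameter: values close to 1 approach
    hard voting, larger values spread each vote over more bins.
    """
    two_pi = 2.0 * np.pi
    angles = np.mod(np.asarray(angles, dtype=float).ravel(), two_pi)
    magnitudes = np.asarray(magnitudes, dtype=float).ravel()

    # Bin centres evenly spaced on the circle.
    centers = (np.arange(n_bins) + 0.5) * (two_pi / n_bins)

    # Circular distance between every angle and every bin centre.
    diff = np.abs(angles[:, None] - centers[None, :])
    dist = np.minimum(diff, two_pi - diff)
    dist = np.maximum(dist, 1e-12)  # guard against division by zero

    # Assumed fuzzy c-means style membership:
    #   u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1))
    # Distances are scaled by the row minimum first to avoid overflow.
    ratio = dist / dist.min(axis=1, keepdims=True)
    inv = ratio ** (-2.0 / (m - 1.0))
    memberships = inv / inv.sum(axis=1, keepdims=True)

    # Each pixel contributes its magnitude, split across bins by membership.
    return (memberships * magnitudes[:, None]).sum(axis=0)

# Example: an orientation on the boundary between the first two bins
# splits its vote equally between them instead of choosing one.
hist = fuzzy_orientation_histogram([np.pi / 4], [1.0], n_bins=8, m=2.0)
```

Because each pixel's memberships sum to one, the total histogram mass equals the sum of the magnitudes, as it would with hard voting; only the distribution across neighbouring bins changes.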
Original language: English
Article number: 8919994
Pages (from-to): 4059-4067
Number of pages: 9
Journal: IEEE Transactions on Industrial Informatics
Issue number: 6
Early online date: 3 Dec 2019
Publication status: Published - Jun 2020


Keywords

  • Video feature extraction
  • histogram
  • local feature descriptors
  • fuzziness
  • action recognition

Research Groups

  • Visual Computing Lab

