Abstract
Feature extraction plays a vital role in visual action
recognition. Many existing gradient-based feature extractors,
including histogram of oriented gradients (HOG), histogram of
optical flow (HOF), motion boundary histograms (MBH), and
histogram of motion gradients (HMG), build histograms for
representing different actions over the spatio-temporal domain
in a video. However, these methods require the number of
bins for information aggregation to be set in advance. Varying numbers
of bins usually lead to inherent uncertainty within the process
of pixel voting with regard to the bins in the histogram. This
paper proposes a novel method to handle such uncertainty by
fuzzifying these feature extractors. The proposed approach has
two advantages: i) it better represents the ambiguous boundaries
between the bins and thus the fuzziness of the spatio-temporal
visual information entailed in videos, and ii) the contribution
of each pixel is flexibly controlled by a fuzziness parameter for
various scenarios. The proposed family of fuzzy descriptors and
a combination of them were evaluated on two publicly available
datasets, demonstrating that the proposed approach outperforms
the original counterparts and other state-of-the-art methods.
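
As a rough illustration of what fuzzifying the pixel voting of a histogram descriptor could look like, the sketch below soft-votes gradient orientations into bins. It is not the paper's formulation: the abstract does not give the membership function, so the fuzzy c-means style membership used here, the function name `fuzzy_orientation_histogram`, and the parameters `n_bins` and `m` (standing in for the fuzziness parameter) are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_orientation_histogram(angles, magnitudes, n_bins=8, m=2.0, eps=1e-9):
    """Soft-vote orientations into bins (hypothetical sketch, not the paper's method).

    angles     : gradient orientations in radians
    magnitudes : gradient magnitudes, same length as angles
    n_bins     : number of histogram bins over [0, 2*pi)
    m          : assumed fuzziness parameter, m > 1; larger m spreads each vote wider
    """
    if m <= 1.0:
        raise ValueError("m must be greater than 1")
    angles = np.ravel(np.asarray(angles, dtype=float)) % (2 * np.pi)
    magnitudes = np.ravel(np.asarray(magnitudes, dtype=float))

    bin_width = 2 * np.pi / n_bins
    centers = (np.arange(n_bins) + 0.5) * bin_width

    # Circular distance from every pixel orientation to every bin centre.
    diff = np.abs(angles[:, None] - centers[None, :])
    dist = np.minimum(diff, 2 * np.pi - diff) + eps

    # Fuzzy c-means style membership (an assumption). Distances are scaled by
    # the nearest one first so the power cannot overflow when m is close to 1.
    ratio = dist / dist.min(axis=1, keepdims=True)
    weights = ratio ** (-2.0 / (m - 1.0))
    memberships = weights / weights.sum(axis=1, keepdims=True)

    # Each pixel contributes its magnitude, split across bins by membership.
    return (memberships * magnitudes[:, None]).sum(axis=0)
```

Under these assumptions, a pixel whose orientation lies on the boundary between two bins splits its vote evenly between them instead of being forced into one, and each pixel's memberships sum to one, so the total histogram mass equals the total gradient magnitude.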
Original language | English |
---|---|
Article number | 8919994 |
Pages (from-to) | 4059-4067 |
Number of pages | 9 |
Journal | IEEE Transactions on Industrial Informatics |
Volume | 16 |
Issue number | 6 |
Early online date | 3 Dec 2019 |
DOIs | |
Publication status | Published - Jun 2020 |
Keywords
- Video feature extraction
- histogram
- local feature descriptors
- fuzziness
- action recognition
Research Groups
- Visual Computing Lab