Attentional Learn-able Pooling for Human Activity Recognition


Research output: Contribution to journal, Conference proceeding article (ISSN), peer-review



Human activity/behaviour monitoring and recognition is key to facilitating human-robot interaction and allows robots to better schedule future operations. It is challenging and is often addressed at different levels, such as human activity classification, future activity prediction and monitoring of ongoing activities. This paper proposes a novel attention-based learn-able pooling mechanism for human activity classification from RGB videos. Most of the recent best-performing human activity recognition approaches are based on 3D skeleton positions. However, 3D skeleton positions are not always available in videos captured using RGB cameras, which are widely used in robotics applications. RGB videos contain rich spatio-temporal information, and processing them semantically is a difficult task. Moreover, accurately capturing spatial information and long-term temporal dependencies is key to achieving high recognition accuracy. We use an existing Convolutional Neural Network for image recognition to extract video features, which are then processed using our innovative application of an attention mechanism to focus the network on the features that are more important for discrimination. Afterwards, we use a novel learn-able pooling mechanism to extract activity-aware spatio-temporal cues for efficient activity recognition. The proposed pooling mechanism learns the structural information from the hidden states of a bidirectional Long Short-Term Memory network via Fisher Vectors.
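
The attention step described in the abstract can be illustrated with a minimal sketch of soft attention pooling over per-frame features. This is not the authors' implementation: the function names, the scoring vector `w`, and the toy feature values are all assumptions made for illustration, and the sketch omits the bidirectional LSTM and Fisher Vector stages entirely. It only shows how a set of per-frame feature vectors might be collapsed into one vector using learned relevance weights.

```python
import math

def softmax(scores):
    # Subtract the maximum score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, w):
    """Pool T per-frame feature vectors (each of length D) into one vector.

    `w` is a hypothetical scoring vector of length D; in a trained network
    it would be learned, here it is supplied by the caller.
    """
    # One relevance score per frame: the dot product of the frame with w.
    scores = [sum(f * v for f, v in zip(frame, w)) for frame in features]
    # Normalise the scores into non-negative weights that sum to 1.
    weights = softmax(scores)
    # Attention-weighted average of the frames, computed per dimension.
    dim = len(features[0])
    pooled = [
        sum(weights[t] * features[t][d] for t in range(len(features)))
        for d in range(dim)
    ]
    return pooled, weights

# Toy stand-in for CNN features of three frames (assumed values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# With a zero scoring vector every frame gets equal weight (plain averaging).
pooled_uniform, weights_uniform = attention_pool(features, [0.0, 0.0])

# A scoring vector that favours the first dimension down-weights frame 2.
pooled_focus, weights_focus = attention_pool(features, [10.0, 0.0])
```

With a zero scoring vector the result reduces to ordinary average pooling, which is a convenient sanity check; a non-zero `w` shifts weight towards the frames it scores highly.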
Original language: English
Journal: Proceedings - IEEE International Conference on Robotics and Automation
Publication status: Published - 5 Jun 2021
Event: IEEE International Conference on Robotics and Automation - , China
Duration: 30 May 2021 to 5 Jun 2021


Keywords

  • Human-Robot Interaction
  • Human Activity Recognition
  • Deep Learning
  • Convolutional Neural Network
  • Attentional pooling
  • Bi-directional LSTM
  • Fisher Vectors
  • Activity-Aware Pooling

Research Institutes

  • Health Research Institute

Research Centres

  • Centre for Intelligent Visual Computing Research
  • Data and Complex Systems Research Centre
  • Data Science STEM Research Centre


