Deep CNN, Body Pose and Body-Object Interaction Features for Drivers' Activity Monitoring


Research output: Contribution to journal; Article (journal); peer-review



Automatic recognition and prediction of in-vehicle human activities has a significant impact on the next generation of driver assistance systems and intelligent autonomous vehicles. In this paper, we present a novel single-image driver action recognition algorithm inspired by human perception, which often focuses selectively on parts of an image to acquire task-specific information at particular locations. Unlike existing approaches, we argue that human activity is a combination of pose and semantic contextual cues. Specifically, we model this by considering the configuration of body joints and their interactions with objects, represented as pairwise relations that capture structural information. Our body-pose and body-object interaction representation is built to be semantically rich and meaningful, and it is highly discriminative even when coupled with a basic linear SVM classifier. We also propose a Multi-stream Deep Fusion Network (MDFN) for combining high-level semantics with CNN features. Our experimental results demonstrate that the proposed approach significantly improves drivers’ action recognition accuracy on two challenging datasets.
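The pairwise body-pose and body-object relations mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `pairwise_relation_features`, the use of 2D coordinates, the choice of Euclidean distance as the pairwise relation, and the bounding-box normalisation are all assumptions made here for illustration only.

```python
import numpy as np


def pairwise_relation_features(joints, object_centers):
    """Hypothetical pose and body-object descriptor (illustrative assumption).

    joints: (J, 2) array-like of 2D body-joint coordinates.
    object_centers: (K, 2) array-like of 2D object-centre coordinates.
    Returns a 1D vector: joint-joint distances followed by joint-object distances.
    """
    joints = np.asarray(joints, dtype=float).reshape(-1, 2)
    object_centers = np.asarray(object_centers, dtype=float).reshape(-1, 2)

    # Assumed normalisation: divide by the diagonal of the joints' bounding box
    # so the features do not depend on how large the person appears in the image.
    scale = np.linalg.norm(joints.max(axis=0) - joints.min(axis=0))
    if scale == 0:
        scale = 1.0

    # Joint-joint relations: distance for each unordered pair of joints.
    dist_jj = np.linalg.norm(joints[:, None, :] - joints[None, :, :], axis=-1)
    upper = np.triu_indices(len(joints), k=1)
    pose_part = dist_jj[upper] / scale

    # Joint-object relations: distance from every joint to every object centre.
    dist_jo = np.linalg.norm(
        joints[:, None, :] - object_centers[None, :, :], axis=-1
    )
    interaction_part = dist_jo.ravel() / scale

    return np.concatenate([pose_part, interaction_part])


# Toy input: three joints and one object (for example, a phone).
features = pairwise_relation_features(
    joints=[[0, 0], [3, 4], [3, 0]],
    object_centers=[[0, 4]],
)
```

A fixed-length vector of this kind could then be passed to a linear classifier, such as the linear SVM the abstract mentions. The features actually used in the paper may differ substantially from this sketch.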
Original language: English
Pages (from-to): 1-8
Number of pages: 8
Journal: IEEE Transactions on Intelligent Transportation Systems
Early online date: 12 Oct 2020
Publication status: Published - 12 Oct 2020


  • Transfer Learning
  • Intelligent Vehicles
  • Deep Learning
  • CNN
  • Body pose
  • Autonomous Vehicles
  • In-vehicle Activity Monitoring
  • Neural network-based fusion

