Deep CNN, Body Pose and Body-Object Interaction Features for Drivers' Activity Monitoring


Research output: Contribution to journal › Article



Automatic recognition and prediction of in-vehicle human activities has a significant impact on the next generation of driver assistance systems and intelligent autonomous vehicles. In this paper, we present a novel single-image driver action recognition algorithm inspired by human perception, which often focuses selectively on parts of an image to acquire information at specific places that are distinctive for a given task. Unlike existing approaches, we argue that human activity is a combination of pose and semantic contextual cues. Specifically, we model this by considering the configuration of body joints and their interaction with objects, represented as pairwise relations that capture the structural information. Our body-pose and body-object interaction representation is built to be semantically rich and meaningful, and is highly discriminative even when coupled with a basic linear SVM classifier. We also propose a Multi-stream Deep Fusion Network (MDFN) for combining high-level semantics with CNN features. Our experimental results demonstrate that the proposed approach significantly improves drivers’ action recognition accuracy on two challenging datasets.
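
As a rough illustration of the pairwise-relation idea described in the abstract, the sketch below turns a set of body-joint and object positions into a single feature vector. It is only a minimal sketch under stated assumptions, not the paper's actual representation: the function name `pairwise_relation_features`, the `(x, y)` input format, and the choice of normalised displacement and distance as the relation are all hypothetical.

```python
import math
from itertools import combinations


def pairwise_relation_features(joints, objects):
    """Encode every pair of keypoints as (dx, dy, distance).

    joints:  list of (x, y) body-joint coordinates (assumed input format)
    objects: list of (x, y) object centres (assumed input format)
    """
    points = list(joints) + list(objects)
    pairs = list(combinations(points, 2))
    dists = [math.dist(p, q) for p, q in pairs]
    # Divide by the largest pairwise distance so the features do not
    # depend on image scale; fall back to 1.0 for degenerate input.
    scale = max(dists, default=0.0) or 1.0
    features = []
    for (p, q), d in zip(pairs, dists):
        features.extend([(q[0] - p[0]) / scale, (q[1] - p[1]) / scale, d / scale])
    return features


# Two joints and one object centre give three pairs, so nine values.
example = pairwise_relation_features([(0, 0), (3, 4)], [(6, 8)])
print(len(example))
```

A vector of this kind could then be passed to a linear classifier such as the linear SVM mentioned in the abstract; how the authors actually construct and fuse their features is not specified here.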
Original language: English
Journal: IEEE Transactions on Intelligent Transportation Systems
Early online date: 12 Oct 2020
Publication status: Published - 12 Oct 2020


Keywords

  • Transfer Learning
  • Intelligent Vehicles
  • Deep Learning
  • CNN
  • Body pose
  • Autonomous Vehicles
  • In-vehicle Activity Monitoring
  • Neural network-based fusion
