Abstract
We present a novel method for real-time activity recognition that explores semantically meaningful visual information and identifies the discriminative spatiotemporal relationships within it. Our approach infers human activities from continuous egocentric (first-person-view) videos of object manipulations in an industrial setup. To this end, we propose a random forest that unifies randomization, discriminative relationship mining and a Markov temporal structure. Discriminative relationship mining models the relations that distinguish different activities, while randomization handles the large feature space and prevents over-fitting. The Markov temporal structure provides temporally consistent decisions during testing. The proposed random forest is built from discriminative Markov decision trees, in which every nonterminal node is a discriminative classifier and the Markov structure is applied at the leaf nodes. The proposed approach outperforms state-of-the-art methods on a new, challenging video dataset of pump-system assembly.
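To illustrate why a Markov temporal structure yields temporally consistent decisions, the following is a minimal sketch of a generic first-order Markov forward filter applied to per-frame class probabilities. It is not the paper's method, which applies the Markov structure at the leaf nodes of its decision trees. The function name `markov_smooth`, the `stay_prob` parameter and the uniform switching model are all assumptions made for illustration.

```python
def markov_smooth(frame_probs, stay_prob=0.9):
    """Forward-filter per-frame class probabilities with a first-order Markov prior.

    frame_probs: list of per-frame probability lists, e.g. averaged tree votes
                 (hypothetical input format; at least two classes are assumed).
    stay_prob:   assumed probability of staying in the same activity between
                 consecutive frames (illustrative value, not from the paper).
    Returns one filtered class index per frame.
    """
    n_classes = len(frame_probs[0])
    # Assumption: the remaining probability mass is spread evenly over the other classes.
    switch_prob = (1.0 - stay_prob) / (n_classes - 1)
    belief = [1.0 / n_classes] * n_classes  # uniform prior before the first frame
    decisions = []
    for probs in frame_probs:
        # Predict step: push the previous belief through the transition model.
        predicted = [
            sum(belief[j] * (stay_prob if i == j else switch_prob)
                for j in range(n_classes))
            for i in range(n_classes)
        ]
        # Update step: weight the prediction by the current frame's evidence.
        unnorm = [predicted[i] * probs[i] for i in range(n_classes)]
        total = sum(unnorm)
        # Fall back to the prediction if the frame evidence is all zeros.
        belief = [u / total for u in unnorm] if total > 0 else predicted
        decisions.append(max(range(n_classes), key=belief.__getitem__))
    return decisions
```

With two classes and per-frame probabilities `[[0.9, 0.1], [0.8, 0.2], [0.45, 0.55], [0.9, 0.1]]`, a frame-by-frame argmax would briefly switch to the second class at the third frame, whereas the filtered decisions stay on the first class throughout, because a single weakly contradicting frame does not outweigh the accumulated temporal evidence.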
Original language | English |
---|---|
Pages | 1-13 |
Publication status | Published - 1 Sept 2014 |
Event | 25th British Machine Vision Conference, Nottingham, United Kingdom (1 Sept 2014 → 5 Sept 2014) |
Conference
Conference | 25th British Machine Vision Conference |
---|---|
Country/Territory | United Kingdom |
City | Nottingham |
Period | 1/09/14 → 5/09/14 |