Abstract
Multiview video summarization (MVS) has received little attention from the research community, owing to challenges such as inter-view correlations and overlapping views. Most previous MVS methods are offline, rely only on the summary, and require additional communication bandwidth and transmission time, with no focus on foggy environments. We propose an edge intelligence-based MVS and activity recognition framework that combines artificial intelligence with Internet of Things (IoT) devices. In our framework, resource-constrained devices with cameras use a lightweight CNN-based object detection model to segment multiview videos into shots, followed by mutual information computation that helps in summary generation. Our system does not rely solely on a summary, but encodes and transmits it to a master device using a neural computing stick for inter-view correlation computation and efficient activity recognition, an approach that saves computation resources, communication bandwidth, and transmission time. Experiments show an increase of 0.4 units in F-measure on an MVS Office dataset and accuracy improvements of 0.2% and 2% on the UCF-50 and YouTube 11 datasets, respectively, with lower storage requirements and transmission times. The processing time for a single frame is reduced from 1.23 to 0.45 s, and MVS is at best 0.75 s faster. A new dataset is constructed by synthetically adding fog to an MVS dataset to show the adaptability of our system to both certain and uncertain IoT surveillance environments.
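The abstract does not specify how mutual information is computed or how it feeds into summary generation. As a rough illustration only, the sketch below estimates mutual information between two grayscale frames from a joint intensity histogram and keeps a frame when it shares little information with the last kept frame. The function names, the flat-list frame representation, the bin count, and the threshold are all assumptions made for this example, not details taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(frame_a, frame_b, bins=32):
    """Estimate mutual information (in bits) between two grayscale frames.

    Assumption: each frame is a flat sequence of 8-bit intensities (0-255)
    and both frames have the same number of pixels.
    """
    if len(frame_a) != len(frame_b) or len(frame_a) == 0:
        raise ValueError("frames must be non-empty and equally sized")
    # Quantize intensities into `bins` levels to keep the histogram small.
    qa = [p * bins // 256 for p in frame_a]
    qb = [p * bins // 256 for p in frame_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))  # joint histogram of co-located pixels
    count_a = Counter(qa)         # marginal histogram of frame_a
    count_b = Counter(qb)         # marginal histogram of frame_b
    # MI = sum over (x, y) of p(x, y) * log2(p(x, y) / (p(x) * p(y)))
    return sum(
        (c / n) * log2(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def select_keyframes(frames, threshold=1.0):
    """Hypothetical selection rule: keep a frame when its mutual information
    with the most recently kept frame drops below `threshold`."""
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mutual_information(frames[kept[-1]], frames[i]) < threshold:
            kept.append(i)
    return kept
```

Under these assumptions, a frame compared with itself yields its own (quantized) entropy, while a frame compared with an unrelated or constant frame yields a value near zero, so a low score marks a candidate for the summary.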
Original language | English |
---|---|
Article number | 9208765 |
Pages (from-to) | 9634-9644 |
Number of pages | 11 |
Journal | IEEE Internet of Things Journal |
Volume | 8 |
Issue number | 12 |
Early online date | 29 Sept 2020 |
Publication status | Published - 7 Jun 2021 |
Keywords
- Activity recognition
- Internet of Things (IoT)
- deep autoencoder
- deep learning
- multiview video summarization (MVS)
- sequential learning
- video data analytics
- video summarization