Abstract
Movie data plays a prominent role in the exponential growth of multimedia data on the Internet, and its analysis has become a hot topic in computer vision. The initial step in movie analysis is scene segmentation. In this article, we investigate this problem through a novel, intelligent Convolutional Neural Network (CNN) based three-fold framework. The first fold segments the input movie into shots, the second fold detects objects in the segmented shots, and the third fold performs object-based shot matching to detect scene boundaries. Texture and shape features are fused for shot segmentation, and each shot is represented by the set of detected objects obtained from a lightweight CNN model. Finally, we apply set theory with a sliding-window approach to merge similar shots and decide scene boundaries. The experimental evaluation indicates that the proposed approach outperforms existing movie scene segmentation approaches.
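The third fold (set-based shot matching over a sliding window) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, window size, or decision rule, so the Jaccard overlap, the `window` and `threshold` parameters, and the function names below are all assumptions chosen for illustration. Each shot is assumed to be already reduced to a set of detected object labels.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two object sets (an assumed similarity measure)."""
    if not a and not b:
        return 1.0  # two empty shots are treated as identical
    return len(a & b) / len(a | b)


def scene_boundaries(
    shots: List[Set[str]], window: int = 2, threshold: float = 0.3
) -> List[int]:
    """Return the indices of shots that start a new scene.

    Hypothetical rule: shot i starts a new scene when its best overlap
    with the preceding `window` shots falls below `threshold`.
    """
    boundaries = []
    for i in range(1, len(shots)):
        previous = shots[max(0, i - window):i]
        best = max(jaccard(shots[i], p) for p in previous)
        if best < threshold:
            boundaries.append(i)
    return boundaries


# Made-up object sets, one per shot, purely for illustration.
shots = [
    {"person", "car"},
    {"person", "car", "dog"},
    {"car", "person"},
    {"sofa", "tv"},
    {"sofa", "tv", "person"},
    {"boat", "bird"},
]
print(scene_boundaries(shots))  # → [3, 5]
```

In this toy input, shots 0 to 2 share most objects and are grouped together, shot 3 shares nothing with its two predecessors and opens a new scene, and shot 5 does the same. Comparing against a window rather than only the previous shot keeps a single atypical shot from splitting a scene.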
Original language | English |
---|---|
Pages (from-to) | 1-8 |
Journal | International Journal of Distributed Sensor Networks |
Volume | 15 |
Issue number | 6 |
DOIs | |
Publication status | Published - 25 Jun 2019 |
Keywords
- movie analysis
- multi-level decision making
- scene segmentation
- shot segmentation
- information fusion
- object detection
- set theory