This article presents a framework for fast video matching and retrieval based on detecting and measuring visual similarity. The framework’s efficiency stems from its ability to encode the content of a given shot into a compact fixed-length signature that supports robust real-time matching. Separate scene and motion signatures are developed and then fused to fully represent and match video shots. Scene information is captured through the Statistical Dominant Color Profile (SDCP), while motion information is captured through a graph-based signature called the Dominant Color Graph Profile (DCGP). The SDCP is a compact fixed-length signature that statistically encodes the spatiotemporal patterns of colors across video frames. The DCGP is a fixed-length signature that records and tracks the gray levels across subsampled video frames; its values are extracted from the structural properties of the resulting graph. The overall video signature is generated by fusing the individual scene and motion signatures. This signature-based design is the key to the framework’s high matching speed (> 2000 fps) compared with current techniques that rely on exhaustive processing. Compressed-domain videos are used as a case study because of their wide availability. The framework avoids full video decompression and operates on tiny frames rather than full-size decompressed frames. Experiments on various standard and challenging dataset groups show that the framework performs robustly in terms of both retrieval quality and computational cost.
- Centre for Intelligent Visual Computing Research
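
The idea of matching shots through fused fixed-length signatures can be illustrated with a minimal sketch. It does not implement the SDCP or the DCGP, whose construction the abstract does not detail. As stand-ins, it assumes a scene signature made of per-channel color means and standard deviations, a motion signature made of gray-level change statistics, fusion by concatenation, and an L1 distance for matching. All function names and the toy input format (one dominant color and one gray level per frame) are hypothetical.

```python
from statistics import mean, pstdev


def scene_signature(colors):
    """Stand-in scene signature (NOT the SDCP): mean and standard deviation
    of each color channel over time. `colors` holds one (r, g, b) tuple per
    frame. The result always has 6 values."""
    signature = []
    for channel in zip(*colors):
        signature.extend([mean(channel), pstdev(channel)])
    return signature


def motion_signature(gray_levels):
    """Stand-in motion signature (NOT the DCGP): mean and standard deviation
    of absolute frame-to-frame gray-level changes. Always 2 values."""
    diffs = [abs(b - a) for a, b in zip(gray_levels, gray_levels[1:])]
    if not diffs:  # a single-frame shot has no motion to measure
        return [0.0, 0.0]
    return [mean(diffs), pstdev(diffs)]


def shot_signature(colors, gray_levels):
    """Fuse scene and motion signatures by concatenation (an assumption)."""
    return scene_signature(colors) + motion_signature(gray_levels)


def l1_distance(a, b):
    """Sum of absolute differences between two equal-length signatures."""
    return sum(abs(x - y) for x, y in zip(a, b))


def retrieve(query, database):
    """Return the name of the stored shot whose signature is closest."""
    return min(database, key=lambda name: l1_distance(query, database[name]))


# Toy database: two shots of different lengths, same signature length.
database = {
    "shot_a": shot_signature([(200, 10, 10), (210, 12, 8), (205, 11, 9)],
                             [50, 52, 51]),
    "shot_b": shot_signature([(10, 10, 200), (12, 8, 210)], [100, 140]),
}

# A query shot that resembles shot_a but has a different number of frames.
query = shot_signature([(202, 10, 10), (208, 12, 9), (204, 11, 9), (206, 11, 10)],
                       [50, 51, 52, 51])
best_match = retrieve(query, database)
```

Because every shot maps to a vector of the same length regardless of its frame count, each comparison costs a fixed handful of arithmetic operations, which is the general property that makes signature-based matching fast relative to frame-by-frame comparison.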