Attention over Attention: An Enhanced Supervised Video Summarization Approach

Isha Puthige, T. Hussain, Suneet Gupta, M. Agarwal

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding (ISBN) › peer-review

6 Citations (Scopus)

Abstract

Video summarization is an actively explored research area that has recently attracted considerable interest from researchers worldwide. It involves extracting important frames from a large input video so that the important events are condensed into a much shorter yet comprehensive summary. Toward effective video summarization, this paper proposes a novel methodology based on an attention-over-attention (AoA) strategy. The proposed deep model is built from multiple attention modules, including spatial, channel, and multi-headed attention. AoA allows the model to capture inter-spatial and inter-channel relationships between features effectively. The proposed attention module is applied to existing feature sets from video summarization datasets. Applying attention progressively highlights the most important content in the input frames, thereby producing more effective key frames. Several ablation studies were also performed to analyze the placement of spatial and channel attention and to determine the best architecture experimentally, alongside theoretical proofs of the AoA architecture. In the experiments, two benchmark datasets were used, and performance was compared with existing methods.
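The abstract names three attention types (channel, spatial, and multi-headed) applied progressively to frame features. The following is a minimal NumPy sketch of that general idea, not the authors' architecture: the feature shape `(T, C, H, W)`, the channel-then-spatial ordering, the parameter-free spatial gate, the random weights, and all function and parameter names are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feats, w1, w2):
    """Gate each channel using a small MLP over spatially pooled features."""
    pooled = feats.mean(axis=(2, 3))                    # (T, C)
    gate = sigmoid(np.maximum(pooled @ w1, 0.0) @ w2)   # (T, C)
    return feats * gate[:, :, None, None]

def spatial_attention(feats):
    """Gate each location using channel-wise mean and max (simplified, no conv)."""
    avg = feats.mean(axis=1, keepdims=True)             # (T, 1, H, W)
    mx = feats.max(axis=1, keepdims=True)               # (T, 1, H, W)
    return feats * sigmoid(avg + mx)

def multi_head_self_attention(x, wq, wk, wv, num_heads):
    """Scaled dot-product self-attention across the T frames."""
    T, D = x.shape
    d = D // num_heads
    def split(w):
        return (x @ w).reshape(T, num_heads, d).transpose(1, 0, 2)  # (h, T, d)
    q, k, v = split(wq), split(wk), split(wv)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d), axis=-1)  # (h, T, T)
    return (weights @ v).transpose(1, 0, 2).reshape(T, D)

def frame_scores(feats, p, num_heads=4):
    """Apply attention stages in sequence, then score each frame in [0, 1]."""
    x = channel_attention(feats, p["w1"], p["w2"])  # assumed order: channel first
    x = spatial_attention(x)
    x = x.mean(axis=(2, 3))                         # pool to one vector per frame
    x = multi_head_self_attention(x, p["wq"], p["wk"], p["wv"], num_heads)
    return sigmoid(x @ p["w_out"])                  # (T,)

# Toy data: 8 frames, 16 channels, 4x4 feature maps (hypothetical sizes).
rng = np.random.default_rng(0)
T, C, H, W = 8, 16, 4, 4
params = {
    "w1": 0.1 * rng.standard_normal((C, C // 4)),
    "w2": 0.1 * rng.standard_normal((C // 4, C)),
    "wq": 0.1 * rng.standard_normal((C, C)),
    "wk": 0.1 * rng.standard_normal((C, C)),
    "wv": 0.1 * rng.standard_normal((C, C)),
    "w_out": 0.1 * rng.standard_normal(C),
}
feats = rng.standard_normal((T, C, H, W))
scores = frame_scores(feats, params)
key_frames = np.argsort(scores)[::-1][:3]  # indices of the 3 highest-scoring frames
```

In a trained model the weights would be learned from annotated importance scores rather than drawn at random, and the abstract indicates that the relative position of the spatial and channel modules was itself chosen through ablation.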
Original language: English
Title of host publication: Procedia Computer Science
Editors: Vijendra Singh
Publisher: Science Direct
Pages: 2359–2368
Number of pages: 10
Volume: 218
ISBN (Electronic): 1877-0509
DOIs
Publication status: Published - 31 Dec 2022

Publication series

Name: Procedia Computer Science
Publisher: Elsevier BV
ISSN (Print): 1877-0509

Keywords

  • Summarization of video
  • Deep Learning
  • CNN
  • Attention Module
