TY - JOUR
T1 - An intelligent correlation learning system for person Re-identification
AU - Khan, Samee Ullah
AU - Khan, Noman
AU - Hussain, Tanveer
AU - Baik, Sung Wook
N1 - Publisher Copyright: © 2023 Elsevier Ltd
PY - 2024/1
Y1 - 2024/1
AB - Person re-identification (PRe-id) aims to retrieve images of a target person captured across one or more non-overlapping cameras. To this end, many techniques have been proposed that extract handcrafted, deep, part-based, and ensemble features to obtain more refined patterns for matching. However, because they pay limited attention to multi-grained, view-consistent, and semantic correlation among different views, these approaches show low performance. We therefore present an attention-based multi-view correlation learning framework, named ACLS, which learns multi-grained spatiotemporal features from individuals. ACLS consists of three key steps. First, multi-view correlated visual features of pedestrians are extracted using a correlation vision transformer (CVIT) and a pyramid dilated network (PDN), followed by a person attention mechanism. Next, convolutional long short-term memory (ConvLSTM) is employed to extract spatiotemporal information from pedestrian images captured in different time frames. Finally, a deep fusion strategy integrates the features for the final matching task. Extensive evaluations are conducted on three widely used datasets: Market-1501, DukeMTMC-reID, and CUHK03. The results show ranking performance of 93.7%, 90.4%, and 85.7%. We conclude that our learning mechanism outperforms current state-of-the-art (SOTA) methods.
KW - Artificial intelligence
KW - Big data
KW - Computer vision
KW - Correlation learning
KW - Dilated convolutional network
KW - Image retrieval
KW - Information fusion
KW - Intelligent system
KW - Pedestrian tracking
KW - Person re-identification
UR - http://www.scopus.com/inward/record.url?scp=85177885810&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85177885810&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/39b2e031-0930-3359-bfef-d82035ca57a0/
U2 - 10.1016/j.engappai.2023.107213
DO - 10.1016/j.engappai.2023.107213
M3 - Article (journal)
AN - SCOPUS:85177885810
SN - 0952-1976
VL - 128
SP - 1
EP - 15
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 107213
ER -