A deep framework for automatic annotation with application to retail warehouses

Kanika Mahajan, Anima Majumder, Harika Nanduri, Swagat Kumar

Research output: Contribution to conference (Paper)

Abstract

The paper presents a novel deep learning framework for automatic annotation and segmentation of densely cluttered objects in a warehouse use case specified by the Amazon Robotics Challenge (ARC) 2017. The framework addresses two challenges of the competition: (1) reducing the manual labour involved in generating the large volume of annotated data needed to train a deep network, and (2) achieving good segmentation accuracy within a very limited training time (30 minutes). Both problems are addressed by a deep architecture comprising a Residual Network and a feature-pyramid-based convolutional neural network, which helps to retain primitive features along with the higher-level features obtained from each successive layer. In addition, a framework built on this network automatically generates a large annotated dataset with varying degrees of clutter; training on this machine-generated dataset then enables multi-class semantic segmentation. The proposed framework is shown to provide better segmentation accuracy with less training time than existing state-of-the-art architectures such as PSPNet and Mask R-CNN. The overall working of the proposed architecture is explained by creating a new dataset from the objects specified by the ARC competition. An extensive experiment is also performed using the MIT-Princeton database [22]. Our TCS-ARCDataset [13] is made available online for the convenience of readers.
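
The abstract does not say how the cluttered training images are produced. As an illustration only, and not the authors' method, the sketch below shows one common way to generate machine-annotated clutter: pasting masked object crops onto a background at random positions and recording a per-pixel class label map. The function name `compose_clutter`, its arguments, and the toy data are all hypothetical.

```python
import numpy as np


def compose_clutter(background, objects, rng):
    """Paste masked object crops onto a background at random positions.

    Hypothetical helper for illustration; not taken from the paper.
    background: (H, W, 3) uint8 image.
    objects: list of (crop, mask, class_id), where crop is (h, w, 3) uint8,
             mask is (h, w) bool, and class_id is a positive int.
    Returns the composite image and an (H, W) label map (0 = background).
    """
    image = background.copy()
    labels = np.zeros(background.shape[:2], dtype=np.int32)
    height, width = labels.shape
    for crop, mask, class_id in objects:
        h, w = mask.shape
        # Pick a top-left corner so that the crop fits inside the image.
        top = int(rng.integers(0, height - h + 1))
        left = int(rng.integers(0, width - w + 1))
        # Slices are views, so these writes modify image and labels in place.
        # Objects pasted later overwrite (occlude) earlier ones.
        image[top:top + h, left:left + w][mask] = crop[mask]
        labels[top:top + h, left:left + w][mask] = class_id
    return image, labels


# Toy usage: two grey squares on a black background.
rng = np.random.default_rng(0)
background = np.zeros((64, 64, 3), dtype=np.uint8)
square = np.full((16, 16, 3), 200, dtype=np.uint8)
mask = np.ones((16, 16), dtype=bool)
image, labels = compose_clutter(
    background, [(square, mask, 1), (square, mask, 2)], rng
)
```

Because the label map is written together with the pixels, every generated image comes with its annotation for free; the degree of clutter can be varied by changing how many objects are pasted per image.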

Original language: English
Publication status: Published - 1 Jan 2019
Event: 29th British Machine Vision Conference, BMVC 2018 - Newcastle, United Kingdom
Duration: 3 Sep 2018 to 6 Sep 2018

Conference

Conference: 29th British Machine Vision Conference, BMVC 2018
Country: United Kingdom
City: Newcastle
Period: 3/09/18 to 6/09/18

Fingerprint

Warehouses
Robotics
Masks
Semantics
Personnel
Neural networks
Experiments

Keywords

  • computer vision
  • Deep learning
  • large dataset
  • Multilayer neural networks
  • Semantics
  • Warehouses

Cite this

Mahajan, K., Majumder, A., Nanduri, H., & Kumar, S. (2019). A deep framework for automatic annotation with application to retail warehouses. Paper presented at 29th British Machine Vision Conference, BMVC 2018, Newcastle, United Kingdom.
Mahajan, Kanika ; Majumder, Anima ; Nanduri, Harika ; Kumar, Swagat. / A deep framework for automatic annotation with application to retail warehouses. Paper presented at 29th British Machine Vision Conference, BMVC 2018, Newcastle, United Kingdom.
@conference{8df8cf78c1cd4407a1f20a98d81d6a17,
title = "A deep framework for automatic annotation with application to retail warehouses",
keywords = "computer vision, Deep learning, large dataset, Multilayer neural networks, Semantics, Warehouses",
author = "Kanika Mahajan and Anima Majumder and Harika Nanduri and Swagat Kumar",
year = "2019",
month = "1",
day = "1",
language = "English",
note = "29th British Machine Vision Conference, BMVC 2018 ; Conference date: 03-09-2018 Through 06-09-2018",

}

Mahajan, K, Majumder, A, Nanduri, H & Kumar, S 2019, 'A deep framework for automatic annotation with application to retail warehouses', Paper presented at 29th British Machine Vision Conference, BMVC 2018, Newcastle, United Kingdom, 3/09/18 - 6/09/18.

A deep framework for automatic annotation with application to retail warehouses. / Mahajan, Kanika; Majumder, Anima; Nanduri, Harika; Kumar, Swagat.

2019. Paper presented at 29th British Machine Vision Conference, BMVC 2018, Newcastle, United Kingdom.

TY - CONF

T1 - A deep framework for automatic annotation with application to retail warehouses

AU - Mahajan, Kanika

AU - Majumder, Anima

AU - Nanduri, Harika

AU - Kumar, Swagat

PY - 2019/1/1

Y1 - 2019/1/1

KW - computer vision

KW - Deep learning

KW - large dataset

KW - Multilayer neural networks

KW - Semantics

KW - Warehouses

UR - http://www.scopus.com/inward/record.url?scp=85072336987&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85072336987&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85072336987

ER -

Mahajan K, Majumder A, Nanduri H, Kumar S. A deep framework for automatic annotation with application to retail warehouses. 2019. Paper presented at 29th British Machine Vision Conference, BMVC 2018, Newcastle, United Kingdom.