Hybrid Inception-ViT Networks for Fine-Grained Single-Cell Image Classification

Research output: Chapter in Book/Report/Conference proceeding; Conference proceeding (ISBN); peer-review

Abstract

Accurate single-cell (SC) image classification is critical for characterizing cellular heterogeneity and supporting disease diagnostics. Conventional convolutional models often struggle with limited data, subtle morphological differences between cell types, and class imbalance. In this work, we propose a Hybrid Inception Vision Transformer (HiViT) that combines Inception-style convolutional feature extraction with a transformer-based attention mechanism to capture both fine-grained texture and long-range structural context. Our framework incorporates adaptive uncertainty-aware learning via Monte Carlo dropout and balances the training data through augmentation. We evaluate HiViT on white blood cell (WBC) classification using the Berkeley SC Computational Microscopy (BSCCM) dataset, covering the Lymphocyte, Granulocyte, and Monocyte classes. The model outperforms classical machine learning and deep learning baselines overall, with class-wise recalls of 90.31% (Lymphocyte), 97.97% (Granulocyte), and 81.21% (Monocyte). The experiments highlight the effectiveness of hybrid CNN–ViT architectures for robust, uncertainty-aware SC classification and provide a foundation for extension to other image-driven biomedical analysis and diagnostic tasks.
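The abstract does not say how Monte Carlo dropout is wired into HiViT, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes a toy linear classification head over a hypothetical four-dimensional feature vector; the weights, features, dropout rate, and number of passes are all made up for illustration. The idea it shows is generic: keep dropout active at inference, run several stochastic forward passes, average the softmax outputs, and use the entropy of the averaged distribution as an uncertainty score.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_forward(features, weights, drop_prob, rng):
    """One forward pass of a toy linear head with dropout left switched on."""
    keep = 1.0 - drop_prob
    # Inverted dropout: zero each feature with probability drop_prob
    # and rescale the survivors by 1 / keep.
    dropped = [x / keep if rng.random() < keep else 0.0 for x in features]
    logits = [sum(w * x for w, x in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(features, weights, drop_prob=0.3, passes=50, seed=0):
    """Average class probabilities over several stochastic passes.

    Returns the mean probabilities and the predictive entropy (in nats).
    """
    rng = random.Random(seed)
    mean = [0.0] * len(weights)
    for _ in range(passes):
        probs = stochastic_forward(features, weights, drop_prob, rng)
        mean = [m + p / passes for m, p in zip(mean, probs)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    return mean, entropy


# Class names come from the abstract; everything below is hypothetical.
CLASSES = ["Lymphocyte", "Granulocyte", "Monocyte"]
WEIGHTS = [
    [1.2, -0.4, 0.3, 0.0],
    [-0.5, 1.0, 0.2, 0.6],
    [0.1, 0.3, -0.8, 0.9],
]
features = [0.9, 0.1, 0.4, 0.2]

mean_probs, entropy = mc_dropout_predict(features, WEIGHTS)
predicted = CLASSES[max(range(len(CLASSES)), key=lambda i: mean_probs[i])]
print(predicted, [round(p, 3) for p in mean_probs], round(entropy, 3))
```

A high entropy (close to ln 3 for three classes) would flag a cell whose prediction changes across passes; in a real pipeline the toy head would be replaced by the trained network with its dropout layers left in training mode.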
Original language: English
Title of host publication: IEEE International Symposium on Biomedical Imaging (ISBI)
Publisher: IEEE
Publication status: Accepted/In press - 13 Jan 2026

Keywords

  • Artificial Intelligence (AI)
  • Computer Vision
  • Vision Transformer
  • Single Cell Classification
  • Convolutional Neural Network (CNN)
  • Fine-Grained Visual Recognition
