Comparing Multiclass, Binary, and Hierarchical Machine Learning Classification schemes for variable stars

Zafiirah Hosenie, Robert Lyon, Benjamin Stappers, Arrykrishna Mootoovaloo

Research output: Contribution to journal, Article (journal), peer-review



Upcoming synoptic surveys are set to generate an unprecedented amount of data. This requires an automatic framework that can quickly and efficiently provide classification labels for several new object classification challenges. Using data describing 11 types of variable stars from the Catalina Real-Time Transient Survey (CRTS), we illustrate how to capture the most important information from computed features and describe detailed methods of how to robustly use information theory for feature selection and evaluation. We apply three machine learning algorithms and demonstrate how to optimize these classifiers via cross-validation techniques. For the CRTS data set, we find that the random forest classifier performs best in terms of balanced accuracy and geometric means. We demonstrate substantially improved classification results by converting the multiclass problem into a binary classification task, achieving a balanced-accuracy rate of ∼99 per cent for the classification of δ Scuti and anomalous Cepheids. Additionally, we describe how classification performance can be improved via converting a ‘flat multiclass’ problem into a hierarchical taxonomy. We develop a new hierarchical structure and propose a new set of classification features, enabling the accurate identification of subtypes of Cepheids, RR Lyrae, and eclipsing binary stars in CRTS data.
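The two evaluation measures named in the abstract, and the multiclass-to-binary conversion, can be illustrated with a minimal sketch. It assumes one common definition of each measure: balanced accuracy as the arithmetic mean of the per-class recalls, and the geometric mean as the geometric mean of those same recalls. The paper may define them differently. The function names, class labels, and toy predictions below are hypothetical and are not taken from the CRTS data.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Fraction of correctly recovered objects for each class in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def geometric_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


def to_binary(labels, positive):
    """Recast multiclass labels as one class (1) versus all others (0)."""
    return [1 if label == positive else 0 for label in labels]


# Toy, imbalanced example with made-up labels and predictions.
y_true = ["rr_lyrae"] * 6 + ["eclipsing_binary"] * 2 + ["delta_scuti"] * 2
y_pred = ["rr_lyrae"] * 6 + ["eclipsing_binary"] * 2 + ["delta_scuti", "rr_lyrae"]

print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("geometric mean:   ", geometric_mean(y_true, y_pred))

# The same predictions scored as a binary "delta_scuti vs. rest" task.
bin_true = to_binary(y_true, "delta_scuti")
bin_pred = to_binary(y_pred, "delta_scuti")
print("binary balanced accuracy:", balanced_accuracy(bin_true, bin_pred))
```

Both measures weight every class equally, so a rare class that is poorly recovered lowers the score even when the overall fraction of correct labels is high. That property is why measures of this kind are commonly preferred for imbalanced class distributions.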
Original language: English
Pages (from-to): 4858–4872
Number of pages: 15
Journal: Monthly Notices of the Royal Astronomical Society
Issue number: 4
Early online date: 25 Jul 2019
Publication status: Published - 1 Oct 2019


  • methods: statistical
  • methods: data analysis
  • stars: variables: general

