Imbalance Learning for Variable Star Classification

Robert Lyon, Zafiirah Hosenie, Arrykrishna Mootoovaloo, B W Stappers, Vanessa McBride

Research output: Contribution to journal; Article (journal); peer-review



The accurate automated classification of variable stars into their respective sub-types is difficult. Machine-learning-based solutions often fall foul of the imbalanced learning problem, which causes poor generalisation performance in practice, especially on rare variable star sub-types. In previous work, we attempted to overcome such deficiencies by developing a hierarchical machine learning classifier. This 'algorithm-level' approach to tackling imbalance yielded promising results on Catalina Real-Time Survey (CRTS) data, outperforming the binary and multi-class classification schemes previously applied in this area. In this work, we attempt to further improve hierarchical classification performance by applying 'data-level' approaches that directly augment the training data so that they better describe under-represented classes. We apply and report results for three data augmentation methods in particular: Randomly Augmented Sampled Light curves from magnitude Error (RASLE), augmenting light curves with Gaussian Process modelling (GpFit), and the Synthetic Minority Over-sampling Technique (SMOTE). Combining the 'algorithm-level' approach (i.e. the hierarchical scheme) with the 'data-level' approach further improves variable star classification accuracy by 1-4%. We find that a higher classification rate is obtained when using GpFit in the hierarchical model. Further improvement of the metric scores requires a better standard set of correctly identified variable stars and, perhaps, enhanced features.
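To make the 'data-level' idea concrete, the following is a minimal sketch of two of the named augmentation styles, not the authors' implementation. It assumes that RASLE draws each magnitude from a Gaussian centred on the observed value with the quoted magnitude error as its standard deviation (a reading of the method's name), and it uses the standard SMOTE idea of interpolating between a minority-class sample and one of its nearest neighbours. The function names `rasle_augment` and `smote_sample`, and all data values, are hypothetical. GpFit is not sketched here.

```python
import random

def rasle_augment(mags, errs, n_copies, rng):
    """Draw n_copies synthetic light curves. Each magnitude is sampled from a
    Gaussian centred on the observed value, with the magnitude error as sigma
    (assumed interpretation of RASLE)."""
    return [[rng.gauss(m, e) for m, e in zip(mags, errs)]
            for _ in range(n_copies)]

def smote_sample(minority, k, rng):
    """Create one synthetic feature vector by interpolating between a random
    minority-class sample and one of its k nearest minority neighbours."""
    base = rng.choice(minority)
    # Rank the other minority samples by squared Euclidean distance to base.
    others = [x for x in minority if x is not base]
    others.sort(key=lambda x: sum((a - b) ** 2 for a, b in zip(base, x)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation fraction in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical light curve: observed magnitudes and their errors.
mags = [15.2, 15.4, 15.1, 15.6]
errs = [0.05, 0.04, 0.06, 0.05]
synthetic_curves = rasle_augment(mags, errs, n_copies=3, rng=rng)

# Hypothetical feature vectors for an under-represented class.
minority_features = [[0.5, 1.2], [0.6, 1.0], [0.4, 1.1], [0.55, 1.3]]
synthetic_point = smote_sample(minority_features, k=2, rng=rng)
```

The first function augments in light-curve space, so features would be re-extracted from each synthetic curve; the second augments in feature space, so its output can be appended to the training set directly.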
Original language: English
Pages (from-to): 6050-6059
Number of pages: 11
Journal: Monthly Notices of the Royal Astronomical Society
Issue number: 4
Early online date: 13 Mar 2020
Publication status: E-pub ahead of print - 13 Mar 2020


  • Astronomy

