Abstract
Determining key performance indicators and accurately classifying players between competitive levels
are among the classification challenges in sports analytics. A recent study applied the Random Forest
algorithm to identify important variables for classifying rugby league players into academy and senior
levels and achieved 82.0% and 67.5% accuracy for backs and forwards, respectively. However, limitations
in that method leave room to improve classification accuracy. Therefore, this study aimed to introduce
and implement a feature selection technique to identify key performance indicators in rugby league
positional groups and to assess the performance of six classification algorithms. The correlation-based
feature selection method identified 15 and 14 of 157 performance indicators as key performance
indicators for backs and forwards, respectively, with seven indicators common to both positional groups.
Classification results show that models developed using the key performance indicators performed better
for both positional groups than models developed using all performance indicators. 5-Nearest Neighbour
produced the best classification accuracy for backs and forwards (85% and 77%, respectively), which is
higher than the accuracies of the previous method. When analysing classification questions in sport
science, researchers are encouraged to evaluate multiple classification algorithms and to consider a
feature selection method for identifying key variables.
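
The workflow summarised above (select a subset of indicators by correlation, then classify with 5-Nearest Neighbour) can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the function names (`cfs_forward`, `knn_predict`, `loo_accuracy`) are hypothetical, and the selection step is a simplified variant that assumes Pearson correlation, the standard correlation-based merit score, and a greedy forward search. The abstract does not state which search strategy, correlation measure, or validation scheme was used.

```python
import math
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cfs_merit(subset, r_cf, r_ff):
    """Merit = k * mean(feature-class corr) / sqrt(k + k(k-1) * mean(feature-feature corr))."""
    k = len(subset)
    mean_cf = mean(r_cf[j] for j in subset)
    pairs = [r_ff[a][b] for i, a in enumerate(subset) for b in subset[i + 1:]]
    mean_ff = mean(pairs) if pairs else 0.0
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)

def cfs_forward(X, y):
    """Greedy forward search: add the feature that most raises the merit, stop when none does."""
    cols = list(zip(*X))
    r_cf = [abs(pearson(c, y)) for c in cols]
    r_ff = [[abs(pearson(a, b)) for b in cols] for a in cols]
    selected, best = [], 0.0
    while len(selected) < len(cols):
        merit, j = max((cfs_merit(selected + [j], r_cf, r_ff), j)
                       for j in range(len(cols)) if j not in selected)
        if merit <= best:
            break
        selected.append(j)
        best = merit
    return selected

def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training rows closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def loo_accuracy(X, y, cols, k=5):
    """Leave-one-out accuracy of k-NN restricted to the given feature columns."""
    Xs = [[row[j] for j in cols] for row in X]
    hits = 0
    for i in range(len(Xs)):
        pred = knn_predict(Xs[:i] + Xs[i + 1:], y[:i] + y[i + 1:], Xs[i], k)
        hits += pred == y[i]
    return hits / len(Xs)

# Synthetic stand-in data (assumption): label 0 = academy, 1 = senior.
# Columns 0 and 2 are informative, column 1 nearly duplicates column 0,
# and columns 3-5 are pure noise.
random.seed(0)
y = [i % 2 for i in range(60)]
X = []
for label in y:
    f0 = label + random.gauss(0, 0.5)
    f1 = f0 + random.gauss(0, 0.1)
    f2 = label + random.gauss(0, 0.5)
    X.append([f0, f1, f2] + [random.gauss(0, 1) for _ in range(3)])

selected = cfs_forward(X, y)
acc_all = loo_accuracy(X, y, list(range(6)))
acc_sel = loo_accuracy(X, y, selected)
print("selected columns:", selected)
print("5-NN accuracy, all features:", round(acc_all, 3))
print("5-NN accuracy, selected features:", round(acc_sel, 3))
```

For brevity, the sketch selects features once on the full data set before validation. In a real analysis the selection step would normally be repeated inside each validation fold so that the held-out sample does not influence which indicators are chosen, and indicators measured on different scales would usually be standardised before computing distances.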
Original language | Undefined/Unknown |
---|---|
Pages (from-to) | 68-75 |
Journal | Science and Medicine in Football |
Volume | 8 |
Issue number | 1 |
Early online date | 14 Nov 2024 |
Publication status | Published - 14 Nov 2024 |