TY - JOUR
T1 - FSSDroid: Feature subset selection for Android malware detection
AU - Polatidis, Nikolaos
AU - Kapetanakis, Stelios
AU - Trovati, Marcello
AU - Korkontzelos, Yannis
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/7/16
Y1 - 2024/7/16
N2 - Android malware has become an increasingly important threat to individuals, organizations, and society, posing significant risks to data security, privacy, and infrastructure. As malware evolves in sophistication and complexity, detecting and mitigating it has become more challenging and time-consuming, since the number of features required to identify potential malware can be very high. To address this issue, we have developed an effective feature selection methodology for malware detection in Android. The critical concern in the field of malware detection is the complexity of algorithms and the features used to detect malware. The present paper delivers a methodology for pre-processing datasets to select the optimal features for detecting malware while maintaining very high accuracy. The proposed methodology has been tested on two real-world datasets, and the results indicate that the number of features is significantly reduced from 489 to between 19 and 28 for the first dataset and from 9503 to between 9 and 27 for the second dataset, whilst the accuracy is maintained as if all features were used.
AB - Android malware has become an increasingly important threat to individuals, organizations, and society, posing significant risks to data security, privacy, and infrastructure. As malware evolves in sophistication and complexity, detecting and mitigating it has become more challenging and time-consuming, since the number of features required to identify potential malware can be very high. To address this issue, we have developed an effective feature selection methodology for malware detection in Android. The critical concern in the field of malware detection is the complexity of algorithms and the features used to detect malware. The present paper delivers a methodology for pre-processing datasets to select the optimal features for detecting malware while maintaining very high accuracy. The proposed methodology has been tested on two real-world datasets, and the results indicate that the number of features is significantly reduced from 489 to between 19 and 28 for the first dataset and from 9503 to between 9 and 27 for the second dataset, whilst the accuracy is maintained as if all features were used.
KW - Android
KW - Malware detection
KW - Feature selection
KW - Machine learning
KW - Binarization
KW - Pre-processing
UR - http://www.scopus.com/inward/record.url?scp=85198657083&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85198657083&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/42a52ece-7bcd-301d-9456-e657951d5fed/
U2 - 10.1007/s11280-024-01287-y
DO - 10.1007/s11280-024-01287-y
M3 - Article (journal)
VL - 27
SP - 1
EP - 17
JO - World Wide Web: Internet and Web Information Systems
JF - World Wide Web: Internet and Web Information Systems
IS - 5
M1 - 50
ER -