TY - GEN
T1 - An Investigation on Fragility of Machine Learning Classifiers in Android Malware Detection
AU - Rafiq, Husnain
AU - Aslam, Nauman
AU - Randhawa, Rizwan Hamid
AU - Issac, Biju
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022/5/22
Y1 - 2022/5/22
N2 - Machine learning (ML) classifiers have been increasingly used in Android malware detection and countermeasures over the past decade. However, ML-based solutions are vulnerable to adversarial evasion attacks, in which an attacker carefully crafts a malicious sample to fool an underlying pre-trained classifier. In this paper, we highlight the fragility of ML classifiers against adversarial evasion attacks. Using our proposed methodology, we perform Oracle-based and Generative Adversarial Network (GAN)-based mimicry attacks against these classifiers. We use static analysis of Android applications to extract API-based features from a balanced excerpt of a well-known public dataset. The empirical results demonstrate that, among ML classifiers, the detection capability of linear classifiers can be reduced to as low as 0% by perturbing at most 4 of the 315 extracted API features. As a countermeasure, we propose TrickDroid, a cumulative adversarial training scheme that uses Oracle-based and GAN-based adversarial data to improve evasion detection. In our experiments, cumulative adversarial training achieves a detection accuracy of up to 99.46% against adversarial samples.
UR - http://www.scopus.com/inward/record.url?scp=85133898950&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133898950&partnerID=8YFLogxK
U2 - 10.1109/INFOCOMWKSHPS54753.2022.9798161
DO - 10.1109/INFOCOMWKSHPS54753.2022.9798161
M3 - Conference proceeding (ISBN)
T3 - INFOCOM WKSHPS 2022 - IEEE Conference on Computer Communications Workshops
BT - IEEE INFOCOM 2022 - IEEE Conference on Computer Communications Workshops
PB - IEEE
T2 - IEEE Conference on Computer Communications Workshops
Y2 - 2 May 2022 through 5 May 2022
ER -