Audio signal processing for Android malware detection and family identification

Mercaldo F.; Santone A.
2021-01-01

Abstract

Mobile malware is growing in complexity and harmfulness, particularly the samples targeting Android, currently the most widespread operating system for mobile devices. In this scenario, antimalware technologies cannot detect so-called zero-day malware, because they recognize a malicious sample only once its signature has been stored in the antimalware repository (the so-called signature-based approach). Starting from these considerations, this paper proposes an approach for detecting Android malware. The approach also aims to identify the family to which the malicious sample under analysis belongs. We represent the application executable as an audio file and, using audio signal processing techniques, extract a set of numerical features from each sample. We then build several machine learning models and evaluate their effectiveness in malware detection and family identification. We evaluate the proposed method on a dataset of 50,000 real-world Android samples (24,553 malicious samples across 71 families and 25,447 legitimate ones), reaching an accuracy of 0.952 in Android malware detection and 0.922 in family detection.
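
The idea of treating an executable as an audio signal can be illustrated with a minimal sketch. The abstract does not state the audio encoding, the sample rate, or which features the authors extract, so everything below is an assumption made for illustration: each byte is taken as one 8-bit mono PCM sample at 8000 Hz, and the three features (zero-crossing rate, RMS energy, spectral centroid) are common audio descriptors rather than the paper's actual feature set. All function names are hypothetical.

```python
import io
import math
import wave


def bytes_to_wav(payload, sample_rate=8000):
    """Wrap raw bytes as 8-bit unsigned mono PCM and return the WAV file bytes.

    Assumption: one input byte becomes one audio sample.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(1)            # 1 byte per sample
        wav.setframerate(sample_rate)
        wav.writeframes(payload)
    return buffer.getvalue()


def read_samples(wav_bytes):
    """Decode 8-bit PCM samples and centre them in [-1, 1)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        frames = wav.readframes(wav.getnframes())
    return [(b - 128) / 128.0 for b in frames]


def zero_crossing_rate(samples):
    """Fraction of consecutive sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings / max(len(samples) - 1, 1)


def rms_energy(samples):
    """Root-mean-square amplitude of the signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def spectral_centroid(samples, sample_rate, window=256):
    """Magnitude-weighted mean frequency of the first `window` samples.

    Uses a naive DFT to stay dependency-free; an FFT library would be
    the practical choice for real files.
    """
    frame = samples[:window]
    n = len(frame)
    weighted = total = 0.0
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        mag = math.hypot(re, im)
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total if total else 0.0


def extract_features(payload, sample_rate=8000):
    """Turn raw executable bytes into a small numerical feature vector."""
    samples = read_samples(bytes_to_wav(payload, sample_rate))
    return {
        "zcr": zero_crossing_rate(samples),
        "rms": rms_energy(samples),
        "centroid_hz": spectral_centroid(samples, sample_rate),
    }
```

In a pipeline of this kind, `extract_features` would be called on the bytes of each application's executable, and the resulting vectors, labelled as malicious or legitimate (or by family), would be passed to a classifier of choice.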


Use this identifier to cite or link to this document: https://hdl.handle.net/11695/103180