Classification Of Malware Families Using Naïve Bayes Classifier

Ramadan Pratama
Denar Regata Akbi
Vinna Rahmayanti Setyaning Nastiti

Abstract

The growth in Android smartphone users is directly proportional to the increasingly rapid development of malware. Each year, many studies examine malware families using a variety of approaches, one of which is machine learning. Access to credible malware data makes it easier for researchers to analyze malware. The Canadian Institute for Cybersecurity (CIC) has produced a publicly accessible malware dataset called CICInvestAndMal2019. The dataset was built by performing static and dynamic analysis on smartphones in real time. The results of that analysis were then processed with the Random Forest method, which yielded a precision of 61.2% and a recall of 57.7%. Building on that work, the authors classify the CICInvestAndMal2019 dataset using the Naïve Bayes method. The Naïve Bayes classification achieves a recall of 68% and a precision of 66%.
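
The classification and evaluation steps summarized in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian Naïve Bayes variant and macro-averaged precision and recall; the abstract does not state which variant or averaging scheme the authors used. The feature values and the family labels `adware` and `ransomware` are hypothetical toy data, not samples from CICInvestAndMal2019.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a log prior, feature means, and feature variances per class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor (an assumed smoothing constant) avoids division by zero
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Pick the class with the highest log posterior, treating features as independent."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


def macro_precision_recall(y_true, y_pred):
    """Average per-class precision and recall over the classes present in y_true."""
    labels = sorted(set(y_true))
    precisions, recalls = [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(labels), sum(recalls) / len(labels)


# Hypothetical toy data: two numeric features per sample, two made-up families.
X_train = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y_train = ["adware"] * 3 + ["ransomware"] * 3
X_test = [[1.1, 2.1], [5.1, 5.9]]
y_test = ["adware", "ransomware"]

model = fit_gaussian_nb(X_train, y_train)
y_pred = [predict(model, row) for row in X_test]
precision, recall = macro_precision_recall(y_test, y_pred)
print(precision, recall)
```

Working in log space keeps the product of many small per-feature likelihoods from underflowing, which matters once the feature count grows beyond a toy example like this one.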

How to Cite
[1]
R. Pratama, D. R. Akbi, and V. R. Setyaning Nastiti, “Classification Of Malware Families Using Naïve Bayes Classifier”, JR, vol. 3, no. 4, Feb. 2024.
