Implementation of the Particle Swarm Optimization (PSO)-Based Naïve Bayes Classifier Algorithm for Classifying Indonesian-Language Digital News Content

Acmad Nurhadi - BSI Pontianak

Abstract

News documents hold a great deal of important information, each on its own topic, and text classification is one way to manage this rapidly growing and abundant information. Many organizations that distribute information or news now use web-based systems to deliver up-to-date news. However, sorting news into categories is still done manually, which is cumbersome and time-consuming. This study combines a feature selection method with a classifier, namely a Particle Swarm Optimization (PSO)-based Naïve Bayes classifier, and examines the accuracy of the combined method. The study produces a text classification of digital news content into three categories: gossip, culinary, and travel. The accuracy of the Naïve Bayes classifier is measured before and after the feature selection method is added. Evaluation uses 10-fold cross-validation, and accuracy is computed from the confusion matrix. The accuracy obtained with the Naïve Bayes classifier algorithm was 94.17%.

Keywords: Particle Swarm Optimization, Naïve Bayes Classifier, News Content Classification, Text Mining
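The pipeline summarized in the abstract (Naïve Bayes classification, PSO-based feature selection, 10-fold cross-validation, accuracy from a confusion matrix) can be illustrated with a minimal sketch. This is not the study's implementation. The function names (`train_nb`, `predict_nb`, `cross_validate`, `pso_select`), the multinomial model with Laplace smoothing, the interleaved fold assignment, and the PSO constants (inertia 0.7, acceleration 1.5, velocity clamp ±4) are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter, defaultdict

def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with Laplace smoothing, restricted to `vocab`."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(t for t in tokens if t in vocab)
    model = {}
    for c, n_c in class_docs.items():
        total = sum(word_counts[c].values()) + len(vocab)
        log_prior = math.log(n_c / len(labels))
        log_like = {w: math.log((word_counts[c][w] + 1) / total) for w in vocab}
        model[c] = (log_prior, log_like)
    return model

def predict_nb(model, tokens):
    """Return the class with the highest log posterior; ties break alphabetically."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(sorted(model), key=score)

def cross_validate(docs, labels, vocab, k=10):
    """Return (confusion matrix, accuracy) pooled over k interleaved folds."""
    confusion = Counter()  # keys are (actual, predicted) pairs
    for fold in range(k):
        train = [i for i in range(len(docs)) if i % k != fold]
        test = [i for i in range(len(docs)) if i % k == fold]
        model = train_nb([docs[i] for i in train], [labels[i] for i in train], vocab)
        for i in test:
            confusion[(labels[i], predict_nb(model, docs[i]))] += 1
    correct = sum(n for (actual, pred), n in confusion.items() if actual == pred)
    return confusion, correct / len(docs)

def pso_select(docs, labels, vocab, particles=8, iters=10, seed=0):
    """Binary PSO over word subsets; fitness is cross-validated accuracy."""
    rng = random.Random(seed)
    words = sorted(vocab)

    def fitness(bits):
        chosen = {w for w, b in zip(words, bits) if b}
        return cross_validate(docs, labels, chosen)[1] if chosen else 0.0

    pos = [[rng.random() < 0.5 for _ in words] for _ in range(particles)]
    vel = [[0.0] * len(words) for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]
    g = max(range(particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    for _ in range(iters):
        for i in range(particles):
            for d in range(len(words)):
                # Velocity update: inertia plus pulls toward personal and global bests.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, vel[i][d]))
                # A sigmoid of the velocity gives the probability that the bit is 1.
                pos[i][d] = rng.random() < 1 / (1 + math.exp(-vel[i][d]))
            f = fitness(pos[i])
            if f > best_fit[i]:
                best_fit[i], best_pos[i] = f, pos[i][:]
                if f > g_fit:
                    g_fit, g_pos = f, pos[i][:]
    return {w for w, b in zip(words, g_pos) if b}, g_fit
```

Calling `cross_validate(docs, labels, vocab)` on tokenized documents gives the baseline accuracy over the full vocabulary, and `pso_select(docs, labels, vocab)` returns a selected word subset together with its cross-validated accuracy, which mirrors the before-and-after comparison the abstract describes. Because the same folds are used to select features and to score them, the second figure is optimistic; a held-out set would give a fairer estimate.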

 




