Classification of Public Opinion on Social Media Twitter concerning the Education in Indonesia Using the K-Nearest Neighbors (K-NN) Algorithm and K-Fold Cross Validation

Authors

  • Intan Monica Hanmastiana Diponegoro University
  • Budi Warsito Diponegoro University
  • Rita Rahmawati Diponegoro University
  • Hasbi Yasin Diponegoro University
  • Puspita Kartikasari Diponegoro University

DOI:

https://doi.org/10.29313/statistika.v21i2.297

Keywords:

education, sentiment analysis, twitter, k-nearest neighbors, text mining, K-NN

Abstract

A developing country is one whose outlook and ideas reflect an awareness of the importance of advancing its education sector. Public assessments of the quality of education in Indonesia vary, and people respond to it in different ways. These responses are often expressed on social media, including Twitter, a popular service that people use to interact and communicate in daily life. Sentiment analysis of Twitter data is therefore one way to gauge the community's response to the state of education in Indonesia. The responses are classified into positive and negative sentiments using the K-Nearest Neighbors (K-NN) algorithm, with 10-fold cross-validation used to evaluate the model. K-NN has several advantages: it trains quickly, is simple and easy to learn, is robust to noisy training data, and is effective when the training data is large. In this study, sentiment classification uses cosine similarity as the distance measure and four values of the parameter k: 3, 5, 7, and 9. Data labelling is done in two ways: manually and by sentiment scoring. Positive and negative sentiments are visualized with word clouds. The test results show that public sentiment about education on Twitter tends to be positive and that k = 7 gives the highest accuracy under both labelling methods: 76.93% with manual labelling and 77.87% with sentiment-scoring labelling. The sentiment analysis was carried out in R, with RStudio as the supporting software.
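
The classification procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the study was carried out in R, whereas this example uses plain Python, and the function names (`vectorize`, `cosine_similarity`, `knn_predict`, `k_fold_accuracy`), the raw term-count weighting, and the ten toy English "tweets" are all assumptions made for illustration. It shows K-NN voting over the k most cosine-similar training texts, evaluated by k-fold cross-validation for k = 3, 5, 7, and 9.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed weighting; the study may differ)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, query, k):
    """Majority label among the k training items most similar to query."""
    neighbours = sorted(
        train, key=lambda item: cosine_similarity(item[0], query), reverse=True
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def k_fold_accuracy(data, k, n_folds=10):
    """Mean accuracy over n_folds interleaved folds (no shuffling)."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for i, test_fold in enumerate(folds):
        if not test_fold:
            continue
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        correct = sum(knn_predict(train, vec, k) == label for vec, label in test_fold)
        accuracies.append(correct / len(test_fold))
    return sum(accuracies) / len(accuracies)


# Hypothetical toy corpus, invented for this sketch (not the study's data).
texts = [
    ("good school great teachers", "positive"),
    ("great education good future", "positive"),
    ("teachers are great and helpful", "positive"),
    ("good learning good results", "positive"),
    ("helpful teachers great school", "positive"),
    ("bad school poor facilities", "negative"),
    ("poor education bad policy", "negative"),
    ("facilities are bad and broken", "negative"),
    ("bad learning poor results", "negative"),
    ("broken facilities poor school", "negative"),
]
data = [(vectorize(text), label) for text, label in texts]

# Compare the four k values named in the abstract; 5 folds suit 10 items.
for k in (3, 5, 7, 9):
    print(k, k_fold_accuracy(data, k, n_folds=5))
```

With only ten items the sketch uses 5 folds; with a corpus of realistic size, `n_folds=10` matches the 10-fold evaluation named in the abstract. Using odd values of k avoids tied votes when there are two classes.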

Published

2022-01-30

Section

Articles