Sentiment Analysis of Hate Speech against DPR-RI on Twitter Using Naive Bayes and KNN Algorithms

Authors

  • Joy Lousia Brigitha Munthe, Politeknik Negeri Medan
  • Kristin Sinaga, Politeknik Negeri Medan
  • Santi Prayudani, Politeknik Negeri Medan

DOI:

https://doi.org/10.62123/enigma.v2i1.39

Keywords:

Twitter, Sentiment, K-Nearest Neighbours, Crawling, Naïve Bayes Classification

Abstract

According to surveys, the growth in social media users has been accompanied by an increase in hate speech, and Twitter is one of the most popular platforms for this type of speech. Because users can post tweets repeatedly, hate speech can be posted many times over, which makes Twitter data a compelling subject for analysis. This study investigates whether tweets contain hate speech directed at the Indonesian House of Representatives (DPR-RI). Data were gathered by crawling Twitter through the Twitter API. The Naïve Bayes algorithm was applied, and its accuracy was compared with that of the K-Nearest Neighbor (KNN) algorithm. After preprocessing, the dataset contained 1,494 tweets, of which 956 were used as test data and 538 as training data. With Naïve Bayes, Twitter users' sentiment towards DPR-RI was classified as 49.2% positive and 50.8% negative; with KNN, it was classified as 23.4% positive and 76.6% negative. The high share of negative sentiment under both classifiers suggests that Twitter users frequently express hate speech towards DPR-RI. Naïve Bayes achieved the higher prediction accuracy, 98.32%, while K-Nearest Neighbor reached only 62.84%.
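To make the comparison described in the abstract concrete, the following is a minimal sketch of training a Naïve Bayes and a KNN classifier on the same labelled text and comparing their accuracy. It is not the authors' pipeline: the tweets are invented English placeholders, the class names `TinyNaiveBayes` and `TinyKNN` are hypothetical, and the choices of add-one smoothing, bag-of-words vectors, cosine similarity, and k = 3 are assumptions made only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace as a stand-in for preprocessing.
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_docs = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_docs}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, doc):
        n_docs = sum(self.class_docs.values())
        scores = {}
        for c, counts in self.word_counts.items():
            total = sum(counts.values())
            score = math.log(self.class_docs[c] / n_docs)  # log prior
            for word in tokenize(doc):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / (total + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


class TinyKNN:
    """k-nearest neighbours over bag-of-words vectors with cosine similarity."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, docs, labels):
        self.vectors = [Counter(tokenize(d)) for d in docs]
        self.labels = list(labels)
        return self

    @staticmethod
    def _cosine(a, b):
        dot = sum(a[w] * b[w] for w in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
            sum(v * v for v in b.values())
        )
        return dot / norm if norm else 0.0

    def predict(self, doc):
        query = Counter(tokenize(doc))
        ranked = sorted(
            zip(self.vectors, self.labels),
            key=lambda pair: self._cosine(query, pair[0]),
            reverse=True,
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


def accuracy(model, docs, labels):
    # Fraction of documents whose predicted label matches the true label.
    hits = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return hits / len(labels)


# Invented placeholder data, not taken from the study.
train_docs = [
    "great work by parliament",
    "support the new policy",
    "thank you for listening",
    "parliament is corrupt and lazy",
    "terrible policy shame on them",
    "lazy members waste money",
]
train_labels = ["positive"] * 3 + ["negative"] * 3
test_docs = ["great policy support", "corrupt lazy members"]
test_labels = ["positive", "negative"]

nb_model = TinyNaiveBayes().fit(train_docs, train_labels)
knn_model = TinyKNN(k=3).fit(train_docs, train_labels)

print("Naive Bayes accuracy:", accuracy(nb_model, test_docs, test_labels))
print("KNN accuracy:", accuracy(knn_model, test_docs, test_labels))
```

On a real corpus the two models would typically disagree on some tweets, which is what produces differing sentiment shares and accuracies of the kind reported above; the toy data here is too small to show that gap.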

Published

2024-10-30

How to Cite

Munthe, J. L. B., Sinaga, K., & Prayudani, S. (2024). Sentiment Analysis of Hate Speech against DPR-RI on Twitter Using Naive Bayes and KNN Algorithms. Electronic Integrated Computer Algorithm Journal, 2(1), 52–60. https://doi.org/10.62123/enigma.v2i1.39