Detecting Research Topic Trends in Artificial Intelligence Using LDA and BERTopic
Abstract
The volume of published research grows every year, which makes it hard for researchers to keep track of trends in research topics. This study aims to identify research topic trends automatically, without reading every paper published at ICLR (International Conference on Learning Representations) from 2019 to 2023, using two topic models: LDA (Latent Dirichlet Allocation) and BERTopic. A total of 14,613 records were collected by crawling. The data were processed in four stages: preprocessing, building a corpus and dictionary for LDA, fitting the LDA and BERTopic models, and evaluating the models with coherence scores. The results show that both models can identify research topic trends, and that BERTopic achieves a higher coherence score than LDA. This indicates that BERTopic better represents how closely the words within a topic are related.
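The abstract does not say which coherence measure or libraries were used, so the following is only a minimal sketch of the pipeline stages it names (preprocessing, dictionary and bag-of-words corpus, coherence evaluation), not the authors' implementation. It assumes UMass coherence averaged over word pairs, uses only the Python standard library, and runs on four invented toy titles; the stopword list, the function names, and the example topics are all hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical minimal stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "for", "and", "in", "with", "on", "to"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_dictionary(docs):
    """Map every distinct token to an integer id (alphabetical order)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    return {tok: idx for idx, tok in enumerate(vocab)}


def to_bow(doc, dictionary):
    """Convert one tokenized document to sorted (token_id, count) pairs."""
    counts = Counter(dictionary[tok] for tok in doc)
    return sorted(counts.items())


def umass_coherence(top_words, docs):
    """Average over ranked word pairs of log((D(w_i, w_j) + 1) / D(w_j)).

    D(...) counts the documents containing all given words, and w_j is the
    higher-ranked word of each pair. Every word in top_words is assumed to
    occur in at least one document.
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        return sum(all(w in s for w in words) for s in doc_sets)

    scores = []
    for j, i in combinations(range(len(top_words)), 2):
        w_j, w_i = top_words[j], top_words[i]
        denom = doc_freq(w_j)
        if denom == 0:
            raise ValueError(f"{w_j!r} does not occur in the corpus")
        scores.append(math.log((doc_freq(w_i, w_j) + 1) / denom))
    return sum(scores) / len(scores)


# Invented toy titles standing in for crawled abstracts.
raw_docs = [
    "Graph neural networks for node classification",
    "Graph attention networks on citation graph data",
    "Policy gradient methods for reinforcement learning",
    "Reinforcement learning with reward shaping and policy search",
]
docs = [preprocess(d) for d in raw_docs]
dictionary = build_dictionary(docs)
corpus = [to_bow(d, dictionary) for d in docs]

# Two hypothetical topics: one thematically consistent, one mixed.
coherent = umass_coherence(["graph", "networks", "neural"], docs)
mixed = umass_coherence(["graph", "policy", "reward"], docs)
print(f"coherent topic: {coherent:.3f}, mixed topic: {mixed:.3f}")
```

In this sketch, a topic whose top words tend to co-occur in the same documents scores higher than one that mixes unrelated words, which is the sense in which a higher coherence score suggests more closely related topic words.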
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.