Toxicity and topic analysis of travel vlog content in digital era: perspective and multilingual embedding model (voyage-multilingual-2)
DOI:
https://doi.org/10.35335/cit.Vol16.2024.844.pp199-210
Keywords:
Travel vlog content, Digital communication, Clustering techniques, Content moderation, Toxicity analysis
Abstract
This research examines online discourse through a toxicity and topic analysis of travel vlog content on user-generated platforms. Scoring 1,503 posts with the Perspective API, the study finds generally low toxicity: the average toxicity score is 0.06995, with a peak of 0.78207, and the average scores for severe toxicity, identity attack, insult, profanity, and threat are similarly low (0.00654, 0.01237, 0.03778, 0.06241, and 0.01186, respectively). However, the highest recorded values for these measures (0.45895 for severe toxicity, 0.69287 for identity attack, 0.63084 for insult, 0.81864 for profanity, and 0.51957 for threat) show that harmful content does appear sporadically. Clustering techniques, namely HDBSCAN, k-means, and Gaussian mixture models, are used to examine thematic diversity and sentiment distribution within the comments, offering insight into audience engagement and perception. These findings underline the need for effective content moderation and community management strategies to mitigate toxic behavior and promote a positive digital environment. The study concludes that, as digital media evolves, further research into toxicity, thematic content, and user engagement is essential for strengthening theoretical frameworks and practical applications in digital communication.
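The mean and peak figures reported in the abstract are simple per-attribute aggregates over per-comment scores. The following is a minimal sketch of that kind of summary, not the study's actual pipeline: the function name `summarize_scores`, the lowercase attribute keys, the 0.5 flagging threshold, and the three sample rows are all assumptions made for illustration, and the scores are assumed to have already been obtained (for example from the Perspective API) as values between 0 and 1.

```python
from statistics import mean

# Attribute names follow the abstract; the lowercase keys are an assumption.
ATTRIBUTES = [
    "toxicity", "severe_toxicity", "identity_attack",
    "insult", "profanity", "threat",
]


def summarize_scores(rows, threshold=0.5):
    """Return mean, max, and count of scores >= threshold for each attribute.

    rows: list of dicts mapping attribute name -> score in [0, 1].
    threshold: hypothetical cut-off for counting a comment as flagged.
    """
    summary = {}
    for attr in ATTRIBUTES:
        values = [row[attr] for row in rows]
        summary[attr] = {
            "mean": mean(values),
            "max": max(values),
            "flagged": sum(v >= threshold for v in values),
        }
    return summary


# Hypothetical scores for three comments; these are NOT data from the study.
sample = [
    {"toxicity": 0.02, "severe_toxicity": 0.00, "identity_attack": 0.01,
     "insult": 0.01, "profanity": 0.02, "threat": 0.01},
    {"toxicity": 0.05, "severe_toxicity": 0.01, "identity_attack": 0.01,
     "insult": 0.03, "profanity": 0.04, "threat": 0.01},
    {"toxicity": 0.78, "severe_toxicity": 0.30, "identity_attack": 0.10,
     "insult": 0.60, "profanity": 0.70, "threat": 0.05},
]

summary = summarize_scores(sample)
for attr, stats in summary.items():
    print(f"{attr}: mean={stats['mean']:.5f} max={stats['max']:.5f} "
          f"flagged={stats['flagged']}")
```

A summary of this shape makes the abstract's point visible: a low mean can coexist with a high maximum, so a count of comments above a threshold is a useful complement to the average when judging how much moderation attention a comment section needs.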
Copyright (c) 2024 Yerik Afrianto Singgalen

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

