Toxicity, social network and topic analysis of digital content: Perspective and multilingual embedding model
DOI:
https://doi.org/10.35335/cit.Vol16.2024.845.pp115-128
Keywords:
Digital content analysis, Toxicity score, Clustering techniques, Social Network Analysis (SNA), Online community engagement
Abstract
This research presents a comprehensive approach to analyzing digital content that integrates toxicity analysis, clustering techniques, and Social Network Analysis (SNA) to better understand online interactions. Average toxicity levels are relatively low, with average scores of 0.06355 for toxicity and 0.00468 for severe toxicity, but there are significant spikes, with maximum scores reaching 0.82996 for toxicity and 0.89494 for profanity. These spikes highlight the need for continuous monitoring and adaptive moderation strategies to minimize the impact of harmful language. Clustering methods, including K-Means, HDBSCAN, and Gaussian Mixture models, provide insight into the thematic structure of viewer discourse, identifying both prevalent and niche topics. The Gaussian Mixture model identified ten distinct clusters, while HDBSCAN revealed varying cluster densities, reflecting the diverse range of discussions within the community. In addition, SNA of a network with 1,716 nodes and 37 edges offers insight into the relational dynamics of the community, pinpointing key influencers and mapping the flow of information between different user groups. By synthesizing these methods, the research provides a framework for understanding both the content and the context of digital interactions, supporting more effective strategies for enhancing community engagement, mitigating toxicity, and promoting a healthier, more inclusive online environment.
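To put the reported network size in perspective, the following minimal sketch computes the density implied by 1,716 nodes and 37 edges. It is an illustration only, not the author's method: the function names are hypothetical, the graph is assumed to be simple (no self-loops or parallel edges), and the abstract does not say whether the network is directed, so both cases are shown.

```python
def graph_density(num_nodes: int, num_edges: int, directed: bool = False) -> float:
    """Fraction of possible edges that are present in a simple graph."""
    if num_nodes < 2:
        return 0.0
    # Ordered pairs of distinct nodes; halve for an undirected graph.
    possible_edges = num_nodes * (num_nodes - 1)
    if not directed:
        possible_edges //= 2
    return num_edges / possible_edges


def min_isolated_nodes(num_nodes: int, num_edges: int) -> int:
    """Lower bound on nodes with no edges: each edge touches at most two nodes."""
    return max(0, num_nodes - 2 * num_edges)


# Figures taken from the abstract; everything else here is an assumption.
NODES, EDGES = 1716, 37

print(f"undirected density: {graph_density(NODES, EDGES):.3e}")
print(f"directed density:   {graph_density(NODES, EDGES, directed=True):.3e}")
print(f"isolated nodes (at least): {min_isolated_nodes(NODES, EDGES)}")
```

Under these assumptions the network is extremely sparse: 37 edges can connect at most 74 distinct nodes, so the large majority of the 1,716 nodes would have no connections at all.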
License
Copyright (c) 2024 Yerik Afrianto Singgalen

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

