Natural Language Processing in Contextual Modeling for Sentiment Analysis in Serbian and Languages of the Germanic-Romance Language Group

10th International Scientific Conference Technics, Informatics and Education – TIE 2024, pp. 116-123

AUTHOR(S): Marko M. Živanović


DOI: 10.46793/TIE24.116Z

ABSTRACT:

This paper explores context modeling concepts for comparing natural language processing (NLP) of Serbian with that of languages from the Germanic-Romance group, focusing on Serbian, English, German, and French. The study covers vector semantics and embedding representations, using term-document and term-context matrices, cosine similarity, TF-IDF, and Pointwise Mutual Information (PMI) matrices. Special emphasis is placed on the psychological context in defining affective computing and emotions. The research concludes with a sentiment analysis of forum texts written in Serbian and translated into English, French, and German, highlighting how the model’s results vary with language complexity. Finally, the paper presents model metrics, including comparisons of ROC curves and AUC values, accuracy across various classifiers, and a detailed analysis using SVM classifiers.
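The vector-semantics tools named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy corpus, the function names, and the particular weighting variants (log-scaled term frequency, base-10 IDF, positive PMI with each document treated as a context) are assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the paper's forum data is not reproduced here.
docs = [
    "good film great acting",
    "bad film poor acting",
    "great good story",
]
tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})


def term_document_matrix(tokenized, vocab):
    """Rows are terms, columns are documents, cells are raw counts."""
    counts = [Counter(doc) for doc in tokenized]
    return [[c[w] for c in counts] for w in vocab]


def tf_idf(matrix):
    """Weight each count as log10(1 + tf) * log10(N / df) (one common variant)."""
    n_docs = len(matrix[0])
    weighted = []
    for row in matrix:
        df = sum(1 for x in row if x > 0)  # documents containing the term
        idf = math.log10(n_docs / df)
        weighted.append([math.log10(1 + x) * idf for x in row])
    return weighted


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ppmi(cooc):
    """Positive PMI: max(log2(P(w, c) / (P(w) * P(c))), 0) for each cell."""
    total = sum(sum(row) for row in cooc)
    row_sums = [sum(row) for row in cooc]
    col_sums = [sum(col) for col in zip(*cooc)]
    result = []
    for i, row in enumerate(cooc):
        out_row = []
        for j, x in enumerate(row):
            if x == 0:
                out_row.append(0.0)  # unseen pairs are clipped to zero
            else:
                pmi = math.log2(x * total / (row_sums[i] * col_sums[j]))
                out_row.append(max(pmi, 0.0))
        result.append(out_row)
    return result


td = term_document_matrix(tokenized, vocab)
good, great, bad = (td[vocab.index(w)] for w in ("good", "great", "bad"))
print("cos(good, great):", cosine(good, great))
print("cos(good, bad):  ", cosine(good, bad))
print("TF-IDF row for 'story':", tf_idf(td)[vocab.index("story")])
print("PPMI row for 'story':  ", ppmi(td)[vocab.index("story")])
```

In this sketch, terms that occur in the same documents get a cosine similarity near 1, while terms that never share a document get 0. A real pipeline would add language-specific tokenization and normalization, which matters for a morphologically rich language such as Serbian.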

KEYWORDS:

Context Modeling; Natural Language Processing; Vector Semantics; Sentiment Analysis; Affective Computing
