Analysis methodology of pro-Kremlin disinformation in internet news articles
DOI:
https://doi.org/10.32347/2411-4049.2026.2.154-160
Keywords:
FIMI, Foreign Information Manipulation and Interference, OSINT, Open Source Intelligence, fake news detection, SAS, chi-square statistic
Abstract
The article proposes and implements a methodology for analyzing pro-Kremlin disinformation in online sources that integrates automated data collection, natural language processing, topic modeling, and statistical analysis. The study used an open multilingual dataset containing 18,249 links to web articles in 42 languages, developed within the framework of the European anti-disinformation initiatives VERA.AI and EUvsDisinfo. The methodology comprises the following stages: automated extraction of texts from web resources, text preprocessing, language filtering, thematic clustering, and the development of classification models using the SAS Text Miner system. To collect textual content automatically, a specialized Python application was developed using the Playwright and asyncio libraries and optimized for high-throughput processing of large corpora of web articles.
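The collection stage described above can be illustrated with a minimal sketch. It is not the authors' application: the function names `fetch_with_playwright` and `collect_texts`, the concurrency limit, the timeout, and the choice to read the page's `body` text are all assumptions made for illustration. The sketch shows one common pattern, bounding the number of simultaneous page loads with an `asyncio.Semaphore` and recording failed downloads as `None` instead of aborting the run.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Optional


async def fetch_with_playwright(url: str, timeout_ms: int = 30_000) -> str:
    """Load one page in headless Chromium and return the text of its <body>."""
    # Imported lazily so the scheduling logic below can be exercised
    # with a stand-in fetcher when Playwright is not installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # One browser per URL keeps the sketch short; a real crawler
        # would more likely share a single browser across tasks.
        browser = await p.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            await page.goto(url, timeout=timeout_ms, wait_until="domcontentloaded")
            return await page.inner_text("body")
        finally:
            await browser.close()


async def collect_texts(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[str]] = fetch_with_playwright,
    max_concurrency: int = 8,  # hypothetical limit, tune to the host machine
) -> Dict[str, Optional[str]]:
    """Fetch all URLs concurrently, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url: str):
        async with semaphore:
            try:
                return url, await fetch(url)
            except Exception:
                # Blocked, timed-out, or removed pages are recorded as None.
                return url, None

    pairs = await asyncio.gather(*(worker(u) for u in urls))
    return dict(pairs)
```

Under these assumptions, a run would look like `texts = asyncio.run(collect_texts(list_of_urls))`. Pages that cannot be retrieved map to `None`, which makes it possible to count inaccessible sources afterwards.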
The results revealed a significant relationship between the type of content, the language of the source, and the need for VPN access to retrieve the texts. The Pearson chi-square statistic was 8847 with 10 degrees of freedom (p < 0.000001), indicating that the association is highly statistically significant. Russian-language disinformation resources in most cases require VPN access because of sanctions and geographical access restrictions, whereas trustworthy English-language and Ukrainian-language sources are substantially more open and more consistently accessible. Thematic analysis showed that pro-Kremlin disinformation is concentrated around anti-Ukrainian, anti-NATO, and conspiracy-oriented narratives, with high thematic repetition and characteristics of coordinated FIMI campaigns. The proposed methodology can be applied in information security, OSINT analytics, information space monitoring, and the development of automated disinformation detection systems.
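For readers unfamiliar with the test reported above, the following sketch shows how a Pearson chi-square statistic and its degrees of freedom are computed from a contingency table. The function name `pearson_chi_square` and the toy counts are hypothetical and are not the study's data. The abstract does not state the shape of the table. A table with 10 degrees of freedom could be, for example, 6 by 3 or 11 by 2, since the degrees of freedom equal (rows - 1) × (columns - 1).

```python
from typing import Sequence, Tuple


def pearson_chi_square(table: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Return (statistic, degrees of freedom) for a test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical toy counts (NOT the study's data):
# rows = content type, columns = VPN required / not required.
toy_table = [
    [10, 20],
    [20, 10],
]
stat, dof = pearson_chi_square(toy_table)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

The p-value is then taken from the chi-square distribution with the computed degrees of freedom. A statistic as large as 8847 on 10 degrees of freedom lies far in the tail of that distribution, which is consistent with the very small p-value reported.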
References
Leite, J., Razuvayevskaya, O., Bontcheva, K., & Scarton, C. (2024). EUvsDisinfo: A dataset for multilingual detection of pro-Kremlin disinformation in news articles [Dataset]. Zenodo. https://doi.org/10.5281/zenodo.10492913
EUvsDisinfo: Official website. (n.d.). https://euvsdisinfo.eu/
Duda, V., & Terentiev, O. (n.d.). Program for download EUvsDisinfo [GitHub repository]. GitHub. https://github.com/oterentiev/download_EUvsDisinfo
Playwright: Browser automation library website. (n.d.). https://playwright.dev/
Python Software Foundation. (n.d.). Asyncio – Asynchronous I/O: Python 3 documentation. https://docs.python.org/3/library/asyncio.html
Jade, T., Belamaric-Wilsey, B., & Wallis, M. (2019). SAS text analytics for business applications: Concept rules for information extraction models (1st ed.). SAS Press. https://sasinstitute.redshelf.com/book/1878372
Terentiev, O. M., Duda, V. O., Abroskin, Yu. Yu., & Prosyankina-Zharova, T. I. (2026). Analysis of text analytics methods for knowledge extraction from Ukrainian-language social media. Environmental Safety and Natural Resources, 1(57), 161–170. https://doi.org/10.32347/2411-4049.2026.1.161-170 [in Ukrainian]
License
Copyright (c) 2026 O.M. Terentiev, T.I. Prosyankina-Zharova, Yu.Yu. Abroskin, V.O. Duda

This work is licensed under a Creative Commons Attribution 4.0 International License.
The journal «Environmental safety and natural resources» works under Creative Commons Attribution 4.0 International (CC BY 4.0).
The licensing policy is compatible with the overwhelming majority of open access and archiving policies.