Elektrane (2025) [pp. 536-544]
AUTHOR(S) / АУТОР(И): I. D. Tomanović, S. V. Belošević, A. R. Milićević, N. Đ. Crnomarković
DOI: https://doi.org/10.46793/EEP25.536T
ABSTRACT / САЖЕТАК:
Developing artificial intelligence (AI) systems for industrial applications requires large, diverse, and reliable datasets to ensure accurate and generalizable predictions. While on-site measurements provide trustworthy data for power plant furnace operation, such datasets typically cover only a limited number of operating configurations and therefore cannot be directly used for comprehensive AI training. To address this limitation, validated computational fluid dynamics (CFD) models can be utilized to generate additional synthetic data, creating a hybrid dataset that combines real-world measurements with simulation-based results of sufficient physical fidelity.
In this work, a modular set of command-line software tools was developed to automate the generation, management, and processing of hybrid datasets for AI model training. The toolset includes independent modules for: (1) random test-case generation with user-defined parameter ranges, (2) distribution of simulation cases into groups suitable for execution on different computers, (3) controlled execution of the in-house CFD code on multicore systems with regard to available computing resources, and (4) merging and post-processing of collected results into a single dataset. An additional monitoring utility provides real-time information about execution progress, performance, and resource utilization, enabling efficient supervision of large-scale simulations. The framework supports flexible deployment, from single-computer setups to multi-node environments.
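As an illustration of modules (1) and (2) only, the following minimal Python sketch draws random test cases from user-defined parameter ranges and assigns them to groups in round-robin order. It is not the authors' implementation: the parameter names, the ranges, the uniform sampling, and the round-robin grouping are all assumptions made for this example.

```python
import random

# Hypothetical parameter ranges; names and bounds are illustrative only
# and are not taken from the paper.
PARAM_RANGES = {
    "coal_feed_rate_kg_s": (20.0, 40.0),
    "excess_air_ratio": (1.05, 1.30),
    "burner_tilt_deg": (-15.0, 15.0),
}


def generate_cases(n_cases, ranges, seed=0):
    """Draw n_cases parameter sets uniformly from the (low, high) ranges."""
    rng = random.Random(seed)  # seeded so a case list can be regenerated
    cases = []
    for case_id in range(n_cases):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in ranges.items()}
        cases.append({"case_id": case_id, **params})
    return cases


def split_into_groups(cases, n_groups):
    """Assign cases round-robin so group sizes differ by at most one."""
    groups = [[] for _ in range(n_groups)]
    for i, case in enumerate(cases):
        groups[i % n_groups].append(case)
    return groups


cases = generate_cases(10, PARAM_RANGES, seed=42)
groups = split_into_groups(cases, 3)  # e.g. one group per computer
```

Round-robin grouping is the simplest balanced choice; if simulation cost varied strongly between cases, grouping by estimated run time would be a reasonable alternative.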
The in-house CFD code was adapted for integration with the proposed toolchain. The modifications enable automated execution, parameter input through generated configuration files, and standardized output suitable for dataset merging and AI preprocessing, while preserving the physical accuracy and stability of the CFD solver.
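The file-based exchange described above could look roughly like the following Python sketch, in which each case receives a generated configuration file and finished cases are merged into one table. The directory layout, the JSON and CSV formats, and the file names `config.json`, `result.json`, and `dataset.csv` are assumptions for illustration; the abstract does not specify the formats actually used.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_case_config(case, case_dir):
    """Write one case's parameters to case_dir/config.json (assumed layout)."""
    case_dir.mkdir(parents=True, exist_ok=True)
    path = case_dir / "config.json"
    path.write_text(json.dumps(case, indent=2))
    return path


def merge_results(root, out_csv):
    """Join config.json and result.json of each finished case into one CSV."""
    rows = []
    for case_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        config_path = case_dir / "config.json"
        result_path = case_dir / "result.json"
        if not (config_path.exists() and result_path.exists()):
            continue  # skip cases that have no results yet
        row = json.loads(config_path.read_text())
        row.update(json.loads(result_path.read_text()))
        rows.append(row)
    if rows:
        with open(out_csv, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    return rows


# Demo in a temporary directory with two hypothetical cases.
root = Path(tempfile.mkdtemp())
for cid, ratio in [(0, 1.10), (1, 1.20)]:
    write_case_config({"case_id": cid, "excess_air_ratio": ratio},
                      root / f"case_{cid:04d}")
# Placeholder standing in for solver output of case 0; the value is a dummy,
# not a simulation result.
(root / "case_0000" / "result.json").write_text(
    json.dumps({"outlet_temp_K": 0.0}))
merged = merge_results(root, root / "dataset.csv")
```

In this sketch only case 0 ends up in the merged table, because case 1 has no result file yet; skipping unfinished cases lets the merge step be run at any time during a long simulation batch.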
The proposed system allows for efficient generation of large-scale hybrid datasets, bridging the gap between limited real-world measurements and extensive data needed for AI training in industry.
KEYWORDS / КЉУЧНЕ РЕЧИ:
Automated Framework, Test-Case Generation, CFD, Hybrid AI Datasets
ACKNOWLEDGEMENT / ПРОЈЕКАТ:
