Proceedings of International Scientific Conference „ALFATECH – Smart Cities and modern technologies“ (pp. 137-142)
AUTOR(I) / AUTHOR(S): Ana ĐOKIĆ, Hana STEFANOVIĆ and Dragana DUDIĆ
DOI: 10.46793/ALFATECHproc25.137DJ
SAŽETAK / ABSTRACT:
Smart home systems generate large volumes of data from various sensors, such as temperature, humidity, motion, and energy consumption. Effective preprocessing is essential to enhance data quality, reduce noise, and enable accurate analysis. This paper examines Python-based data preprocessing techniques for smart home environments. The preprocessing workflow includes data cleaning, handling missing values, normalization, feature selection, and data transformation. Data cleaning addresses duplicate records, outliers, and inconsistencies, while missing values are imputed using statistical and machine learning approaches. Normalization techniques standardize sensor readings to ensure consistency across data points. Feature engineering and dimensionality reduction refine the dataset for improved predictive modeling. By enhancing data quality, preprocessing contributes to smarter home automation, efficient anomaly detection, and optimized energy management. This study underscores the critical role of preprocessing in smart home analytics, facilitating reliable and meaningful insights for decision-making. The proposed techniques enhance the integration of smart home data into machine learning models, driving advancements in intelligent home automation systems.
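The workflow named in the abstract (cleaning, missing-value handling, normalization) can be illustrated with a minimal sketch. It is not the authors' implementation: the sample readings, the plausible temperature range, the choice of median imputation, and all function names are assumptions made for illustration, and only the Python standard library is used so the example stays self-contained.

```python
from statistics import median

# Hypothetical readings: (timestamp, temperature in °C, humidity in %).
# None marks a missing value; the second row duplicates the first.
readings = [
    ("2025-01-01T00:00", 21.0, 40.0),
    ("2025-01-01T00:00", 21.0, 40.0),
    ("2025-01-01T00:10", 21.5, None),
    ("2025-01-01T00:20", 95.0, 42.0),  # implausible temperature spike
    ("2025-01-01T00:30", 22.0, 44.0),
    ("2025-01-01T00:40", None, 43.0),
]


def drop_duplicates(rows):
    """Remove exact duplicate records while preserving order."""
    seen = set()
    unique_rows = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique_rows.append(row)
    return unique_rows


def mask_out_of_range(values, low, high):
    """Treat readings outside [low, high] as missing (simple outlier rule)."""
    return [v if v is not None and low <= v <= high else None for v in values]


def impute_median(values):
    """Fill missing values with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# 1. Data cleaning: remove duplicate records.
unique = drop_duplicates(readings)

# 2. Outlier handling: the range -10..50 °C is an assumed plausibility bound.
temps = mask_out_of_range([r[1] for r in unique], -10.0, 50.0)
hums = [r[2] for r in unique]

# 3. Missing values: statistical (median) imputation.
temps = impute_median(temps)
hums = impute_median(hums)

# 4. Normalization: bring both sensors onto a common [0, 1] scale.
temps_scaled = min_max_scale(temps)
hums_scaled = min_max_scale(hums)

print(temps_scaled)
print(hums_scaled)
```

Masking the outlier before imputing keeps the spike from distorting the median and the min-max bounds. In practice, the fixed range rule could be replaced by a statistical criterion and the median by a model-based imputer, both of which the abstract mentions in general terms.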
KLJUČNE REČI / KEYWORDS:
data preprocessing; Python programming language; smart homes