Prediction of the Biological Activity of Compounds on the 11β-HSD2 Enzyme Using Different Machine Learning Approaches

3rd International Conference on Chemo and BioInformatics, Kragujevac, September 25-26, 2025 (pp. 340-354)

 

AUTHOR(S): Sofija Stanojlović

 


DOI: 10.46793/ICCBIKG25.340S

ABSTRACT:

This study evaluated several machine learning approaches for predicting the biological activity of molecules against the enzyme 11β-hydroxysteroid dehydrogenase type 2 (11β-HSD2). This enzyme plays a key role in regulating cortisol levels, and its inhibition is associated with the development of certain types of cancer. The aim was to identify molecules with high biological activity that could serve as potential inhibitors of this enzyme. The data were obtained from the ChEMBL database, and biological activity (pIC50) was predicted with regression models built on Lipinski and PaDEL descriptors. Grid-search hyperparameter optimization was used to identify the most accurate model for each descriptor set. The best performance was achieved with PaDEL descriptors, where the LightGBM (LGBM) model obtained an R² of 0.65, an MAE of 0.5, and an MSE of 0.57. For Lipinski descriptors, the best model was Random Forest, with an R² of 0.51, an MAE of 0.6, and an MSE of 0.8. An encoder-decoder architecture was also examined and achieved an R² of 0.5, which indicates its potential for this type of prediction. The results show that PaDEL descriptors outperform Lipinski descriptors. However, none of the models achieved high predictive accuracy, which suggests that the dataset should be expanded and the models optimized further. This research lays the foundation for applying advanced methods, such as Graph Neural Networks (GNNs), to investigate the key molecular features relevant to biological activity. Such an approach has the potential to accelerate drug discovery and the selection of pharmaceutical candidates, enabling the development of new molecules with optimal properties.
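As an illustration of the quantities named in the abstract, the following minimal sketch shows how a pIC50 target can be derived from an IC50 value and how R², MAE, and MSE are computed for a set of predictions. It is not the author's pipeline: the function names `pic50_from_nm` and `regression_metrics` are hypothetical, the IC50 values are assumed to be in nanomolar (a common unit in ChEMBL exports), and the numbers in the usage lines are invented for demonstration only.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L).

    Assumption: the input unit is nM, so pIC50 = 9 - log10(IC50_nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive number")
    return 9.0 - math.log10(ic50_nm)


def regression_metrics(y_true, y_pred):
    """Return R², MAE, and MSE for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observed values are equal")
    ss_res = sum(e * e for e in errors)     # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "mae": mae, "mse": mse}


# Illustrative values only, not data from the study.
observed = [pic50_from_nm(v) for v in (10000.0, 1000.0, 100.0)]
predicted = [5.5, 6.0, 6.5]
metrics = regression_metrics(observed, predicted)
```

Because pIC50 is a logarithmic scale, an MAE of 0.5 corresponds to predictions that are off by roughly a factor of three in IC50 on average, which helps put the reported error values in context.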

KEYWORDS:

Machine learning, 11β-HSD2, PaDEL descriptors, Biological activity (pIC50), Drug discovery
