Kanupriya Tiwari, Salma Jamal, Sonam Grover, Sukriti Goyal, Aditi Singh and Abhinav Grover
Pages 667-675 (9)
Background: Tuberculosis is the second leading cause of death from an infectious disease worldwide after HIV, which motivates continued antituberculosis research. The rising number of infections caused by resistant forms of Mycobacterium tuberculosis, the causative organism of tuberculosis, has given impetus to the development of novel drugs that act on different targets and through different mechanisms.
Methods: In this study, we applied machine learning algorithms to the available high-throughput screening data on inhibitors of fructose bisphosphate aldolase, an enzyme central to the glycolysis pathway in M. tuberculosis, to build predictive classification models that identify compounds active against the bacterium. The Naïve Bayes, Random Forest and C4.5 (J48) algorithms available in Weka were used to build the models. Additionally, a set of the most relevant attributes was selected using a genetic search algorithm, which improved model performance by avoiding overfitting and yielded faster, more cost-effective models.
Results: The model built using machine learning methods in this study classified the test compounds with good accuracy, which suggests that in silico methods can be used successfully to screen large datasets for potential drug leads. The substructure fragment analysis further supports the M. tuberculosis drug development process, as it facilitates identification of the structural fragments responsible for biological activity against this crucial glycolysis pathway target.
Keywords: Tuberculosis, fructose bisphosphate aldolase, cheminformatics, machine learning, substructure, glycolysis pathway.
School of Biotechnology, Jawaharlal Nehru University, New Delhi, India - 110067.