Research article
Computational Identification of Essential Enzymes as Potential Drug Targets in Shigella flexneri Pathogenesis Using Metabolic Pathway Analysis and Epitope Mapping
Amity Institute of Biotechnology, Amity University Uttar Pradesh, Sector 125, Noida-201303, U.P., India
J. Microbiol. Biotechnol. 2021; 31(4): 621-629
Published April 28, 2021 https://doi.org/10.4014/jmb.2007.07006
Copyright © The Korean Society for Microbiology and Biotechnology.
Introduction
Since the successful completion of the Human Genome Project ushered in the post-genomic era, approaches to drug design have been revolutionized. Experimental approaches to drug design are time-consuming and costly. Host-pathogen interactions can be studied by identifying pathogen proteins that are non-homologous to host proteins, and these proteins can be treated as targets in the drug discovery process. Comparative pathway analysis has been among the most sought-after approaches of the last decade. A number of studies on different pathogens have been published on the basis of metabolic pathway analysis and protein-protein interaction studies [7-11]. However, in-depth Gene Ontology analysis and epitope prediction to identify putative targets were not performed in that work. These steps are essential: Gene Ontology analysis identifies important features such as molecular function and cellular processes, giving a better understanding of the targets, while epitope prediction identifies the main antigenic properties of the microorganism. Our results indicate better reliability of the predicted targets for further validation by experimental studies. In the current study, we compared the metabolic pathways of the pathogen and the host to identify enzymes essential for the survival of the bacterium and, based on our predictions, identified potential drug targets for the pathogen. The process began with the identification of metabolic pathways from the KEGG database for both the host and the pathogen. Next, we manually compared the pathways to identify those unique to the pathogen. All enzymes in the unique pathways were then extracted and submitted to an online tool to identify the essential ones. The identified essential enzymes were further screened to assess their feasibility as therapeutic targets using novel drug target identification, cellular localization, gene ontology analysis and epitope prediction.
Methodology
A schematic representation of the methodology is given in Fig. 1.
Fig. 1. A schematic representation of the methodology.
Comparative Analysis of Host and Pathogen Metabolic Pathways
Metabolic pathways were extracted from the KEGG pathway database [12] for the host
Identification of Non-Homologous Essential Genes
Protein sequences belonging to the unique pathways were extracted in FASTA format and submitted to the Geptop tool [13] to assess their essentiality in the pathogen. Geptop is a server that identifies essential genes of bacterial species by comparing their orthology and phylogeny with the essential gene database DEG. The predicted essential genes were then searched against the human RefSeq protein database using NCBI-BLASTP [14] to check for non-homology. Proteins with identity below 35% at an E-value cutoff of 0.005 were selected as non-host proteins.
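The non-homology screen can be illustrated with a minimal Python sketch. It assumes the BLASTP results are available in BLAST's 12-column tabular format and that a protein counts as host-homologous only when a hit has an E-value at or below 0.005 and identity of at least 35%; how the two thresholds were actually combined is not stated in the text, so this combination is an assumption. The function name, protein IDs and hit values are hypothetical.

```python
import csv
import io

# Thresholds quoted in the text. Treating a hit as "host-homologous" only when
# BOTH conditions hold is an assumption of this sketch.
MIN_HOMOLOG_IDENTITY = 35.0  # percent identity
MAX_HOMOLOG_EVALUE = 0.005   # BLASTP E-value cutoff


def non_host_proteins(query_ids, blast_tabular):
    """Return the query IDs that have no significant human homolog.

    blast_tabular is text in BLAST tabular format, where column 1 is the
    query ID, column 3 the percent identity and column 11 the E-value.
    """
    homologous = set()
    for row in csv.reader(io.StringIO(blast_tabular), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, identity, evalue = row[0], float(row[2]), float(row[10])
        if evalue <= MAX_HOMOLOG_EVALUE and identity >= MIN_HOMOLOG_IDENTITY:
            homologous.add(query_id)
    # Queries with no hits at all are also kept as non-host proteins.
    return [q for q in query_ids if q not in homologous]


# Hypothetical hits; columns are
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
EXAMPLE_ROWS = [
    ("protA", "human1", "62.1", "200", "70", "2", "1", "200", "5", "204", "1e-40", "250"),
    ("protB", "human2", "28.0", "150", "100", "4", "3", "152", "9", "158", "0.002", "38.1"),
    ("protD", "human3", "40.0", "30", "18", "0", "1", "30", "1", "30", "0.5", "20.3"),
]
EXAMPLE_TABLE = "\n".join("\t".join(row) for row in EXAMPLE_ROWS)

print(non_host_proteins(["protA", "protB", "protC", "protD"], EXAMPLE_TABLE))
```

In this toy input only protA is discarded: protB falls below the identity threshold, protD has a non-significant E-value, and protC has no hit at all.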
Protein Network Analysis
Functional interactions between genes and proteins underlie cellular processes, and characterizing them systematically plays a vital role in molecular systems biology [15]. The protein-protein interaction (PPI) network of all non-homologous proteins was built in Cytoscape v3.7.2 [16] using the STRING app. The network was analyzed with the network analyzer module [17]. Functional modules of non-homologous proteins were detected with the MCODE plugin [18] using degree cutoff = 2, maximum depth = 100, k-core = 2, and node score cutoff = 0.2. The top-ranked module was taken to represent the strongest functional associations among the interacting proteins and was selected for further analysis.
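To make the k-core = 2 setting concrete, the following Python sketch prunes an undirected interaction network down to its k-core, that is, the largest subnetwork in which every node has at least k neighbours. This is only one ingredient of module detection and is not the MCODE algorithm itself; the function name and the node labels are hypothetical.

```python
def k_core(edges, k=2):
    """Return the set of nodes in the k-core of an undirected graph.

    edges is an iterable of (node, node) pairs; self-loops are ignored.
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Repeatedly remove nodes whose remaining degree is below k.
    queue = [node for node, nbrs in adjacency.items() if len(nbrs) < k]
    removed = set()
    while queue:
        node = queue.pop()
        if node in removed:
            continue
        removed.add(node)
        for nbr in adjacency[node]:
            if nbr in removed:
                continue
            adjacency[nbr].discard(node)
            if len(adjacency[nbr]) < k:
                queue.append(nbr)
    return set(adjacency) - removed


# Hypothetical network: a triangle (A, B, C) with one pendant node D.
EXAMPLE_EDGES = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]
print(sorted(k_core(EXAMPLE_EDGES, k=2)))
```

With k = 2 the pendant node D is dropped because it has a single interaction partner, while the triangle survives intact.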
Subcellular Localization and Identification of Novel Drug Targets
Subcellular localization of the essential non-human proteins selected from network analysis was predicted by PSORTb v3.0.2 [19] and CELLO v2.518. Transmembrane proteins were identified with TMHMM Server v. 2.0, which is based on a hidden Markov model. The N-best algorithm was used to find the most probable topology of a membrane protein. Proteins with a transmembrane helix predicted within fewer than 50 amino acid residues of the N terminus were extracted as likely candidates for signal peptides. In addition, if a cleavage site was predicted with a probability > 0.5, the predicted signal peptide was cleaved off and the prediction was redone [20,21]. Only cytoplasmic and transmembrane proteins were selected as novel drug targets [23]. Further, the DrugBank database [22] was used to identify novel targets amongst the selected potential targets, with an E-value of less than 10^-5, sequence identity greater than 35%, and a bit score greater than 100.
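The DrugBank screen can be sketched in Python as follows. The sketch assumes that a candidate is treated as already represented in DrugBank when at least one of its hits satisfies all three quoted thresholds (E-value < 10^-5, identity > 35%, bit score > 100), and as novel otherwise. The `Hit` class, function names and example values are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One similarity hit of a candidate protein against a DrugBank target."""
    query: str
    evalue: float
    identity: float  # percent identity
    bitscore: float


def is_drugbank_match(hit):
    """Apply the three thresholds quoted in the text to a single hit."""
    return hit.evalue < 1e-5 and hit.identity > 35.0 and hit.bitscore > 100.0


def novel_targets(candidates, hits):
    """Return candidates with no hit passing all thresholds ("No Hits")."""
    matched = {hit.query for hit in hits if is_drugbank_match(hit)}
    return [c for c in candidates if c not in matched]


# Hypothetical candidates and hits, for illustration only.
EXAMPLE_HITS = [
    Hit("cand1", evalue=1e-30, identity=58.0, bitscore=310.0),  # passes all three
    Hit("cand2", evalue=1e-8, identity=31.0, bitscore=120.0),   # identity too low
    Hit("cand3", evalue=1e-3, identity=60.0, bitscore=45.0),    # E-value and score fail
]
print(novel_targets(["cand1", "cand2", "cand3", "cand4"], EXAMPLE_HITS))
```

Only cand1 is excluded here; the other three candidates would be reported as novel because none of their hits clears every threshold.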
Gene Ontology and B-Cell Epitope Prediction
Novel drug targets of
Results
Comparative Metabolic Pathway Analysis
Metabolic pathways of host and pathogen were examined using the KEGG database. A comparative analysis was performed manually to identify pathways exclusive to
Table 1. List of metabolic pathways unique to Shigella flexneri.
S. No.  Pathway ID  Pathway name
1       00660       C5-Branched dibasic acid metabolism
2       00680       Methane metabolism
3       00121       Secondary bile acid biosynthesis
4       00300       Lysine biosynthesis
5       00460       Cyanoamino acid metabolism
6       00473       D-Alanine metabolism
7       00540       Lipopolysaccharide biosynthesis
8       00550       Peptidoglycan biosynthesis
9       00903       Limonene and pinene degradation
10      00281       Geraniol degradation
11      00523       Polyketide sugar unit biosynthesis
12      01053       Biosynthesis of siderophore group nonribosomal peptides
13      00332       Carbapenem biosynthesis
14      00261       Monobactam biosynthesis
15      00521       Streptomycin biosynthesis
16      00525       Acarbose and validamycin biosynthesis
17      00401       Novobiocin biosynthesis
18      00362       Benzoate degradation
19      00627       Aminobenzoate degradation
20      00364       Fluorobenzoate degradation
21      00625       Chloroalkane and chloroalkene degradation
22      00361       Chlorocyclohexane and chlorobenzene degradation
23      00623       Toluene degradation
24      00633       Nitrotoluene degradation
25      00930       Caprolactam degradation
26      00626       Naphthalene degradation
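The pathway comparison was done manually in the study, but the same idea can be expressed as a set difference over KEGG pathway map numbers. The Python sketch below assumes pathway identifiers of the form organism prefix plus five digits; the prefixes `hsa` and `sfl` and the short input lists are illustrative assumptions, and the function names are hypothetical.

```python
import re


def map_number(pathway_id):
    """Strip the organism prefix from a KEGG-style pathway ID, e.g. 'hsa00010' -> '00010'."""
    match = re.fullmatch(r"[a-z]*(\d{5})", pathway_id)
    if match is None:
        raise ValueError(f"unexpected pathway ID: {pathway_id!r}")
    return match.group(1)


def unique_to_pathogen(pathogen_ids, host_ids):
    """Return the sorted map numbers present in the pathogen but absent from the host."""
    host_maps = {map_number(p) for p in host_ids}
    pathogen_maps = {map_number(p) for p in pathogen_ids}
    return sorted(pathogen_maps - host_maps)


# Illustrative input only: 00010 is shared, while 00540 and 00550 appear in Table 1.
PATHOGEN_PATHWAYS = ["sfl00010", "sfl00550", "sfl00540"]
HOST_PATHWAYS = ["hsa00010", "hsa00020"]
print(unique_to_pathogen(PATHOGEN_PATHWAYS, HOST_PATHWAYS))
```

Comparing on the five-digit map number rather than the full identifier is what lets pathways from two different organisms be matched against each other.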
Identification of Essential Genes
All the enzymes associated with these 26 unique pathways were identified and examined for their essentiality to the pathogen using Geptop 2.0. A total of 4179 genes were submitted, and 395 of them were predicted to be essential. Their accession numbers and names were retrieved from NCBI. BLASTP search was performed specifically against
Protein-Protein Interaction Network Analysis
Functional associations between 269 non-host, essential genes of
Fig. 2. Protein-protein interaction network of non-host essential proteins from Shigella flexneri.
Prediction of Subcellular Localization and Identification of Novel Drug Targets
Subcellular localization of the 57 proteins revealed that 50 were cytoplasmic, 4 were extracellular and 3 were transmembrane proteins (Fig. 2). Next, novel targets were identified using the DrugBank database: proteins showing no matching hits against DrugBank at the stated thresholds were nominated as novel drug targets. This yielded 26 proteins involved in pathogen-specific pathways (Table 2).
Table 2. List of proteins selected as novel drug targets.
S. No.  Accession no.  Protein name                               Subcellular localization  DrugBank hits
1       NP_710014      Elongation factor P (EF-P)                 Cytoplasm                 No hits
2       NP_706048      Cell division protein FtsQ                 Transmembrane             No hits
3       NP_706769      Translation initiation factor IF-1         Cytoplasm                 No hits
4       NP_707396      Translation initiation factor IF-3         Cytoplasm                 No hits
5       NP_709777      Transcription termination                  Cytoplasm                 No hits
6       NP_709088      Protein translocase subunit SecY           Transmembrane             No hits
7       NP_710066      50S ribosomal protein L9                   Cytoplasm                 No hits
8       NP_709089      50S ribosomal protein L15                  Cytoplasm                 No hits
9       NP_709082      50S ribosomal protein L17                  Cytoplasm                 No hits
10      NP_709092      50S ribosomal protein L18                  Cytoplasm                 No hits
11      NP_708985      50S ribosomal protein L21                  Cytoplasm                 No hits
12      NP_709106      50S ribosomal protein L23                  Cytoplasm                 No hits
13      NP_709097      50S ribosomal protein L24                  Cytoplasm                 No hits
14      NP_709416      50S ribosomal protein L28                  Cytoplasm                 No hits
15      NP_709100      50S ribosomal protein L29                  Cytoplasm                 No hits
16      NP_709090      50S ribosomal protein L30                  Cytoplasm                 No hits
17      NP_709740      50S ribosomal protein L31                  Cytoplasm                 No hits
18      NP_707005      50S ribosomal protein L32                  Cytoplasm                 No hits
19      NP_709497      50S ribosomal protein L34                  Cytoplasm                 No hits
20      NP_707397      50S ribosomal protein L35                  Cytoplasm                 No hits
21      NP_709083      DNA-directed RNA polymerase subunit alpha  Cytoplasm                 No hits
22      NP_706830      30S ribosomal protein S1                   Cytoplasm                 No hits
23      NP_708875      30S ribosomal protein S21                  Cytoplasm                 No hits
24      NP_709776      Protein translocase subunit SecE           Transmembrane             No hits
25      NP_706115      Elongation factor Ts (EF-Ts)               Cytoplasm                 No hits
26      NP_708456      Ribosome maturation factor RimM            Cytoplasm                 No hits
Gene Ontology and B-Cell Epitope Mapping
Gene Ontology analysis provided functional information on the identified drug targets. Under the Biological Process category, 73% of the identified targets belonged to the “Translation” process. Under Molecular Function, 53% of the proteins were associated with “structural constituent of ribosome”. Under Cellular Component, most of the drug targets belonged to the “ribosome” compartment. Further, an InterPro scan revealed an equal distribution amongst 4 major protein signatures: Ribosomal protein L2 domain 2, Translation protein SH3 like domain, Nucleic acid-binding domain, KOW and RNA-binding domain S1 (Fig. 4). In addition, B-cell epitope mapping was performed with the ABCpred Prediction Server, which is based on an artificial neural network. The predicted B-cell epitopes were ordered by score. The five highest-ranked epitopes with a score > 0.90 were selected as the most probable epitopes (Table 3).
Table 3. List of predicted epitopes having score value greater than threshold.
Rank  Accession no.  Protein name                               Sequence              Start position  Score
1     NP_710014      Elongation factor P (EF-P)                 KVPLFVQIGEVIKVDTRSGE  199             0.96
2     NP_709083      DNA-directed RNA polymerase subunit alpha  VILTLNKSGIGPVTAADITH  3643            0.94
3     NP_706830      30S ribosomal protein S1                   VTGVINGKVKGGFTVELNGI  4018            0.93
4     NP_706048      Cell division protein FtsQ                 AAMTARRSWQLTLNNDIKLN  458             0.92
4     NP_709106      50S ribosomal protein L23                  STAMEKSNTIVLKVAKDATK  2582            0.92
5     NP_706115      Elongation factor Ts (EF-Ts)               NMRKSGAIKAAKKAGNVAAD  4827            0.91
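The ranking in Table 3, including the tie at rank 4, can be reproduced with a short Python sketch. It assumes a score threshold of 0.90 on the 0 to 1 scale used in the table and a dense ranking scheme in which equal scores share a rank; both are assumptions about how the table was produced, and the function name is hypothetical. The accession numbers, sequences and scores are copied from Table 3.

```python
# (accession, epitope sequence, score) as listed in Table 3.
EPITOPES = [
    ("NP_710014", "KVPLFVQIGEVIKVDTRSGE", 0.96),
    ("NP_709083", "VILTLNKSGIGPVTAADITH", 0.94),
    ("NP_706830", "VTGVINGKVKGGFTVELNGI", 0.93),
    ("NP_706048", "AAMTARRSWQLTLNNDIKLN", 0.92),
    ("NP_709106", "STAMEKSNTIVLKVAKDATK", 0.92),
    ("NP_706115", "NMRKSGAIKAAKKAGNVAAD", 0.91),
]


def rank_epitopes(epitopes, threshold=0.90):
    """Keep epitopes scoring above the threshold and assign dense ranks.

    Equal scores share a rank, and the next distinct score gets the next rank.
    """
    kept = sorted(
        (e for e in epitopes if e[2] > threshold),
        key=lambda e: e[2],
        reverse=True,
    )
    ranked = []
    rank = 0
    previous_score = None
    for accession, sequence, score in kept:
        if score != previous_score:
            rank += 1
            previous_score = score
        ranked.append((rank, accession, sequence, score))
    return ranked


for row in rank_epitopes(EPITOPES):
    print(row)
```

Dense ranking explains why six epitopes occupy five ranks: the two epitopes scoring 0.92 are tied at rank 4, and the next score takes rank 5.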
Discussion
The present study focused on subtractive genome analysis (Fig. 5), which led to the identification of proteins that can be used as potential targets for drug development against the pathogenicity of
Conclusions
Supplemental Materials
Acknowledgments
The authors are highly grateful to Founder President Dr. Ashok K Chauhan and Chancellor Mr. Atul Chauhan, Amity University Uttar Pradesh, Noida, India, for providing the necessary support and facilities.
Conflict of Interest
The authors have no financial conflicts of interest to declare.
References
1. Lanata CF, Fischer-Walker CL, Olascoaga AC, Torres CX, Aryee RB. 2013. Global causes of diarrheal disease mortality in children <5 years of age: a systematic review. PLoS One 8: e72788.
2. Mandal J, Subhash MS, Parija SC, VG, Emelda. 2012. The recent trends of Shigellosis: a JIPMER perspective. J. Clin. Diagn. Res. 6: 1474-1477.
3. von Seidlein L, Kim DR, Ali M, Lee H, Wang X, Thiem VD, et al. 2006. A multicentre study of Shigella diarrhoea in six Asian countries: disease burden, clinical manifestations, and microbiology. PLoS Med. 3: e353.
4. Naheed A, Kalluri P, Talukder KA, Faruque ASG, Khatun F, Nair GB, et al. 2004. Fluoroquinolone-resistant Shigella dysenteriae type 1 in northeastern Bangladesh. Lancet Infect. Dis. 4: 607-608.
5. Sivapalasingam S, Nelson JM, Joyce K, Hoekstra M, Angulo FJ, Mintz ED. 2006. High prevalence of antimicrobial resistance among Shigella isolates in the United States tested by the National Antimicrobial Resistance Monitoring System from 1999 to 2002. Antimicrob. Agents Chemother. 50: 49-54.
6. Dutta D, Bhattacharya MK, Dutta S, Datta A, Sarkar D, Bhandari B, et al. 2003. Emergence of multidrug-resistant Shigella dysenteriae type 1 causing sporadic outbreak in and around Kolkata, India. J. Health Popul. Nutr. 21: 79-80.
7. Uddin R, Sufian M. 2016. Core proteomic analysis of unique metabolic pathways of Salmonella enterica for the identification of potential drug targets. PLoS One 11: e0146796.
8. Hema K, Priyadarshini VI, Pradhan D, Munikumar M, Sandeep S, Pradeep N, et al. 2015. Identification of putative drug targets and vaccine candidates for pathogens causing atherosclerosis. Biochem. Anal. Biochem. 4: 1.
9. Vetrivel U, Subramanian G, Dorairaj S. 2011. A novel in silico approach to identify potential therapeutic targets in human bacterial pathogens. Hugo J. 5: 25-34.
10. Hasan MA, Khan MA, Sharmin T, Hasan Mazumder MH, Chowdhury AS. 2016. Identification of putative drug targets in Vancomycin-resistant Staphylococcus aureus (VRSA) using computer aided protein data analysis. Gene 575: 132-143.
11. Munikumar M, Priyadarshini IV, Pradhan D, Sandeep S, Umamaheswari A, Vengamma B. 2012. In silico identification of common putative drug targets among the pathogens of bacterial meningitis. Biochem. Anal. Biochem. 1: 123.
12. Kanehisa M, Goto S, Kawashima S, Nakaya A. 2002. The KEGG databases at GenomeNet. Nucleic Acids Res. 30: 42-46.
13. Wei W, Ning LW, Ye YN, Guo FB. 2013. Geptop: a gene essentiality prediction tool for sequenced bacterial genomes based on orthology and phylogeny. PLoS One 8: e72343.
14. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403-410.
15. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. 2015. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43(Database issue): D447-D452.
16. Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, et al. 2007. Integration of biological networks and gene expression data using Cytoscape. Nat. Protoc. 2: 2366-2382.
17. Assenov Y, Ramírez F, Schelhorn SE, Lengauer T, Albrecht M. 2008. Computing topological parameters of biological networks. Bioinformatics 24: 282-284.
18. Bader GD, Hogue CW. 2003. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 4: 2.
19. Yu NY, Wagner JR, Laird MR, Melli G, Rey S, Lo R, et al. 2010. PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics 26: 1608-1615.
20. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305: 567-580.
21. Sonnhammer ELL, von Heijne G, Krogh A. 1998. A hidden Markov model for predicting transmembrane helices in protein sequences, pp. 175-182. In: Glasgow J, Littlejohn T, Major F, Lathrop R, Sankoff D, Sensen C (eds), Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology. Menlo Park, CA, USA.
22. Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, et al. 2018. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 46(D1): D1074-D1082.
23. Yang H, Qin C, Li YH, Tao L, Zhou J, Vu F, et al. 2016. Therapeutic target database update 2016: enriched resource for bench to clinical drug target and targeted pathway information. Nucleic Acids Res. 44(D1): D1069-D1074.
24. Huang da W, Sherman BT, Lempicki RA. 2009. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4: 44-57.
25. Huang da W, Sherman BT, Lempicki RA. 2009. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37: 1-13.
26. Saha S, Raghava GP. 2007. Prediction methods for B-cell epitopes. Methods Mol. Biol. 409: 387-394.
27. Glas M, McLaughlin SH, Roseboom W, Liu F, Koningstein GM, Fish A, et al. 2015. The soluble periplasmic domains of Escherichia coli cell division proteins FtsQ/FtsB/FtsL form a trimeric complex with submicromolar affinity. J. Biol. Chem. 290: 21498-21509.
28. Buddelmeijer N, Beckwith J. 2004. A complex of the Escherichia coli cell division proteins FtsL, FtsB and FtsQ forms independently of its localization to the septal region. Mol. Microbiol. 52: 1315-1327.
29. Den Blaauwen T, Andreu JM, Monasterio O. 2014. Bacterial cell division proteins as antibiotic targets. Bioorg. Chem. 55: 27-38.
30. Lock RL, Harry EJ. 2008. Cell-division inhibitors: new insights for future antibiotics. Nat. Rev. Drug Discov. 7: 324-338.
Priyanka Narad, Himanshu, and Hina Bansal*
Correspondence to: Hina Bansal, hbansal@amity.edu
Abstract
Shigella flexneri is a facultative intracellular pathogen that causes bacillary dysentery in humans. Infection with S. flexneri can result in more than a million deaths yearly, and most of the victims are children in developing countries. Therefore, identifying novel and unique drug targets against this pathogen is instrumental in overcoming resistance to the antibiotics currently given to patients. In this study, a comparative analysis of the metabolic pathways of the host and pathogen was performed to identify enzymes essential for the pathogen’s survival and to propose potential drug targets. First, we extracted the metabolic pathways of the host, Homo sapiens, and the pathogen, S. flexneri, from the KEGG database. Next, we manually compared the pathways to identify those exclusive to the pathogen. All enzymes in the 26 unique pathways were then extracted and submitted to the Geptop tool to identify essential enzymes. These were screened further to assess their feasibility as therapeutic targets using PPI network analysis, subcellular localization, druggability testing, gene ontology and epitope mapping. Using these criteria, we prioritized 5 novel drug targets against S. flexneri and one vaccine drug target against all strains of Shigella. Hence, we suggest the identified enzymes as the best putative drug targets for the effective treatment of S. flexneri.
Keywords: Essential enzymes, in silico comparative analysis, KEGG database, metabolic pathway analysis, subcellular localization
Introduction
With the advent of the post-genomic era, since the human genome project was completed successfully, there has been a revolution in the development of drug-designing approaches. The experimental approaches for drug designing are time consuming and costly. Host pathogen interactions can be studied by identifying the non-homologous proteins for host and pathogens. These proteins can be treated as targets in the drug discovery process. Pathway analysis using comparative methods has been the most sought-after approach in the last decade. A number of studies have been published in the past with reference to different pathogens on the basis of metabolic pathway analysis and protein-protein interaction studies [7-11]. However, an in-depth Gene Ontology analysis and epitope prediction to identify the putative targets have not been performed in previous work. These steps are essential because while Gene Ontology studies help us in identifying important features such as molecular function and cellular processes for a better understanding of the targets, epitope prediction helps us in identifying the main antigenic properties of the micro organisms. Our results indicate better reliability of the predicted targets for further validation by experimental studies. In the current study, we performed a metabolic pathway comparison between the pathogen and the host to identify essential enzymes for the survival of the bacteria and based on our prediction we have identified potential drug targets for the pathogen. The process began with the identification of metabolic pathways from the KEGG database for both the host and the pathogen. Next, we manually compared the pathways to identify those unique to the pathogen. Further, all enzymes for the unique pathways were extracted and submitted to an online tool for identification. 
The identified essential enzymes were further screened to determine the feasibility of therapeutic targets that were predicted and analyzed using novel drug target identification, cellular localization, gene ontology analysis and epitope prediction.
Methodology
A schematic representation of the methodology is given in Fig. 1.
-
Figure 1. A schematic representation of the methodology.
Comparative Analysis of Host and Pathogen Metabolic Pathways
The extraction of the metabolic pathways was done using the KEGG [12] pathway database for the host
Identification of Non-Homologous Essential Genes
Protein sequences extracted in the FASTA format which were part of the unique pathways were submitted to the Geptop tool [13] to identify their essentiality in the pathogen. Geptop is a server used to identify essential genes of bacterial species by comparing their orthology and phylogeny with the essential gene database DEG. These essential genes were searched against proteins from the human RefSeq protein database for non-homology using NCBI-BLASTP [14]. Proteins having identity below 35% and an E-value cutoff of 0.005 were selected as non-host proteins.
Protein Network Analysis
Functional interactions take place between genes/proteins and provide fundamental knowledge for cellular processing and systematic characterization, which play a vital role in molecular systems biology [15]. The PPI (Protein-Protein Interaction) network of all non-homologous proteins was built in Cytoscape v3.7.2 [16] using the STRING app. The interaction of network data was analyzed by network analyzer module [17]. The detection of the functional module of non-homologous proteins was done by MCODE plugin [18] under the degree cutoff = 2, maximum depth = 100, k-core = 2, and node score cutoff = 0.2.The uppermost hierarchical module was chosen as the utmost possible metabolic functional associations of the interacting proteins selected for further analysis.
Subcellular Localization and Identification of Novel Drug Targets
Subcellular localization of the essential non-human proteins selected from network analysis was predicted by PSORTb v3.0.2 [19] and CELLO v2.518. Transmembrane proteins were identified by TMHMM Server v. 2.0. TMHMM server is based on the hidden Markov model. To find the most probable topology of a membrane protein, N-best algorithm was used. Proteins with a transmembrane helix predicted as having fewer than 50 amino acid residues from the N terminus were extracted as likely candidates for signal peptides. In addition, if a cleavage site was predicted with a probability > 0.5, the predicted signal peptide was cleaved off and the prediction was redone [20,21]. Only cytoplasmic and transmembrane proteins were selected as novel drug targets [23]. Further, the DrugBank databases [22] was used to identify novel targets amongst selected potential targets with an E-value of less than 10-5, sequence identity greater than 35%, and a bit score greater than 100.
Gene Ontology and B-Cell Epitope Prediction
Novel drug targets of
Results
Comparative Metabolic Pathway Analysis
Metabolic pathways of host and pathogen were examined using the KEGG database. Relative investigation was executed manually for the recognizable proof of pathways exclusive to
-
Table 1 . List of metabolic pathways unique to
Shigella flexneri ..S. No. Pathway ID Pathway name 1 00660 C5-Branched dibasic acid metabolism 2 00680 Methane metabolism 3 00121 Secondary bile acid biosynthesis 4 00300 Lysine biosynthesis 5 00460 Cyanoamino acid metabolism 6 00473 D-Alanine metabolism 7 00540 Lipopolysaccharide biosynthesis 8 00550 Peptidoglycan biosynthesis 9 00903 Limonene and pinene degradation 10 00281 Geraniol degradation 11 00523 Polyketide sugar unit biosynthesis 12 01053 Biosynthesis of siderophore group nonribosomal peptides 13 00332 Carbapenem biosynthesis 14 00261 Monobactam biosynthesis 15 00521 Streptomycin biosynthesis 16 00525 Acarbose and validamycin biosynthesis 17 00401 Novobiocin biosynthesis 18 00362 Benzoate degradation 19 00627 Aminobenzoate degradation 20 00364 Fluorobenzoate degradation 21 00625 Chloroalkane and chloroalkene degradation 22 00361 Chlorocyclohexane and chlorobenzene degradation 23 00623 Toluene degradation 24 00633 Nitrotoluene degradation 25 00930 Caprolactam degradation 26 00626 Naphthalene degradation
Identification of Essential Genes
All the enzymes associated with these 26 unique pathways were identified and examined for their essentiality to the pathogen by using the tool Geptop 2.0. A total of 4179 genes are submitted and 395 of them are predicted as essential genes. Their accession no. and name were accessed from NCBI. BLASTP search was performed specifically against
Protein-Protein Interaction Network Analysis
Functional associations between 269 non-host, essential genes of
-
Figure 2. Protein-protein interaction network of non-host essential proteins from
Shigella flexneri .
Prediction of Subcellular Localization and Identification of Novel Drug Targets
Subcellular localization of 57 proteins revealed that 50 proteins were cytoplasmic, 4 were extracellular and 3 were transmembrane proteins (Fig. 2). Next, unveiling of novel targets was conducted using the DrugBank database. Proteins showing no matching hits against the DrugBank database at the threshold were nominated as novel drug targets. The results revealed 26 proteins that were uniquely involved in pathogen-specific unique pathways (Table 2).
-
Table 2 . List of proteins selected as novel drug targets..
S No. Accession no. Protein names Subcellular localization Novel drug targets 1 NP_710014 Elongation factor P (EF-P) Cytoplasm No Hits 2 NP_706048 Cell division protein FtsQ Transmembrane No Hits 3 NP_706769 Translation initiation factor IF-1 Cytoplasm No Hits 4 NP_707396 Translation initiation factor IF-3 Cytoplasm No Hits 5 NP_709777 Transcription termination Cytoplasm No Hits 6 NP_709088 Protein translocase subunit SecY Transmembrane No Hits 7 NP_710066 50S ribosomal protein L9 Cytoplasm No Hits 8 NP_709089 50S ribosomal protein L15 Cytoplasm No Hits 9 NP_709082 50S ribosomal protein L17 Cytoplasm No Hits 10 NP_709092 50S ribosomal protein L18 Cytoplasm No Hits 11 NP_708985 50S ribosomal protein L21 Cytoplasm No Hits 12 NP_709106 50S ribosomal protein L23 Cytoplasm No Hits 13 NP_709097 50S ribosomal protein L24 Cytoplasm No Hits 14 NP_709416 50S ribosomal protein L28 Cytoplasm No Hits 15 NP_709100 50S ribosomal protein L29 Cytoplasm No Hits 16 NP_709090 50S ribosomal protein L30 Cytoplasm No Hits 17 NP_709740 50S ribosomal protein L31 Cytoplasm No Hits 18 NP_707005 50S ribosomal protein L32 Cytoplasm No Hits 19 NP_709497 50S ribosomal protein L34 Cytoplasm No Hits 20 NP_707397 50S ribosomal protein L35 Cytoplasm No Hits 21 NP_709083 DNA-directed RNA polymerase subunit alpha Cytoplasm No Hits 22 NP_706830 30S ribosomal protein S1 Cytoplasm No Hits 23 NP_708875 30S ribosomal protein S21 Cytoplasm No Hits 24 NP_709776 Protein translocase subunit SecE Transmembrane No Hits 25 NP_706115 Elongation factor Ts (EF-Ts) Cytoplasm No Hits 26 NP_708456 Ribosome maturation factor RimM Cytoplasm No Hits
Gene Ontology and B-Cell Epitope Mapping
Gene Ontology analysis revealed interesting information on the drug proteins identified. Under the category of Biological Process, it was observed that 73% of the identified targets belonged to the “Translation” process. Gene Ontology for Molecular Function revealed that 53% of the proteins were associated with “structural constituent of ribosome”. Under the Cellular Component section, most of the drug targets belonged to the “ribosome” compartment. Further, performing an InterPro scan revealed an equal distribution amongst 4 major protein signatures: Ribosomal protein L2 domain 2, Translation protein SH3 like domain, Nucleic acid-binding domain, KOW and RNA-binding domain S1 (Fig. 4). In addition, B-cell epitope mapping was performed by ABCpred Prediction Server based on artificial neural network. The predicted B-cell epitopes were ordered based on their score attained. The top five highest ranked epitopes with a score of > 9.0 were selected as highest probable epitope (Table 3).
-
Table 3 . List of predicted epitopes having score value greater than threshold..
Rank Accession no. Protein Name Sequence Start position Score 1 NP_710014 Elongation factor P (EF-P) KVPLFVQIGEVIKVDTRSGE 199 0.96 2 NP_709083 DNA-directed RNA polymerase subunit alpha VILTLNKSGIGPVTAADITH 3643 0.94 3 NP_706830 30S ribosomal protein S1 VTGVINGKVKGGFTVELNGI 4018 0.93 4 NP_706048 Cell division protein FtsQ AAMTARRSWQLTLNNDIKLN 458 0.92 4 NP_709106 50S ribosomal protein L23 STAMEKSNTIVLKVAKDATK 2582 0.92 5 NP_706115 Elongation factor Ts (EF-Ts) NMRKSGAIKAAKKAGNVAAD 4827 0.91
Discussion
The present study focused on subtractive genome analysis (Fig. 5) that led to identify the proteins which can be used as potential targets for drug development against the pathogenicity of
Conclusions
Supplemental Materials
Acknowledgments
The authors are highly grateful to founder President Dr. Ashok K Chauhan and Chancellor Mr. Atul Chauhan Amity University Uttar Pradesh, Noida, India for providing necessary support and facilities.
Conflict of Interest
The authors have no financial conflicts of interest to declare.
Fig 1.
Fig 2.
Fig 3.
Fig 4.
Fig 5.
-
Table 1 . List of metabolic pathways unique to
Shigella flexneri ..S. No. Pathway ID Pathway name 1 00660 C5-Branched dibasic acid metabolism 2 00680 Methane metabolism 3 00121 Secondary bile acid biosynthesis 4 00300 Lysine biosynthesis 5 00460 Cyanoamino acid metabolism 6 00473 D-Alanine metabolism 7 00540 Lipopolysaccharide biosynthesis 8 00550 Peptidoglycan biosynthesis 9 00903 Limonene and pinene degradation 10 00281 Geraniol degradation 11 00523 Polyketide sugar unit biosynthesis 12 01053 Biosynthesis of siderophore group nonribosomal peptides 13 00332 Carbapenem biosynthesis 14 00261 Monobactam biosynthesis 15 00521 Streptomycin biosynthesis 16 00525 Acarbose and validamycin biosynthesis 17 00401 Novobiocin biosynthesis 18 00362 Benzoate degradation 19 00627 Aminobenzoate degradation 20 00364 Fluorobenzoate degradation 21 00625 Chloroalkane and chloroalkene degradation 22 00361 Chlorocyclohexane and chlorobenzene degradation 23 00623 Toluene degradation 24 00633 Nitrotoluene degradation 25 00930 Caprolactam degradation 26 00626 Naphthalene degradation
-
Table 2 . List of proteins selected as novel drug targets..
S No. Accession no. Protein names Subcellular localization Novel drug targets 1 NP_710014 Elongation factor P (EF-P) Cytoplasm No Hits 2 NP_706048 Cell division protein FtsQ Transmembrane No Hits 3 NP_706769 Translation initiation factor IF-1 Cytoplasm No Hits 4 NP_707396 Translation initiation factor IF-3 Cytoplasm No Hits 5 NP_709777 Transcription termination Cytoplasm No Hits 6 NP_709088 Protein translocase subunit SecY Transmembrane No Hits 7 NP_710066 50S ribosomal protein L9 Cytoplasm No Hits 8 NP_709089 50S ribosomal protein L15 Cytoplasm No Hits 9 NP_709082 50S ribosomal protein L17 Cytoplasm No Hits 10 NP_709092 50S ribosomal protein L18 Cytoplasm No Hits 11 NP_708985 50S ribosomal protein L21 Cytoplasm No Hits 12 NP_709106 50S ribosomal protein L23 Cytoplasm No Hits 13 NP_709097 50S ribosomal protein L24 Cytoplasm No Hits 14 NP_709416 50S ribosomal protein L28 Cytoplasm No Hits 15 NP_709100 50S ribosomal protein L29 Cytoplasm No Hits 16 NP_709090 50S ribosomal protein L30 Cytoplasm No Hits 17 NP_709740 50S ribosomal protein L31 Cytoplasm No Hits 18 NP_707005 50S ribosomal protein L32 Cytoplasm No Hits 19 NP_709497 50S ribosomal protein L34 Cytoplasm No Hits 20 NP_707397 50S ribosomal protein L35 Cytoplasm No Hits 21 NP_709083 DNA-directed RNA polymerase subunit alpha Cytoplasm No Hits 22 NP_706830 30S ribosomal protein S1 Cytoplasm No Hits 23 NP_708875 30S ribosomal protein S21 Cytoplasm No Hits 24 NP_709776 Protein translocase subunit SecE Transmembrane No Hits 25 NP_706115 Elongation factor Ts (EF-Ts) Cytoplasm No Hits 26 NP_708456 Ribosome maturation factor RimM Cytoplasm No Hits
Table 3. List of predicted epitopes with a score greater than the threshold.

| Rank | Accession no. | Protein name | Sequence | Start position | Score |
|---|---|---|---|---|---|
| 1 | NP_710014 | Elongation factor P (EF-P) | KVPLFVQIGEVIKVDTRSGE | 199 | 0.96 |
| 2 | NP_709083 | DNA-directed RNA polymerase subunit alpha | VILTLNKSGIGPVTAADITH | 3643 | 0.94 |
| 3 | NP_706830 | 30S ribosomal protein S1 | VTGVINGKVKGGFTVELNGI | 4018 | 0.93 |
| 4 | NP_706048 | Cell division protein FtsQ | AAMTARRSWQLTLNNDIKLN | 458 | 0.92 |
| 4 | NP_709106 | 50S ribosomal protein L23 | STAMEKSNTIVLKVAKDATK | 2582 | 0.92 |
| 5 | NP_706115 | Elongation factor Ts (EF-Ts) | NMRKSGAIKAAKKAGNVAAD | 4827 | 0.91 |
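The ranks in Table 3 repeat the value 4 for two epitopes with the same score and then continue with 5, which is consistent with a dense ranking of the scores. The following minimal sketch reproduces that ordering from the table values. It is an illustration only, not the authors' procedure: the function name `rank_epitopes` is hypothetical, and the cutoff of 0.90 is an assumed placeholder because the threshold value is not stated in the table.

```python
# Rows copied from Table 3: (accession, protein, start position, score).
EPITOPES = [
    ("NP_710014", "Elongation factor P (EF-P)", 199, 0.96),
    ("NP_709083", "DNA-directed RNA polymerase subunit alpha", 3643, 0.94),
    ("NP_706830", "30S ribosomal protein S1", 4018, 0.93),
    ("NP_706048", "Cell division protein FtsQ", 458, 0.92),
    ("NP_709106", "50S ribosomal protein L23", 2582, 0.92),
    ("NP_706115", "Elongation factor Ts (EF-Ts)", 4827, 0.91),
]


def rank_epitopes(rows, threshold):
    """Keep rows scoring above `threshold` and assign dense ranks by score.

    Hypothetical helper: equal scores share a rank, and the next distinct
    score receives the next consecutive rank.
    """
    # sorted() is stable, so tied rows keep their original relative order.
    kept = sorted(
        (row for row in rows if row[3] > threshold),
        key=lambda row: row[3],
        reverse=True,
    )
    ranked, rank, previous = [], 0, None
    for row in kept:
        if row[3] != previous:
            rank += 1
            previous = row[3]
        ranked.append((rank,) + row)
    return ranked


if __name__ == "__main__":
    # 0.90 is an assumed cutoff used only for this illustration.
    for entry in rank_epitopes(EPITOPES, threshold=0.90):
        print(entry)
```

Dense ranking is used here, rather than skipping to rank 6 after the tie, because that is what the rank column in the table shows.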
References
- Lanata CF, Fischer-Walker CL, Olascoaga AC, Torres CX, Aryee RB. 2013. Global causes of diarrheal disease mortality in children <5 years of age: a systematic review. PLoS One 8: e72788.
- Mandal J, Subhash MS, Parija SC, VG, Emelda. 2012. The recent trends of Shigellosis: a JIPMER perspective. J. Clin. Diagn. Res. 6: 1474-1477.
- von Seidlein L, Kim DR, Ali M, Lee H, Wang X, Thiem VD, et al. 2006. A multicentre study of Shigella diarrhoea in six Asian countries: disease burden, clinical manifestations, and microbiology. PLoS Med. 3: e353.
- Naheed A, Kalluri P, Talukder KA, Faruque ASG, Khatun F, Nair GB, et al. 2004. Fluoroquinolone-resistant Shigella dysenteriae type 1 in northeastern Bangladesh. Lancet Infect. Dis. 4: 607-608.
- Sivapalasingam S, Nelson JM, Joyce K, Hoekstra M, Angulo FJ, Mintz ED. 2006. High prevalence of antimicrobial resistance among Shigella isolates in the United States tested by the National Antimicrobial Resistance Monitoring System from 1999 to 2002. Antimicrob. Agents Chemother. 50: 49-54.
- Dutta D, Bhattacharya MK, Dutta S, Datta A, Sarkar D, Bhandari B, et al. 2003. Emergence of multidrug-resistant Shigella dysenteriae type 1 causing sporadic outbreak in and around Kolkata, India. J. Health Popul. Nutr. 21: 79-80.
- Uddin R, Sufian M. 2016. Core proteomic analysis of unique metabolic pathways of Salmonella enterica for the identification of potential drug targets. PLoS One 11: e0146796.
- Hema K, Priyadarshini VI, Pradhan D, Munikumar M, Sandeep S, Pradeep N, et al. 2015. Identification of putative drug targets and vaccine candidates for pathogens causing atherosclerosis. Biochem. Anal. Biochem. 4: 1.
- Vetrivel U, Subramanian G, Dorairaj S. 2011. A novel in silico approach to identify potential therapeutic targets in human bacterial pathogens. Hugo J. 5: 25-34.
- Hasan MA, Khan MA, Sharmin T, Hasan Mazumder MH, Chowdhury AS. 2016. Identification of putative drug targets in Vancomycin-resistant Staphylococcus aureus (VRSA) using computer aided protein data analysis. Gene 575: 132-143.
- Munikumar M, Priyadarshini IV, Pradhan D, Sandeep S, Umamaheswari A, Vengamma B. 2012. In silico identification of common putative drug targets among the pathogens of bacterial meningitis. Biochem. Anal. Biochem. 1: 123.
- Kanehisa M, Goto S, Kawashima S, Nakaya A. 2002. The KEGG databases at GenomeNet. Nucleic Acids Res. 30: 42-46.
- Wei W, Ning LW, Ye YN, Guo FB. 2013. Geptop: a gene essentiality prediction tool for sequenced bacterial genomes based on orthology and phylogeny. PLoS One 8: e72343.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403-410.
- Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. 2015. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43(Database issue): D447-D452.
- Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, et al. 2007. Integration of biological networks and gene expression data using Cytoscape. Nat. Protoc. 2: 2366-2382.
- Assenov Y, Ramírez F, Schelhorn SE, Lengauer T, Albrecht M. 2008. Computing topological parameters of biological networks. Bioinformatics 24: 282-284.
- Bader GD, Hogue CW. 2003. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 4: 2.
- Yu NY, Wagner JR, Laird MR, Melli G, Rey S, Lo R, et al. 2010. PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics 26: 1608-1615.
- Krogh A, Larsson B, von Heijne G, Sonnhammer EL. 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305: 567-580.
- Sonnhammer ELL, von Heijne G, Krogh A. 1998. A hidden Markov model for predicting transmembrane helices in protein sequences, pp. 175-182. In: Glasgow J, Littlejohn T, Major F, Lathrop R, Sankoff D, Sensen C (eds), Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology. Menlo Park, CA, USA.
- Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, et al. 2018. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 46(D1): D1074-D1082.
- Yang H, Qin C, Li YH, Tao L, Zhou J, Vu F, et al. 2016. Therapeutic target database update 2016: enriched resource for bench to clinical drug target and targeted pathway information. Nucleic Acids Res. 44(D1): D1069-D1074.
- Huang da W, Sherman BT, Lempicki RA. 2009. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4: 44-57.
- Huang da W, Sherman BT, Lempicki RA. 2009. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37: 1-13.
- Saha S, Raghava GP. 2007. Prediction methods for B-cell epitopes. Methods Mol. Biol. 409: 387-394.
- Glas M, McLaughlin SH, Roseboom W, Liu F, Koningstein GM, Fish A, et al. 2015. The soluble periplasmic domains of Escherichia coli cell division proteins FtsQ/FtsB/FtsL form a trimeric complex with submicromolar affinity. J. Biol. Chem. 290: 21498-21509.
- Buddelmeijer N, Beckwith J. 2004. A complex of the Escherichia coli cell division proteins FtsL, FtsB and FtsQ forms independently of its localization to the septal region. Mol. Microbiol. 52: 1315-1327.
- Den Blaauwen T, Andreu JM, Monasterio O. 2014. Bacterial cell division proteins as antibiotic targets. Bioorg. Chem. 55: 27-38.
- Lock RL, Harry EJ. 2008. Cell-division inhibitors: new insights for future antibiotics. Nat. Rev. Drug Discov. 7: 324-338.