The COG1058 tree reflects the top-level topology of the species tree, i.e. the bacterial, archaeal and eukaryotic proteins form distinct clusters. This implies that the COG1058 domain was not horizontally transferred between the three main kingdoms. The analysis of the tree also shows that the proteins in the α-proteobacterial group are evolutionarily closer to the eukaryotic and archaeal proteins than to their bacterial counterparts. In addition, the δ-proteobacterial group of enzymes is present in the eukaryotic branch, suggesting that it likely represents the ancestral group of the eukaryotic proteins. As to the domain composition, the two-domain bacterial proteins cluster into a homogeneous group, well separated from all other forms, i.e. the bacterial, archaeal and eukaryotic single-domain enzymes and the eukaryotic two-domain enzymes. In this view, the COG1058 domain fused to NMN deamidase is evolutionarily distant from both the stand-alone and the FAD synthetase-fused forms, the latter two being closely related to each other.

COG1058 Is a Novel Pyrophosphatase Family

Figure 8. Multiple sequence alignment of selected COG1058 proteins. Multiple alignment of representative members of the COG1058 family (full version is available in Figure S2). Positions of residues conserved in all members of the family are highlighted at the top of the alignment in magenta. The residues highlighted in green are conserved in all proteins, with the exception of the plant subfamily. Residues are numbered according to the T. acidophilum protein. Proteins experimentally characterized in this work are marked by red stars. Residues mutated in the A. tumefaciens protein are marked with black asterisks.
doi:10.1371/journal.pone.0065595.g

Mutagenesis Guided by Multiple Sequence Alignment and Structural Analysis Reveals the Identity of Catalytic Residues

A multiple sequence alignment using a collection of 361 bacterial, 36 archaeal, and 34 eukaryotic completely sequenced genomes, annotated in The SEED database [23], revealed that the COG1058 domain is highly conserved (Figure S2). A multiple alignment of the domain in the most divergent sequences, including the AtCOG1058 and the SoCOG1058/PncC proteins characterized in this work, is depicted in Figure 8. A total of eight residues (highlighted in magenta in Figure 8) were conserved among over 95% of the COG1058 proteins in all sequenced genomes, suggesting a likely role in the protein's function or stability. The conserved residues appear to define two putative signature motifs, namely GXEX3G and GGL/IGPX3D. In addition, five residues (highlighted in green in Figure 8) were found to be conserved in all COG1058 sequences, with the exception of the plant subfamily of proteins (Figure S2).

To gain insight into the possible functions of the conserved residues, we performed a structural homology search using as the query the available high-resolution (2 Å) crystal structure of the Thermoplasma acidophilum COG1058 protein (PDB ID: 3KBQ), as determined at the Midwest Center for Structural Genomics (http://www.mcsg.anl.gov/). Notably, a DALI search revealed high structural similarity (Z score >10) with the fold of proteins from the E. coli MoCo biosynthetic pathway (MogA, the domain III of MoeA, and MobA), as well as with domains of mammalian gephyrin and plant Cnx1, which are also involved in MoCo biosynthesis [30,31,32]. The superimposition of the 3KBQ three-dimensional structure and domain III of E. coli MoeA (PDB ID: 1G8L) is shown.
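The two steps described above (flagging alignment columns in which one residue occurs in more than 95% of the sequences, and locating the two signature motifs) can be illustrated with a minimal Python sketch. It is not the procedure used in this work. The function names, the toy sequences and the handling of gaps are assumptions made for illustration, and the regular expressions assume that X stands for any residue, X3 for any three residues, and L/I for leucine or isoleucine at that position.

```python
import re
from collections import Counter

# Assumed reading of the motifs: X = any residue, X3 = any three residues,
# L/I = leucine or isoleucine at that position.
MOTIFS = {
    "GXEX3G": re.compile(r"G.E.{3}G"),
    "GGL/IGPX3D": re.compile(r"GG[LI]GP.{3}D"),
}


def find_motifs(sequence):
    """Return {motif name: [0-based start positions]} for an ungapped sequence."""
    return {
        name: [match.start() for match in pattern.finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }


def conserved_columns(alignment, threshold=0.95):
    """Return (column index, residue) pairs for columns in which a single
    residue occurs in more than `threshold` of all aligned sequences.

    `alignment` is a list of equal-length strings; '-' marks a gap and is
    never reported as the conserved residue (an assumption of this sketch).
    """
    n_rows = len(alignment)
    if n_rows == 0:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    conserved = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] != "-")
        if not counts:
            continue  # column made up entirely of gaps
        residue, count = counts.most_common(1)[0]
        if count / n_rows > threshold:
            conserved.append((col, residue))
    return conserved


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the COG1058 alignment.
    toy_sequence = "MAGAEKLLGTTGGLGPTRDDK"
    print(find_motifs(toy_sequence))

    toy_alignment = ["GAEG"] * 18 + ["GTE-", "GSEA"]
    print(conserved_columns(toy_alignment))
```

On the toy data, the first call reports one hit per motif and the second reports columns 0 and 2 as conserved. Column positions are 0-based indices into the alignment, so mapping them onto the T. acidophilum numbering used in Figure 8 would need an extra step that is not shown here.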