Structure-Based Protein Engineering of Aldehyde Dehydrogenase from Azospirillum brasilense to Enhance Enzyme Activity against Unnatural 3-Hydroxypropionaldehyde

3-Hydroxypropionic acid (3HP) is a platform chemical that can be converted into other valuable C3-based chemicals. Because a large amount of glycerol is produced as a by-product in the biodiesel industry, glycerol is an attractive carbon source for the biological production of 3HP. Although eight 3HP-producing aldehyde dehydrogenases (ALDHs) have been reported so far, the low conversion rate from 3-hydroxypropionaldehyde (3HPA) to 3HP by these enzymes remains a bottleneck for 3HP production. In this study, we elucidated the substrate binding modes of the eight 3HP-producing ALDHs through bioinformatic and structural analysis and selected protein engineering targets for developing enzymes with enhanced activity against 3HPA. Among the ten AbKGSADH variants we tested, three variants with a replacement at the Arg281 site showed enhanced enzymatic activity. In particular, the AbKGSADH R281Y variant exhibited a 2.5-fold improvement in catalytic efficiency compared with the wild type.


Introduction
3HP was listed among the top building-block chemicals in reports issued in 2004 and 2010 by the United States Department of Energy. As a platform chemical, 3HP can be converted into other valuable C3-based chemicals, such as acrylic acid, acrylamide, malonic acid, and other 3HP- or acryl-based polymers [1][2][3][4]. Several chemical routes for industrial 3HP production have been reported, including oxidation of 1,3-propanediol or 3HPA and hydration of acrylic acid [5,6]. However, the advantages of biological methods, including independence from petroleum-based raw materials, low cost, and reduced environmental impact, are driving the production of 3HP through biological routes [7].
For the biological production of 3HP, two biosynthetic pathways have been developed using glucose or glycerol as the carbon source. In the glucose pathway, 3HP can be produced by a carbon fixation pathway such as the 3HP/4HB cycle [8][9][10][11]. In the glycerol pathway, glycerol dehydratase converts glycerol to 3HPA, and ALDH then converts 3HPA to 3HP [12,13]. In recent years, a large amount of glycerol has been produced as a by-product of the biodiesel industry, and glycerol is therefore attracting attention as a suitable carbon source for 3HP production [14,15].
Eight 3HP-producing ALDH enzymes have been reported to date: BsDhaS, CnGapD4, EcAldH, KpPuuC, KpYdcW, KpYneI, ScAld4, and AbKGSADH (Table S1) [16][17][18][19][20][21][22]. Among these, a crystal structure has been reported only for AbKGSADH, and that report explained how AbKGSADH stabilizes various substrates for catalysis, such as α-ketoglutaric semialdehyde, succinic semialdehyde, and 3HPA [23]. The structural information was also used to enhance its reactivity [24]. However, the reactivity of ALDHs for the conversion of 3HPA to 3HP is still low, and the activity imbalance between glycerol dehydratase (DhaB) and ALDH causes toxic 3HPA to accumulate in the cell. Moreover, high expression of the ALDH enzyme can be a burden on the host strain. Increasing ALDH activity by enhancing its specificity for 3HPA is therefore required [25].
Here, we elucidate the unique substrate specificity of ALDH enzymes by bioinformatic analysis. The crystal structure of AbKGSADH, structural homology modeling, and molecular docking simulations revealed the binding modes of the unnatural substrate 3HPA in the eight 3HP-producing ALDHs and showed that various amino acid residues are positioned in the substrate binding pocket of each ALDH. Based on this information, ten AbKGSADH variants were constructed, and several variants with enhanced activity were obtained. The results of this study could be used to improve the production of 3HP.

Site-Directed Mutagenesis and Enzyme Preparation
Site-directed mutagenesis was performed using a QuikChange kit (Agilent, USA), and the mutated nucleotide sequences were confirmed. AbKGSADH enzyme was prepared by the same method as described in our previous study [23]. AbKGSADH wild type and variants sub-cloned into the pProEX-HTa vector (Thermo Fisher Scientific, USA) were transformed into an E. coli BL21(DE3)-T1R strain, and each strain was cultured to an OD600 of 0.6 in fresh LB medium with 100 mg/l ampicillin at 37 °C. AbKGSADH protein expression was induced with 0.5 mM IPTG. After 20 h at 18 °C, the cells were harvested by centrifugation at 4,000 × g for 15 min at 4 °C. The cells were resuspended in ice-cold 40 mM Tris-HCl, pH 8.0, and disrupted by ultrasonication. Cell debris was removed by centrifugation at 13,000 × g for 30 min, and the lysate was applied onto a Ni-NTA agarose column (Qiagen, Germany). After washing with 40 mM Tris and 25 mM imidazole, pH 8.0, the AbKGSADH protein was eluted with 40 mM Tris and 300 mM imidazole, pH 8.0.

Enzyme Activity Assay
The activity of AbKGSADH wild type and variants was determined by measuring the increase in absorbance at 340 nm, which reflects NADH formation. Enzyme reactions were performed in a total reaction volume of 0.5 ml at 25 °C. For kinetic analysis, reaction mixtures contained 100 mM Tris-HCl, pH 8.0, 500 μM NAD, 50 nM AbKGSADH enzyme, and various concentrations of 3HPA (0.5-10 mM). The reactions were initiated by the addition of enzyme. All reactions were performed in duplicate. The initial velocity of each measurement was calculated using the extinction coefficient of NADH (6.22 mM^-1 cm^-1). KM and kcat values were determined by fitting the data to the Michaelis-Menten model using OriginPro software (OriginLab, USA).
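The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in this study: it takes the NADH extinction coefficient as 6.22 mM^-1 cm^-1 (6,220 M^-1 cm^-1) at 340 nm, assumes a 1 cm path length, and replaces the nonlinear OriginPro fit with a simple Hanes-Woolf linearization. The function names and all numerical inputs (TRUE_VMAX, TRUE_KM, the absorbance slopes) are hypothetical and synthetic, chosen only to show the arithmetic.

```python
# Illustrative sketch only: all input values below are synthetic,
# not measurements from this study.

EPSILON_NADH = 6.22  # mM^-1 cm^-1 at 340 nm
PATH_CM = 1.0        # assumed cuvette path length (cm)


def initial_velocity(slope_a340_per_min, epsilon=EPSILON_NADH, path_cm=PATH_CM):
    """Convert an absorbance slope (A340 per min) to a rate in mM per min."""
    return slope_a340_per_min / (epsilon * path_cm)


def fit_hanes_woolf(substrate_mm, velocities):
    """Estimate (Vmax, KM) from the line S/v = S/Vmax + KM/Vmax.

    This linearization stands in for a nonlinear Michaelis-Menten fit.
    """
    xs = list(substrate_mm)
    ys = [s / v for s, v in zip(xs, velocities)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / slope, intercept / slope


def kcat_per_second(vmax_mm_per_min, enzyme_nm):
    """kcat = Vmax / [E], converting mM per min to nM per s first."""
    return vmax_mm_per_min * 1e6 / 60.0 / enzyme_nm


# Synthetic, noise-free example over the 0.5-10 mM substrate range.
substrate = [0.5, 1.0, 2.0, 5.0, 10.0]  # mM 3HPA
TRUE_VMAX, TRUE_KM = 0.06, 2.0          # hypothetical values (mM/min, mM)
slopes = [EPSILON_NADH * PATH_CM * TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]

rates = [initial_velocity(a) for a in slopes]
vmax, km = fit_hanes_woolf(substrate, rates)
kcat = kcat_per_second(vmax, enzyme_nm=50.0)  # 50 nM enzyme, as in the assay
```

With real, noisy data a nonlinear least-squares fit is generally preferable, because linearizations weight measurement errors unevenly.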

Unnatural Substrate Availability of ALDHs
According to the enzyme nomenclature database operated by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology, ALDHs belonging to the EC 1.2.1.- group (oxidoreductases acting on the aldehyde or oxo group of donors with NAD or NADP as acceptor) are divided into 107 classes (EC 1.2.1.1-107) based on their specific activity [31]. Among these 107 classes, 2 have been deleted, 12 have been transferred to another entry, and 13 have no annotated nucleotide or amino acid sequence data in the database. Another 34 classes have amino acid sequences linked to their entries, but no crystal structure has been reported to verify their structural fold or elucidate their detailed molecular and catalytic mechanisms. Finally, selected crystal structures of the remaining 46 ALDH classes were compared (Tables S3, S4). Interestingly, when the selected crystal structures were superimposed on each other, approximately 70% of the ALDH classes displayed a similar overall structure (the ALDH fold), and the remaining ALDHs had either the GapDH fold (15%) or their own unique shape (15%) (Tables S3, S4). Moreover, many enzymes with the ALDH fold are assigned to multiple classes in the EC 1.2.1.- category. These observations indicate that most ALDH enzymes have similar structural conformations and that differences in only several amino acids forming the substrate binding pocket allow ALDH enzymes to accommodate different substrates. Because of these structural properties, ALDH enzymes have relatively broad substrate specificities, which in turn allow some ALDH enzymes to utilize unnatural aldehydes as substrates.
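The count of 46 structurally characterized classes follows directly from the exclusions listed above, as this short check shows:

```latex
\[
107 - (2 + 12 + 13 + 34) = 107 - 61 = 46
\]
```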

Unnatural 3HPA Binding Mode Prediction
The oxidation of 3HPA to 3HP is a key step in the efficient production of 3HP from glycerol. However, because enzymes that use the unnatural 3HPA as a substrate have not yet been reported in natural organisms, exploiting the broad substrate specificity of ALDH enzymes is crucial for the enzymatic conversion of 3HPA to 3HP. Although eight 3HP-producing ALDHs have been reported to date, their low activities against 3HPA remain a bottleneck for highly efficient 3HP production [16][17][18][19][20][21][22]. An understanding of the substrate binding sites of the eight ALDHs and a detailed structural comparison of these enzymes are therefore essential for developing ALDH enzymes with high activity against 3HPA. Because a crystal structure has been reported only for AbKGSADH among the eight 3HP-producing ALDHs [23], we first obtained three-dimensional structures of the seven other ALDHs through homology modeling. For more accurate structural prediction, we used the known structure with the highest amino acid homology to each ALDH as the template, and all seven modeled structures had reasonable model quality estimation scores (Table S2).
We also performed molecular docking simulations of the 3HPA molecule into the eight 3HP-producing ALDHs to identify the 3HPA binding mode of these enzymes (Fig. 1). Although the 3HPA molecule was positioned in a slightly different mode in each enzyme, the aldehyde group, where catalysis occurs, was directed toward the catalytic residues in all eight enzymes. When we compared the structures of the eight 3HP-producing ALDHs, the two catalytic residues, Glu253 and Cys287 in AbKGSADH, were completely conserved in all eight enzymes, indicating that these enzymes catalyze the reaction in an identical manner (Fig. 1). We then selected ten residues located in the vicinity of the 3HPA molecule and compared these residues among the eight 3HP-producing ALDHs. Among the eight enzymes, EcAldH and KpPuuC, which share 83% sequence identity, contained identical residues at all ten positions, whereas the other six enzymes contained variable residues (Fig. 2). Of the ten selected residues, three residues, Phe156, Val286, and Phe450 in AbKGSADH, were highly conserved in the 3HP-producing ALDHs, with the exception of KpYneI, which contained Tyr156, Asp285, and His450, respectively. The seven other residues, Ser109, Asn159, Gln160, Arg163, Arg281, Ile288, and Pro444 in AbKGSADH, were variable among the 3HP-producing ALDHs (Fig. 2). In particular, three residues, Ser109, Ile288, and Pro444 in AbKGSADH, were so variable that no dominant residue was found (Fig. 2).

Protein Engineering of AbKGSADH to Improve 3HPA Utilization
As described above, the residues involved in stabilization of the 3HPA substrate are quite diverse among the known 3HP-producing ALDHs, and these variable residues are potential targets for enzyme engineering toward more efficient 3HP production. Of the ten residues involved in substrate binding, we selected four relatively variable residues, Ser109, Arg281, Ile288, and Pro444 in AbKGSADH, as target engineering sites. We then replaced these four residues of AbKGSADH with the corresponding residues found in the seven other 3HP-producing ALDHs and generated the following ten AbKGSADH variants: AbKGSADH S109G, AbKGSADH S109L, AbKGSADH R281Y, AbKGSADH R281F, AbKGSADH R281Q, AbKGSADH I288T, AbKGSADH P444F, AbKGSADH P444G, AbKGSADH P444L, and AbKGSADH P444S.
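The variant set above can be written down compactly as a mapping from each target position to its substitutions. The following is a minimal bookkeeping sketch; the dictionary layout and the function name variant_names are illustrative choices, not part of the original workflow.

```python
# Wild-type residue at each engineered position of AbKGSADH (from the text).
WILD_TYPE = {109: "S", 281: "R", 288: "I", 444: "P"}

# Substitutions introduced at each position (from the text).
SUBSTITUTIONS = {
    109: ["G", "L"],
    281: ["Y", "F", "Q"],
    288: ["T"],
    444: ["F", "G", "L", "S"],
}


def variant_names(enzyme, wild_type, substitutions):
    """Build names such as 'AbKGSADH R281Y' for every single-site variant."""
    names = []
    for position in sorted(substitutions):
        original = wild_type[position]
        for replacement in substitutions[position]:
            if replacement == original:
                raise ValueError(f"{original}{position}{replacement} is not a mutation")
            names.append(f"{enzyme} {original}{position}{replacement}")
    return names


variants = variant_names("AbKGSADH", WILD_TYPE, SUBSTITUTIONS)
```

Enumerating the variants this way makes it easy to confirm that the four sites yield ten single-site variants in total.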
We measured the specific activities and kinetic parameters of these ten variants and compared them with those of AbKGSADH WT. The AbKGSADH S109G and AbKGSADH S109L variants showed somewhat decreased activities compared with AbKGSADH WT (Fig. 3A). The kinetic parameters of these two variants indicate that the decreased activities are due to decreased kcat values rather than decreased substrate affinity (Table 1). The effect was more pronounced in the AbKGSADH S109G variant. We suspect that replacement of Ser109 with a small hydrophobic glycine increased the binding affinity for 3HPA but that the resulting change in the binding conformation of the substrate severely hampered the conversion ability of the variant. The AbKGSADH I288T variant showed only 28% of the wild-type activity, and kinetic analysis revealed that the decrease was due mainly to a lower kcat value (Fig. 3A and Table 1). All four variants mutated at the Pro444 site showed decreased activities compared with the wild type (Fig. 3A). Kinetic analysis of these four variants indicated that the decreases in activity were almost proportional to the decreases in kcat, with KM values similar to each other (Table 1). Based on these observations, we conclude that three residues of AbKGSADH, Ser109, Ile288, and Pro444, provide a more suitable conformation for 3HPA binding and enzyme catalysis than the residues found in the other 3HP-producing enzymes.
Interestingly, the three variants mutated at the Arg281 position of AbKGSADH exhibited increased catalytic efficiencies. The AbKGSADH R281F variant displayed a kcat value almost identical to that of AbKGSADH WT but a decreased KM value, resulting in a 23% higher catalytic efficiency. The AbKGSADH R281Q variant displayed higher values of both kcat and KM, resulting in a 16% improvement in catalytic efficiency. The most remarkable improvement was observed in the AbKGSADH R281Y variant. This variant showed a 2.5-fold increase in catalytic efficiency, with approximately half the KM value and a 24% higher kcat value compared with AbKGSADH WT, indicating that the enhanced catalytic efficiency was due to both higher substrate affinity and an increased rate of conversion to product (Table 1 and Figs. 3A and 3B). Based on these results, we believe that an amino acid with a bulky or ring-containing side chain, such as Arg, Gln, Phe, or Tyr, at the position of Arg281 of AbKGSADH is crucial for stabilization of the 3HPA substrate. In particular, tyrosine appears to be the most suitable residue for the unnatural 3HPA substrate, and in fact, half of the 3HP-producing ALDHs have a tyrosine residue at this site.
In summary, bioinformatic analysis of ALDHs belonging to the EC 1.2.1.- category shows that most ALDHs have the same overall fold and that their substrate specificity depends on the amino acid residues forming the substrate binding pocket. ALDH enzymes with the conventional ALDH fold can use a wide range of aldehydes as substrates, indicating that unnatural aldehydes can also be converted by these ALDHs. Structural homology modeling and molecular docking simulations allowed us to identify detailed substrate binding modes of the eight 3HP-producing ALDHs reported to date and to determine amino acids that can serve as protein engineering targets for improving utilization of the unnatural 3HPA substrate. Of the ten AbKGSADH variants, the three variants mutated at the Arg281 site showed enhanced 3HPA utilization. In particular, AbKGSADH R281Y exhibited a 2.5-fold improvement in catalytic efficiency. This study offers valuable information for the pursuit of highly efficient 3HP production.
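The 2.5-fold figure for the R281Y variant is consistent with the reported changes in the individual parameters. Taking catalytic efficiency as kcat/KM and using the approximate ratios stated above (kcat about 1.24 times and KM about 0.5 times the wild-type value), the relative efficiency works out as:

```latex
\[
\frac{(k_{\mathrm{cat}}/K_{\mathrm{M}})_{\mathrm{R281Y}}}{(k_{\mathrm{cat}}/K_{\mathrm{M}})_{\mathrm{WT}}}
= \frac{k_{\mathrm{cat}}^{\mathrm{R281Y}} / k_{\mathrm{cat}}^{\mathrm{WT}}}{K_{\mathrm{M}}^{\mathrm{R281Y}} / K_{\mathrm{M}}^{\mathrm{WT}}}
\approx \frac{1.24}{0.5} \approx 2.5
\]
```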