lubica.urbanikova@savba.sk
The present in-silico study focuses on trehalose synthases (TreS), enzymes that interconvert maltose and trehalose [1]; at least some of them may be part of the four-step metabolic pathway of glycogen formation from trehalose. They are classified in the Carbohydrate-Active enZymes Database (CAZy; http://www.cazy.org) [2] in glycoside hydrolase family GH13, known as the main α-amylase family [3,4]. Family GH13, which covers more than 30 different specificities [5], has been divided into 44 official subfamilies [6], and trehalose synthases belong to subfamily GH13_16. Typically, they adopt the canonical three-domain GH13 arrangement: a catalytic (β/α)8-barrel domain A; domain B (mostly of irregular structure), which protrudes from the barrel in place of loop 3; and domain C (a 7-stranded antiparallel β-sandwich) at the C-terminus [4]. In some GH13_16 enzymes, however, domain C is followed by a C-terminal extension that in many cases exhibits clear sequence features of a maltokinase (MaK) [7,8]. True MaKs are single-domain enzymes that catalyze the ATP-dependent phosphorylation of maltose at position 1 [8].
One of our goals was to analyze GH13_16 enzymes for the presence of a maltokinase domain. Hence, of the total of 5,933 GH13_16 members available (October 14, 2021), a set of 3,325 unique sequences with a standard TreS domain was retrieved. These were subsequently divided into two main groups: (i) 1,425 fused TreS-MaKs with a long C-terminal extension (at least 400 residues) in which a full-length MaK domain was detected; and (ii) 1,900 simple TreSs with a standard TreS domain followed by a short C-terminal extension (< 110 residues) in which no additional domain was found.
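The length criterion used for this grouping can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function name, the `tres_domain_end` argument (the last residue of the TreS domain, assumed to come from a separate domain annotation step) and the label strings are hypothetical. Only the two thresholds are taken from the text, and the sketch checks length alone, whereas the study additionally required detection of a full-length MaK domain.

```python
# Thresholds quoted in the abstract; everything else here is an illustrative assumption.
FUSED_MIN_EXTENSION = 400   # "at least 400 residues" after the TreS domain
SIMPLE_MAX_EXTENSION = 110  # "< 110 residues" after the TreS domain


def classify_by_extension(seq_length: int, tres_domain_end: int) -> str:
    """Assign a GH13_16 sequence to a candidate group by C-terminal extension length.

    seq_length      -- total number of residues in the protein
    tres_domain_end -- 1-based index of the last residue of the TreS domain
                       (hypothetical input from a prior domain annotation)
    """
    extension = seq_length - tres_domain_end
    if extension < 0:
        raise ValueError("TreS domain end lies beyond the sequence end")
    if extension >= FUSED_MIN_EXTENSION:
        # Long enough to host a MaK domain; its presence still has to be verified.
        return "fused_candidate"
    if extension < SIMPLE_MAX_EXTENSION:
        return "simple_candidate"
    # Extensions between the two thresholds fall into neither main group.
    return "unassigned"


if __name__ == "__main__":
    for length, end in [(1000, 580), (690, 590), (800, 600)]:
        print(length, end, classify_by_extension(length, end))
```

Sequences labelled `fused_candidate` would still need a domain search to confirm the MaK domain, and `unassigned` covers the intermediate lengths that belong to neither main group.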
The MaK domain sequences of the fused TreS-MaKs and of 17 characterized true MaKs (i.e. those without a GH13_16 TreS domain) were aligned, conserved sequence regions (CSRs) were identified, and members with identical CSRs were excluded. Similarly, the simple TreSs were aligned and those with identical CSRs (the seven CSRs typical of family GH13) were excluded. As a result, 604 fused TreS-MaKs and 597 simple TreSs were selected for further study. Detailed analysis revealed that only 467 of the 604 MaK domains from fused TreS-MaKs may represent standard MaKs with conserved catalytic machinery. In contrast, mutations in residues that directly bind maltose or in the catalytic aspartates were found in 79 and 58 MaK domains, respectively. Their proper catalytic function as maltokinases is thus questionable, which opens up speculation about a possible new role. The group of 597 simple TreSs was used to prepare a sequence logo based on the CSRs and served as a control set for finding possible differences between the TreS domains of simple TreS and fused TreS-MaK enzymes. Analysis of the linkers connecting the TreS and MaK domains revealed an unusually high content of aromatic residues, tryptophans and phenylalanines, in comparison with the C-terminal parts of simple TreS enzymes. Since no tertiary structure of a fused TreS-MaK is currently known, the structures modeled by AlphaFold [9] and available in the UniProt database [10] were analyzed, with a special focus on the parts connecting the two domains. The results support the hypothesis that part of the linker actually belongs to the MaK domain. Thus, the MaK domains of fused TreS-MaK enzymes are larger than those of true MaKs.
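Two steps of this analysis, removing members with identical CSRs and measuring the aromatic content of a linker, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, the input layout (an identifier paired with a tuple of CSR strings) and the toy sequences are hypothetical. The default residue set contains only W and F because those are the residues named in the text; tyrosine can be added through the `residues` argument.

```python
from typing import Iterable, List, Sequence, Tuple


def dedupe_by_csr(records: Iterable[Tuple[str, Sequence[str]]]) -> List[str]:
    """Keep the first sequence ID for each distinct combination of CSRs.

    records -- (sequence_id, CSR strings) pairs; this layout is an assumption
               made for the sketch, with CSRs already extracted from an alignment.
    """
    seen = set()
    kept = []
    for seq_id, csrs in records:
        key = tuple(csrs)
        if key not in seen:
            seen.add(key)
            kept.append(seq_id)
    return kept


def aromatic_fraction(segment: str, residues: str = "WF") -> float:
    """Return the fraction of residues in `segment` that belong to `residues`."""
    segment = segment.upper()
    if not segment:
        return 0.0
    wanted = set(residues.upper())
    return sum(1 for aa in segment if aa in wanted) / len(segment)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy = [
        ("seqA", ("DVVLNH", "GFRLDAA")),
        ("seqB", ("DVVLNH", "GFRLDAA")),  # same CSRs as seqA, so it is dropped
        ("seqC", ("DVVINH", "GFRLDAA")),
    ]
    print(dedupe_by_csr(toy))
    print(aromatic_fraction("WFAA"))
```

Comparing `aromatic_fraction` between linker segments and the C-terminal parts of simple TreSs would give the kind of contrast described above, provided the segment boundaries are defined consistently for both groups.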
Since the presented work delivers, for the first time, a detailed bioinformatics analysis of fused TreS-MaK enzymes, it might help in their quick identification, contribute to better characterization of both TreSs and MaKs, and thus aid the study of the role of MaK/MaK-like domains in subfamily GH13_16 enzymes.
This work was financially supported by grant No. 2/0146/21 from the Slovak Scientific Grant Agency VEGA.