A distinct molecular signature on anhydrobiotic cyanobacterial metallothioneins

Anhydrobiosis refers to a state of suspended animation that some organisms enter when exposed to extreme desiccation, which gives them outstanding tolerance to several physical stresses through molecular and cellular adaptations. Metallothioneins (MTs) are short, cysteine-rich, metal-chelating proteins that protect cells in metal ion-rich conditions. Here we aimed to investigate possible molecular signatures in the primary and tertiary structures of anhydrobiotic cyanobacterial MTs. Anhydrobiotic and non-anhydrobiotic cyanobacterial MT amino acid sequences were retrieved from the NCBI database and aligned with the Clustal Omega server. Additionally, the amino acid compositions of these sequences were determined with GeneRunner. Further, we carried out homology modeling via SWISS-MODEL, structural superposition with the UCSF Chimera 1.4 Matchmaker tool, and ligand-binding site prediction via COFACTOR. In silico analyses revealed specific divergences in amino acid positions between MT groups, indicating positive and negative selection, although without affecting the final protein structures. Some of these changes in the polypeptide sequence potentially enhance protein stabilization during desiccation, whereas others possibly act as additional metal ion-coordinating residues. Analyses of the molecular adaptations of anhydrobiotic cyanobacterial MTs help shed light on their molecular functions and biological roles, and may have applications in the development of desiccation- and metal-tolerant organisms.


Introduction
Anhydrobiosis (life without water) is a reversible ametabolic state of suspended animation that some (anhydrobiotic) organisms enter when exposed to extreme desiccation, in which they can withstand osmotic imbalances for decades (Hibshman et al., 2020). In addition, desiccated anhydrobiotic organisms display tolerance to several physical stresses, such as extreme temperatures, high hydrostatic pressures and ionizing radiation (Hibshman et al., 2020). To ensure the maintenance of dry cells under these harsh conditions, anhydrobiosis-associated genes entail protective molecular adaptations that include the synthesis of non-reducing disaccharides, the accumulation of late embryogenesis abundant (LEA) proteins, and the activities of chaperones and repair enzymes (after rehydration), among other elements (Schill et al., 2009). The study of desiccation tolerance is important for different areas, including seed conservation (Voss et al., 2020).
Metallothioneins (MTs) are short-chain (about 50 amino acids) metal-binding proteins with a high content of cysteine residues, which are responsible for strong interactions with metal ions (Capdevila & Atrian, 2011), such as zinc (Zn2+) and copper (Cu). Several studies have revealed the roles of MTs in physiological metal homeostasis, xenobiotic metal detoxification, antiapoptotic activities and protection against reactive oxygen species (ROS) (Capdevila & Atrian, 2011).
SmtA, the MT-encoding gene from the cyanobacterium Synechococcus sp. PCC 7942, was sequenced and characterized in the early 1990s, leading to the discovery of the first family of bacterial MTs (Blindauer, 2011). Although the identification of putative MT homologs is challenging due to the small size and unusual composition of these proteins, cloning technologies have allowed the identification of significantly similar proteins to SmtA (Zúñiga et al., 2019), encouraging studies on molecular evolution of cyanobacterial MTs.
Considering an extreme desiccation scenario, in which water is scarce and salts and metal ions are abundant, we hypothesized that MTs have an important protective role in osmotic balance and metal detoxification in anhydrobiotic cells, since these metal-chelating proteins are induced in anhydrobiotic specimens under drought conditions (Collett et al., 2004; Lopez-Martinez et al., 2009). To investigate possible structural differences in anhydrobiotic MTs, we carried out primary sequence analyses, three-dimensional structure modelling, and prediction of interactions with metal ions. Because MT primary sequences are highly heterogeneous, we focused on cyanobacterial MTs, whose secondary and tertiary structures are conserved (Blindauer, 2011).

Methodology
The present study can be characterized as in silico laboratory research with qualitative and quantitative approaches (Pereira et al., 2018).

Sequence retrieval and analyses
Ten cyanobacterial MT amino acid sequences (5 from anhydrobiotic and 5 from non-anhydrobiotic organisms) were retrieved from the NCBI database: non-anhydrobiotic organisms -Pleurocapsa sp. CCALA 161 (NCBI ID: WP_081593752.1). Evidence for the anhydrobiotic nature of these cyanobacteria is provided in supplemental references (electronic supplementary material). Species for which we found no evidence of anhydrobiosis or desiccation tolerance were considered non-anhydrobiotic. Bioinformatics analyses were carried out according to Bhasin & Raghava (2006).
The Clustal Omega server (Larkin et al., 2007) was used with default settings to perform multiple sequence alignments of the MTs. Sequence alignments were analyzed via JalView (Procter et al., 2021), and the amino acid composition of each sequence was determined via GeneRunner (Spruyt & Buquicchio, 1994).
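The amino acid composition step can be illustrated with a minimal Python sketch. This is not GeneRunner's implementation; the function name `amino_acid_composition` and the toy peptide are assumptions made only for illustration, and the peptide is not a real MT sequence.

```python
from collections import Counter


def amino_acid_composition(sequence):
    """Return the percentage of each residue in a one-letter protein sequence.

    Alignment gap characters ('-') are ignored (an assumption of this sketch).
    """
    seq = sequence.upper().replace("-", "")
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in sorted(counts.items())}


# Hypothetical 10-residue peptide, NOT a real metallothionein sequence.
toy_peptide = "MCKCACHCSG"
composition = amino_acid_composition(toy_peptide)
print(f"Cys: {composition['C']:.1f}%  His: {composition['H']:.1f}%")
```

Comparing the cysteine and histidine percentages between the two groups of sequences is the kind of summary reported in the composition analysis.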

Protein structure analyses
First, we used the MT sequences as queries to search for similar structures deposited in the Protein Data Bank (PDB) using the MMseqs2 (many-against-many sequence searching) method. Then, the PDB structure most similar to the cyanobacterial MTs (both anhydrobiotic and non-anhydrobiotic) was retrieved and used as a template for protein structure homology modelling via SWISS-MODEL (Waterhouse et al., 2018).
To assess the structural quality of the generated tertiary structure models, we used the open-source validation tools ERRAT, ProSA-web and PROCHECK. ERRAT calculates an overall quality factor by comparing the nonbonded atomic interactions of the predicted structure model with those of high-resolution structures. ProSA-web recognizes errors in experimental and theoretical protein structure models, providing a Z-score that indicates the overall model quality. PROCHECK compares the geometry of the residues in a given protein structure model with stereochemical parameters of high-resolution structures, generating a Ramachandran plot that shows the number of residues in favored, allowed, and disfavored regions. Each modelled structure was submitted to side-chain refinement and overall structure relaxation via the GalaxyRefine web server, and the structural quality of the refined models was then assessed with the same programs.
To investigate any functional relationship between our predicted structures and the experimentally determined PDB structure (template), we carried out protein superposition with the UCSF Chimera 1.4 Matchmaker tool (Pettersen et al., 2004), which provides an RMSD (root-mean-square deviation) calculation for each structural superposition. The RMSD cutoff for considering two protein structures similar was set to 2.5 Å (angstrom), according to Tsai et al. (2004).
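The RMSD criterion can be sketched in a few lines of Python. This is only an illustration of the calculation and the 2.5 Å cutoff, not the Matchmaker algorithm: it assumes the two coordinate sets are already superposed and paired atom by atom, that the comparison is inclusive (RMSD ≤ cutoff), and the coordinates below are invented.

```python
import math

RMSD_CUTOFF = 2.5  # angstrom, the cutoff stated in the text


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) tuples.

    Assumes the structures are already superposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Invented coordinates: every model atom is displaced by 1 angstrom along y.
template_atoms = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_atoms = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

value = rmsd(template_atoms, model_atoms)
is_similar = value <= RMSD_CUTOFF  # inclusive comparison is an assumption
print(f"RMSD = {value:.2f} A, similar: {is_similar}")
```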
Finally, for each modelled structure, we carried out the ligand-binding site prediction via COFACTOR (Zhang et al., 2017) by searching a query protein against three independent sequence (UniProt-GOA), structure (BioLiP), and PPI-function (STRING) libraries. This open-source software provides a confidence score for the predicted binding site, Cscore LB, whose values range from 0 to 1. To select the predicted binding sites, a Cscore LB cutoff was set at or above 0.7, which means that more than 75% of the binding sites are predicted correctly, according to Zhang et al. (2017).
Research, Society and Development, v. 10, n. 2, e50610212714, 2021 (CC BY 4.0) | ISSN 2525-3409 | DOI: http://dx.doi.org/10.33448/rsd-v10i2.12714
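The score-based selection of predicted binding sites described above amounts to a simple threshold filter. The sketch below assumes the predictions have already been parsed into dictionaries; the field names (`ligand`, `cscore_lb`) and all score values are hypothetical and do not reproduce COFACTOR's output format or any result of this study.

```python
CSCORE_LB_CUTOFF = 0.7  # cutoff stated in the text


def select_binding_sites(predictions, cutoff=CSCORE_LB_CUTOFF):
    """Keep predictions whose confidence score is at or above the cutoff."""
    return [p for p in predictions if p["cscore_lb"] >= cutoff]


# Invented example records; the scores are placeholders, not study results.
toy_predictions = [
    {"ligand": "ZN", "cscore_lb": 0.85},
    {"ligand": "ZN", "cscore_lb": 0.70},
    {"ligand": "CA", "cscore_lb": 0.31},
]

selected = select_binding_sites(toy_predictions)
print([p["ligand"] for p in selected])
```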

Primary sequence analyses
Multiple sequence alignments of non-anhydrobiotic and anhydrobiotic cyanobacterial MTs are shown in Figures 1A and 1B. As expected, the majority of cysteine residues were fully conserved in both non-anhydrobiotic and anhydrobiotic cyanobacterial MTs, including the metal-coordinating cysteines (Cys-9, Cys-14, Cys-32, and Cys-36) described in the SmtA primary sequence (Blindauer, 2011). However, other metal-coordinating residues that carry an aromatic ring have been conserved in a group-specific manner; for example, a histidine residue (His-49) is conserved in anhydrobiotic MTs. This residue is described as highly variable (sometimes absent) among bacteria (cyanobacteria, alphaproteobacteria, gammaproteobacteria, and firmicutes), but it is still important for the coordination of zinc (Zn2+) ions through its imidazole nitrogen (Blindauer, 2011).
Furthermore, conserved non-coordinating residues were also observed in one specific group or in both. For example, Tyr-31 and Ala-37, which stabilize the structural fold via a CH-π interaction (Blindauer, 2011), are fully conserved in both groups. On the other hand, serine residues that are specifically conserved in anhydrobiotic cyanobacterial MTs (Ser-19 and Ser-33) can provide enhanced molecular stability (Emoto et al., 1996). Such a stabilizing effect was described in the Pseudomonas fluorescens MT (Cd4PflQ2 MT), in which Ser29 has an essential role as an N-cap for a short alpha-helix, forming two hydrogen bonds: one from its carbonyl oxygen to the amide proton of Ala33 and a second from the side-chain oxygen to the amide proton of Ala31 (Habjanič et al., 2020). However, most of these conserved non-coordinating residues (V4, T5, M7, K8, A10, L15, V18, E21, A23, K26, D27, K29, Y32, G39, G46, and G51) do not have well-characterized functions in MTs. Therefore, we believe that further investigation of the strongly conserved residues identified here may help deepen our understanding of MT activities, both in anhydrobiotic and in desiccation-sensitive organisms.
Consistent with the multiple sequence alignment results, the amino acid composition of cyanobacterial MTs did not reveal significant differences in the percentages of metal-coordinating residues (Figure S1 in electronic supplementary material). A slightly higher percentage of cysteine was observed in non-anhydrobiotic cyanobacterial MTs, illustrated, for example, by the amino acid substitution S33C, in which cysteine appears partially conserved (80%) in this group. This common residue substitution replaces a single oxygen atom with a sulfur atom, which positively influences the coordination of cadmium (Cd) through the involvement of Cys-33 in metal-binding reactions (Habjanič et al., 2020), without affecting the MT structure. However, as mentioned by Emoto et al. (1996), this Ser-to-Cys substitution results in an overall destabilization. Thus, we hypothesize that the conservation of Ser-33 (Figure 1D) is important in anhydrobiotic scenarios, where an organism faces homeostatic imbalances and other threats to cellular dynamics. Therefore, despite the strong similarities in cysteine residues (i.e., conservation and percentage) between the groups, we have revealed that anhydrobiotic cyanobacteria display some signatures in their MT primary structures that may have unknown functions in the molecular dynamics of these proteins. Additionally, anhydrobiotic cyanobacterial MTs have more fully conserved residues than non-anhydrobiotic cyanobacterial MTs (52% and 35%, respectively), a relative increase of 48% (residue conservation index described in electronic supplementary material), which may suggest a better arrangement of MT amino acids in anhydrobiotic cyanobacteria, ensuring more precise functioning in metal-rich environments.
Sequences of non-anhydrobiotic and anhydrobiotic cyanobacteria are represented in Figure 1A and Figure 1B, respectively.
By selecting the fully identical residues (highlighted in different colors) between sequences, we revealed those amino acids exclusively conserved in non-anhydrobiotic cyanobacteria (pointed out by red circles), in anhydrobiotic cyanobacteria (pointed out by blue triangles), and the one that has been replaced by a single specific residue (pointed out by a yellow star). Black-edged rectangles highlight the cysteine residues fully conserved in both groups. (Figure 1C) Superposition of the SmtA NMR structure and the predicted MT structure of C. thermalis, displaying all the Zn-coordinating residues. (Figure 1D) Highlight of an amino acid (Ser-33) strongly conserved in anhydrobiotic cyanobacterial MTs and sparse in non-anhydrobiotic cyanobacterial MTs.
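The comparison of fully conserved residues between the two groups (52% versus 35%) can be made concrete with a short Python sketch. The paper's residue conservation index is defined in its supplementary material, so the definition used here (fraction of alignment columns in which every sequence carries the same residue and no gap) is an assumption, and the toy alignment is invented.

```python
def fully_conserved_fraction(aligned_sequences):
    """Fraction of alignment columns where all sequences share one residue.

    Columns containing a gap ('-') are never counted as conserved
    (an assumption of this sketch).
    """
    lengths = {len(s) for s in aligned_sequences}
    if len(lengths) != 1 or 0 in lengths:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    n_columns = lengths.pop()
    conserved = sum(
        1 for column in zip(*aligned_sequences)
        if len(set(column)) == 1 and column[0] != "-"
    )
    return conserved / n_columns


def relative_increase(value, reference):
    """Relative increase of value over reference, in percent."""
    return 100.0 * (value - reference) / reference


# Invented toy alignment, not taken from Figure 1.
toy_alignment = ["MCKC-A", "MCRCGA", "MCKCGA"]
print(f"conserved columns: {fully_conserved_fraction(toy_alignment):.2f}")

# Percentages reported in the text: 52% (anhydrobiotic) vs 35% (non-anhydrobiotic).
print(f"relative increase: {relative_increase(52, 35):.1f}%")
```

With the reported percentages, the relative increase evaluates to roughly 48.6%, in line with the 48% stated in the text.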
Furthermore, the ligand-binding site predictions for these modelled proteins clearly showed that, like SmtA (1JJD), they all bind preferentially to Zn2+ ions, specifically at residues Cys-9, Cys-14, Cys-32, and Cys-36 (supplementary table S3). The COFACTOR analysis also showed the feasibility of interaction between Zn2+ and residues Cys-11, His-40, and Cys-53, which has also been described experimentally (Blindauer, 2011). However, other experimentally determined Zn-coordinating residues were not predicted, such as and Cys52. Together, these results from structural superposition and ligand-binding site predictions provide additional evidence that the proteins analyzed in the present study are, in fact, cyanobacterial MTs. This is important given that most of these proteins have not been validated experimentally and have only an in silico characterization (i.e., genomic classification and annotation). Moreover, several peptide sequences deposited in common databases are misannotated as putative MTs, although they lack the striking characteristics of a "bona fide MT" (i.e., high cysteine content, relatively short chains) (Blindauer, 2011).
Although the microorganisms studied here are evolutionarily close, we report divergences of residues in their MTs that do not directly influence the coordination of metal ions or the tertiary structures, but possibly affect their molecular dynamics.
From another perspective, a transcriptome study of the anhydrobiotic nematode Caenorhabditis elegans (Hashimshony et al., 2015) revealed a 60-fold up-regulation of mtl-1 (metallothionein-like 1) expression strictly in an alternative developmental stage designated dauer, the only anhydrobiotic developmental stage of C. elegans (Erkut et al., 2011). Thus, this increased level of MT in C. elegans dauer larvae may contribute to cell maintenance under stressful conditions associated with anhydrobiosis.

Protein structure analyses
After searching the PDB for well-resolved structures homologous to the MT primary sequences, we selected the NMR structure of the cyanobacterial MT SmtA (PDB ID: 1JJD) (Chatterjee et al., 2020) as the most suitable template for homology modelling, because sequence alignments between our MT query sequences and SmtA resulted in appropriate identity, similarity, and coverage percentages for this approach (supplementary table S1). We then modelled the tertiary structures of ten cyanobacterial MTs, which provided models highly similar to 1JJD (tables S1 and S2 in electronic supplementary material). As illustrated in Figure 1C, the predicted MT structures showed similar Zn coordination, using the same coordinating residues as the NMR structure of SmtA (i.e., . However, the non-anhydrobiotic Pleurocapsa sp. and H. patteloides MTs did not display metal coordination by His-49, because they lack this residue (see Figure 1A).
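The identity and coverage percentages used to judge template suitability can be sketched as follows. Several conventions exist for these measures; this sketch assumes identity is computed over aligned (gap-free) column pairs and coverage is the share of query residues aligned to a template residue. The function name and the two toy aligned strings are hypothetical and are not the SmtA alignment.

```python
def identity_and_coverage(aligned_query, aligned_template):
    """Return (percent identity, percent query coverage) for a pairwise alignment.

    Identity: identical pairs / aligned pairs without gaps (assumed convention).
    Coverage: aligned pairs without gaps / query residues (assumed convention).
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (q, t) for q, t in zip(aligned_query, aligned_template)
        if q != "-" and t != "-"
    ]
    query_length = sum(1 for q in aligned_query if q != "-")
    if not pairs or query_length == 0:
        raise ValueError("alignment has no aligned residue pairs")
    identical = sum(1 for q, t in pairs if q == t)
    identity = 100.0 * identical / len(pairs)
    coverage = 100.0 * len(pairs) / query_length
    return identity, coverage


# Invented toy alignment, not the real query/SmtA alignment.
toy_query = "MCKCA-CH"
toy_template = "MCRCAGC-"
identity, coverage = identity_and_coverage(toy_query, toy_template)
print(f"identity = {identity:.1f}%, coverage = {coverage:.1f}%")
```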

Conclusion
Therefore, by integrating our data with MT expression profiles from other studies (Collett et al., 2004; Lopez-Martinez et al., 2009; Hashimshony et al., 2015), we propose that the molecular advantages associated with MT performance in anhydrobiotic organisms may have arisen from (i) substitutions in the polypeptide sequence that do not affect the number of coordinating cysteine residues or the tertiary structure and (ii) changes in the control of gene expression, increasing MT
Further studies on the survival of anhydrobiotic organisms carrying different allelic versions of the metallothionein-encoding gene, generated through gene editing technologies such as CRISPR, or its complete knockout, may be conducted to gather additional evidence for the protective role of these metal-chelating proteins in desiccating conditions.