To determine whether changes in euFUL sequences or selection might shed light on this change in function, we analyzed euFUL gene evolution in Solanaceae. We performed a maximum likelihood phylogenetic analysis on a data set of 106 Solanaceae members of the euFUL gene lineage, which we obtained through amplification and sequencing, generating transcriptome sequence data, or mining databases. As an outgroup, we used 10 euFUL genes from Convolvulaceae, the sister family to Solanaceae. The resulting tree shows the two major clades of core-eudicot euFUL genes, the euFULI and euFULII lineages. Within each of these clades there is evidence of a Solanaceae-specific duplication, resulting in two subclades in each lineage. Within each subclade, the order of branches correlates well with the topology of the Solanaceae phylogeny; discrepancies at the genus level are likely due to the short length of some sequences and sequence divergence in some taxa. Each of the subclades includes orthologs from both fleshy- and dry-fruited species, indicating that the subclade duplications preceded the origin of fleshy fruit. Although duplications in these genes are common, we did not find significant evidence of taxon-specific duplications. We did, however, find two genes that did not fall into a specific subclade. A third Streptosolen gene grouped sister to the rest of the euFULI clade, potentially the result of a taxon-specific duplication followed by sequence divergence. In addition, a Schizanthus gene grouped sister to the euFULII clade. This may also be a divergent genus-specific paralog, but since Schizanthus is one of the earliest diverging genera, it is also possible that this gene is a remaining paralog from the reported whole genome duplication/triplication that occurred early in Solanaceae diversification.
We also found potential evidence of loss: not every Solanaceae species we studied had a copy of each euFUL gene. We did not, for example, find FUL2 genes in Iochroma, Fabiana, Solandra, Juanulloa, Schizanthus, or Goetzia, even though these all had genes in the FUL1 clade. However, although this may represent paralog loss, it is possible that we did not recover all gene copies due to PCR primer mismatches, low expression levels, or the absence of transcript in the sampled tissue. In addition to the major shift to fleshy fruit in the Solanoideae subfamily, fleshy fruits have independently evolved in Cestrum and Duboisia, and there has also been a reversal to a dry fruit in Datura. Our analysis does not include genes from Duboisia, but the euFUL genes from Cestrum and Datura grouped in positions in the tree that were expected based on their phylogenetic position, and did not show any notable differences in sequence from the euFUL genes of their close relatives.
We compared dN/dS ratios between and among Solanaceae euFULI and euFULII lineages, as well as between sequences before and after the transition to fleshy fruit, to investigate whether any changes in selection might be correlated with sequence diversification. All ω values from our analyses are closer to 0 than to 1, which indicates that all euFUL gene clades are under strong purifying selection. Studies suggest that this is the norm for most protein-coding genes, and that under such stringent evolutionary constraints, slight differences in evolutionary rates may result in functional diversification. Our data show a weakening of purifying selection in FUL1 genes relative to FUL2 genes and in MBP10 genes relative to MBP20 genes. Immediately after the euFULI duplication, the FUL1 and FUL2 lineage genes would have been fully redundant, which might have allowed the reduction in purifying selection on the FUL1 genes, resulting in potential functional divergence.
Similarly, the duplication that resulted in the two euFULII gene clades would have resulted in redundancy in the MBP10 and MBP20 lineages, possibly allowing the more rapid diversification of MBP10 genes. Although studies indicate that the euFULI genes of tomato have novel functions compared to those in dry fruit, it remains unclear whether the new functions are the result of changes in coding sequences, regulatory regions, or downstream gene targets. Our analysis shows that euFUL genes in both dry- and fleshy-fruited species are evolving at similar rates. This suggests conservation of the coding sequences in both fleshy- and dry-fruited species despite their central roles in the development of these distinct fruit morphologies. Sixty-four of the sequences in our analysis were from fleshy-fruited species, whereas only 42 were from dry-fruited species. Although we had broad representation across the dry grade, it is possible that additional representation from dry-fruited species would reveal more evolutionary patterns.
An analysis of selection across an entire sequence may indicate different types of selection for the whole gene, but this overlooks the fact that key residues may be undergoing rapid evolution that may result in functional changes. Other empirical studies have further described functional changes due to a change in a single amino acid residue, specifically associated with changes in polarity or conformation. Studies in A. thaliana show that a single amino acid mutation in GLABRA1 results in the inhibition of trichome formation, and that a change of a single residue is sufficient to convert the function of TERMINAL FLOWER 1, which inhibits flower formation, to that of the closely related FLOWERING LOCUS T, which promotes flowering. Three-dimensional modeling has also shown that a single amino acid change in a highly conserved domain may lead to changes in protein–protein interactions.
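Comparisons of ω between clades, such as FUL1 versus FUL2 above, are commonly evaluated with a likelihood-ratio test between nested codon models. The following Python sketch is an illustration under stated assumptions, not the pipeline used in this study: the function names and the log-likelihood values are hypothetical, and the test is restricted to the case where the two models differ by a single free parameter (one degree of freedom).

```python
import math


def describe_omega(omega: float) -> str:
    """Conventional reading of a dN/dS (omega) estimate."""
    if omega < 1.0:
        return "purifying selection"
    if omega > 1.0:
        return "positive selection"
    return "neutral evolution"


def lrt_pvalue_df1(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood-ratio test between two nested models
    that differ by exactly one free parameter (chi-square, 1 d.f.)."""
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0.0:
        # The richer model fits no better than the null model.
        return 1.0
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    # Hypothetical log-likelihoods: one omega for the whole tree (null)
    # versus a separate omega for one clade (alternative).
    lnl_one_ratio = -1000.0
    lnl_two_ratio = -998.0
    p = lrt_pvalue_df1(lnl_one_ratio, lnl_two_ratio)
    print(f"LRT p-value: {p:.4f}")
    print(describe_omega(0.2))
```

A small p-value here would only indicate that allowing a separate ω for the clade improves the fit; both ω estimates can still be well below 1, which is the situation described above for FUL1 and MBP10.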
We searched for individual sites in the predicted amino acid sequences that showed evidence of positive selection within the gene groups that, although under purifying selection, had statistically significantly accelerated evolutionary rates. Our aim was to determine whether any amino acid changes at these sites had the potential to change protein function. Our findings show that more residues are rapidly changing in the K domain than in the M and I domains. The K domain is predicted to have an α-helix structure that facilitates protein–protein interactions. The α-helix structure depends on conserved hydrophobic residues spaced through the domain. Therefore, changes to protein residues that alter charge and/or conformation in this region can lead to changes in such interactions. Most of the rapidly evolving sites did not show an amino acid change specifically associated with the shift to fleshy fruit, but rather showed changes and reversals over the course of gene evolution. Interestingly, in the FUL1 proteins, we found one site in the K domain, corresponding to the 153rd residue in the tomato protein, at which 11 out of 15 sequences from dry-fruited species have a negatively charged glutamate residue. In comparison, all sequences in the fleshy clade have a non-polar residue at this site: valine or methionine. However, since the remaining four FUL1 sequences from dry-fruited species have non-polar glutamine or valine at this site, the change from charged to non-polar is not associated with the shift to fleshy fruit. In addition, a PROVEAN analysis predicted the changes at this site to be neutral with regard to function. Two other sites in the FUL1 K domain show changes that are predicted to have functionally deleterious consequences according to our PROVEAN analysis. These include a charged histidine to a non-polar glutamine/asparagine transition at the 95th residue and a charged lysine to non-polar glutamine/threonine transition at the 157th residue.
Polar residues are important for protein–protein interactions of the K domain α-helix, and changes might disrupt interactions with other proteins. However, since these changes are not correlated with the fruit type, it seems unlikely that any alteration to protein function affects fruit morphology. It is also plausible that any negative effect at these sites is masked by the FUL2 paralog, which is likely to be functionally redundant. This is consistent with FUL1 evolving relatively faster, thus enabling divergence compared to FUL2, which appears to be more highly functionally conserved based on stricter sequence conservation. None of the sites undergoing positive change in the K domain of MBP10 showed a change in charge, suggesting that these changes are not likely to affect protein function. We also observed residues in the M domain that are under diversifying selection in both the FUL1 and MBP10 clades. These residues are located not in the α-helix region that directly binds to DNA, but in the β-sheet region of the MADS domain. β-sheets are important for protein arrangement in three-dimensional space. Therefore, any changes in this region might change protein conformation, influencing DNA binding of the α-helix as well as the ability of the euFUL proteins to form higher-order complexes. However, these shifts were reversible, with no phylogenetic pattern or change in charge, and there was no correlation with the fruit type. Therefore, it is unlikely that these shifts have significant functional impact. A previous report that investigated the evolution of MADS-box genes in A. thaliana also found rapidly evolving sites in the M and K domains of Type II MADS-box proteins, which might have been involved in the functional diversification of this group, but did not report changes in the I domain.
Residues in this domain that are directly involved in forming an α-helix structure are expected to be highly conserved, whereas the remaining residues may not be under such constraints. We found residues in the conserved region of the I domain that are undergoing diversifying selection in both FUL1 and MBP10 clades. Of these, one site in FUL1 and three sites in MBP10 had undergone changes in charge, but none was predicted to negatively affect function. In addition, as with the sites in the M and K domains, none of these was correlated with the Solanaceae phylogeny or changes in fruit morphology. It has been reported that higher rates of substitution in lineages that show weakened purifying selection or even diversifying selection may be occurring at residues of minimal functional importance. This might explain the apparent ease of reversibility and lack of phylogenetic signal among the rapidly changing sites we observed.
independent of the reported whole genome events, occurring prior to the diversification of the Brunfelsia clade but after the event that produced the FUL1 and FUL2 clades. The expected topology for the euFULII clade, based on a duplication prior to the divergence of the Brunfelsia clade, would be a paraphyletic grade of pre-duplication euFULII genes, from species that diversified prior to Brunfelsia, and nested MBP10 and MBP20 clades that would include post-duplication genes from all species that diversified subsequent to the duplication. However, in our tree, the pre-duplication genes do not form such a basal grade. Rather, they form a clade with the post-duplication MBP20 genes. The results of our PAML analyses indicate that the MBP20-clade genes show less sequence divergence than MBP10 genes; this higher degree of similarity among pre-duplication sequences and post-duplication MBP20 genes may underlie their grouping into one clade.
Our results indicate that the euFULII duplication occurred prior to the origin of the clade containing Brunfelsia. We would therefore expect to find both an MBP10 and an MBP20 in all species of that clade. However, we did not find an MBP10 ortholog in members of this clade other than Brunfelsia. MBP10 appears to have been lost from the genome of Petunia, based on analyses of multiple fully sequenced genomes, and potentially from Plowmania and Fabiana. We were able to recover MBP10 orthologs from Nicotiana and most other later-diverging genera. However, our analysis includes fewer species from the dry grade of the Solanaceae phylogeny than from the fleshy-fruited Solanoideae clade, and even fewer species that diverged prior to Brunfelsia. In the MBP10 clade in particular, our analysis includes 13 orthologs from species in the fleshy-fruited clade but just four from dry-fruited species, and it includes sequence data from only four genera that diverged prior to the origin of the Brunfelsia clade. Thus, there may be genera that originated prior to Brunfelsia and contain MBP10 that our sampling did not include. Floral and fruit transcriptomes, which provided MBP10 orthologs from later-diverging species, yielded no MBP10 sequences from Cestrum and Schizanthus; nonetheless, whole genome sequences of early-diverging species are needed to determine the timing of the MBP10/MBP20 duplication.