Breeders are increasingly focused on meeting the needs of consumers, but genetic improvement of flavor is challenging because of the chemical and genetic complexity of the flavor phenotype. These challenges are accentuated in heterozygous, polyploid species. For example, fewer significant single nucleotide polymorphisms (SNPs) were detected in a genome-wide association study (GWAS) of tetraploid blueberry when diploid models were applied; in octoploid strawberry, structural variation underlying a locus affecting volatile production was difficult to resolve using a single reference genome. Recent advances have been made via chemical–sensory studies to identify specific volatiles associated with consumer preference. Although important volatile compounds in fruit crops are being identified, too little is known about the metabolomic and genetic diversity within species and breeding populations. Some volatiles have been lost during domestication and breeding as a combined result of negative selection and linkage drag in tomato and watermelon. Likewise, the gain and loss of terpene compounds during strawberry domestication and their genetic causes have been investigated. Recent advances in sequencing technology and analytical approaches have opened new opportunities to understand the chemistry and genetics of fruit flavor. Meanwhile, genome-wide expression quantitative trait loci (eQTL) studies have the capability to bridge the gaps between GWAS signals and their underlying causative genes. Integration of GWAS and eQTL studies has led to the discovery of a master metabolite regulator in tomato and a flesh-color-determining gene in melon.
Long-read sequencing now allows assembly of genomes with high contiguity, and when coupled with parental short-read data, the two haplotypes of a heterozygous individual can be fully resolved. Phased assemblies have improved variant discovery, especially for large structural variants (SVs). The extent, diversity and impact of SVs are increasingly being studied in horticultural crops, and SVs have been shown to alter fruit flavor, fruit shape and sex determination. Great opportunity exists to coherently integrate these multi-omics resources for the discovery of flavor genes. Garden strawberry is an allo-octoploid species with highly palatable non-climacteric fruit. It has increasingly been used as a model for genomics and flavor research in Rosaceae fruit crops as a result of its short generation time, wide cultivation and high value. Through exploration of spatiotemporal changes in gene expression and homolog searches, several flavor genes have been cloned and validated, including an alcohol dehydrogenase and several alcohol acyltransferases for esters, a nerolidol synthase 1 for terpenes and a quinone oxidoreductase for furaneol. Recently, QTL studies and transcriptome data analyses for strawberry volatiles using biparental crosses have detected QTL and causative genes for mesifurane and gamma-decalactone. Nevertheless, low mapping resolution and a lack of subgenome-specific markers have hampered further characterization of causal genes underlying other QTL. This problem was recently addressed by the development of the 50K Fana SNP array, which uses probe DNA sequences physically anchored to the octoploid ‘Camarosa’ genome. Even so, high heterozygosity combined with an allopolyploid genome still presents difficulties for resolving causative genes and their haplotypes.
To further the goal of discovering causative genes affecting flavor in strawberry, association studies with larger sample sizes and additional genetic resources such as eQTL and additional genomes are required. Furthermore, these resources must span the breadth of natural variation in breeding germplasm.
Here we present multi-omics resources consisting of an eQTL study representing the genetic diversity of strawberry breeding programs in the US, phased genome assemblies of a highly flavored University of Florida breeding selection, a structural variation map in octoploid strawberry and a volatile GWAS of 305 individuals. These are combined to leverage the extensive metabolomic, genomic and regulatory complexity in strawberry for the discovery of natural variation in genes affecting flavor. Ultimately, the functional alleles identified will be selected in breeding to achieve superior flavor.
The eQTL population consisted of 196 genotypes, including 133 newly sequenced accessions. The University of Florida genotypes were grown at GCREC and collected in the spring of 2020 and 2021. The University of California-Davis collection of diverse selections from multiple breeding programs was grown at either Santa Maria, CA or Oxnard, CA, for day-neutral and short-day accessions, respectively, and collected in the spring of 2021. Four UC genotypes were collected at both sites to ensure sequencing and SNP quality. Total RNA was extracted from a bulk of three fully ripe fruits using a Spectrum™ Plant Total RNA Kit, after flash freezing in liquid nitrogen. Illumina 150-bp paired-end sequencing was performed on the Illumina NovaSeq platform by Novogene Co. On average, 6.9 Gb of sequence data were obtained for each sample. Raw RNA-Seq data of 63 samples from previously published studies were retrieved from the NCBI SRA database. To quantify gene expression, short reads were trimmed to remove adapter sequences and low-quality reads with TRIMMOMATIC v.0.39 and aligned against the reference genome using STAR v.2.7.6a in the two-pass mode. Only uniquely aligned reads were counted by HTSEQ v.0.11.2 in the union mode with the ‘--nonunique none’ flag, supplied with the latest Fragaria_ananassa_v1.0.a2 annotation. All count files were compiled in R and normalized with the DESEQ package.
To generate the marker dataset for eQTL mapping, SNPs and InDels were called using the mpileup and call commands. Markers were further hard-filtered using BCFTOOLS in three stages: (1) individual calls with a sequencing depth lower than three were set to missing using the +setGT plugin; (2) marker sites with quality < 30, missing rate > 0.3, heterozygous call rate > 0.98, minor allele frequency < 0.05, or number of alternative alleles > 1 were purged; and (3) the filtered markers were imported and analyzed in R, and only markers showing more than three matched calls in four duplicated sample pairs were retained.
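The site-level filters above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study, which relied on BCFTOOLS and R; it only restates the thresholds from the text. The `Site` layout, the diploid-style dosage coding (0, 1, 2), the choice to compute the heterozygous rate and allele frequency over non-missing calls, and the rule that missing calls never count as matches between duplicates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Thresholds restated from the text; everything else is a hypothetical layout.
MIN_DEPTH = 3
MIN_QUAL = 30.0
MAX_MISSING = 0.3
MAX_HET = 0.98
MIN_MAF = 0.05


@dataclass
class Site:
    qual: float                      # site quality score
    n_alt_alleles: int               # number of alternative alleles
    genotypes: List[Optional[int]]   # alt-allele dosage per sample: 0, 1, 2 or None
    depths: List[int]                # read depth per sample


def mask_low_depth(site: Site) -> List[Optional[int]]:
    """Stage 1: set calls with depth below MIN_DEPTH to missing (None)."""
    return [g if d >= MIN_DEPTH else None
            for g, d in zip(site.genotypes, site.depths)]


def passes_site_filters(site: Site) -> bool:
    """Stage 2: site-level hard filters, applied after low-depth masking."""
    if site.qual < MIN_QUAL or site.n_alt_alleles > 1:
        return False
    calls = mask_low_depth(site)
    called = [g for g in calls if g is not None]
    if not called:
        return False
    missing_rate = 1 - len(called) / len(calls)
    het_rate = sum(g == 1 for g in called) / len(called)
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return missing_rate <= MAX_MISSING and het_rate <= MAX_HET and maf >= MIN_MAF


def concordant_in_duplicates(calls: List[Optional[int]],
                             pairs: List[Tuple[int, int]]) -> bool:
    """Stage 3: keep a marker only if more than three duplicate pairs agree."""
    matches = sum(calls[a] is not None and calls[a] == calls[b] for a, b in pairs)
    return matches > 3
```

With four duplicated pairs, the stage 3 rule of more than three matched calls amounts to requiring agreement in all four pairs.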
A total of 491 896 markers passed the three stages of filtering. The missing calls were imputed, and all calls were phased using BEAGLE v.5.2 with the default settings. The eQTL mapping was performed for 62 181 fruit-expressed genes using the filtered markers. Linear mixed models implemented in GEMMA were used for association analysis. A relatedness matrix computed in GEMMA was supplied to account for relationships within populations, and the top five principal components, which together explained 25.0% of the variance, were included as covariates to reduce the effects of population stratification and thereby isolate the genetic variance underlying the target traits. The Bonferroni-corrected 5% significance threshold was used, determined by the number of LD-pruned markers. The approach to define an eQTL was similar to that used in previous studies. Briefly, we first clustered all significant markers with distance < 100 kb and purged clusters with fewer than three markers. The lead marker with the lowest P-value was used to identify the eQTL, and the boundaries of the eQTL were defined as the furthest flanking significant markers. Clusters in LD were merged and boundaries were updated. The longest distance between cis-eQTL boundaries and eGene boundaries was limited to 500 kb. Trans-eQTL hotspots were searched using the density function in R.
In this study we leveraged eQTL, GWAS and haplotype-resolved genome assemblies of a heterozygous octoploid to identify allelic variation in flavor genes and their regulatory elements. Fine-tuning of metabolomic traits such as amylose content in rice and sugar content in wild strawberry has recently been made possible by CRISPR-Cas9 gene-editing technology. Similar approaches can be taken in cultivated strawberry for flavor improvement, but not before the biosynthetic genes responsible for metabolite production and their regulatory elements are identified.
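For readers who wish to reproduce the eQTL definition that underlies these results, the clustering rule from the mapping procedure (significant markers less than 100 kb apart form a cluster, clusters with fewer than three markers are purged, the lowest-P marker is the lead, and the outermost significant markers are the boundaries) can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: the `Hit` record and the function names are hypothetical, chaining on the gap between adjacent sorted markers is an assumed reading of "distance < 100 kb", and the subsequent merging of clusters in LD is omitted.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    chrom: str      # chromosome of the significant marker
    pos: int        # position in bp
    pvalue: float   # association P-value


EQTL = Tuple[str, int, int, Hit]  # (chrom, start, end, lead marker)


def _flush(cluster: List[Hit], out: List[EQTL], min_markers: int) -> None:
    """Store the cluster as an eQTL if it has enough markers."""
    if len(cluster) >= min_markers:
        lead = min(cluster, key=lambda h: h.pvalue)
        out.append((cluster[0].chrom, cluster[0].pos, cluster[-1].pos, lead))


def define_eqtl(hits: List[Hit], max_gap: int = 100_000,
                min_markers: int = 3) -> List[EQTL]:
    """Cluster significant markers and report one eQTL per retained cluster."""
    out: List[EQTL] = []
    cluster: List[Hit] = []
    for hit in sorted(hits, key=lambda h: (h.chrom, h.pos)):
        # Start a new cluster on a chromosome change or a gap of max_gap or more.
        if cluster and (hit.chrom != cluster[-1].chrom
                        or hit.pos - cluster[-1].pos >= max_gap):
            _flush(cluster, out, min_markers)
            cluster = []
        cluster.append(hit)
    _flush(cluster, out, min_markers)
    return out
```

Each returned tuple gives the chromosome, the two boundary positions and the lead marker; classifying an eQTL as cis or trans would additionally require the eGene coordinates and the 500 kb limit described in the methods.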
Our pipeline has proven to be effective in the identification of novel causal mutations for flavor genes responsible for natural variation in volatile content, and it can be further applied to various metabolomic and morphological aspects of strawberry fruit such as anthocyanin biosynthesis, sugar content and fruit firmness. These findings also will help breeders to select for genomic variants underlying volatiles important to flavor. New markers can be designed from regulatory regions of key aroma volatiles, including multiple medium-chain volatiles shown to improve strawberry flavor and consumer liking, methyl thioacetate contributing to overripe flavor and methyl anthranilate imparting grape flavor. In the present study, a new functional HRM marker for mesifurane was developed and tested in multiple populations. These favorable alleles for volatiles can be pyramided to improve overall fruit flavor via marker-assisted selection. Strawberry also shares common volatiles with a variety of fruit crops. Specific esters are shared with apple, certain lactones are shared with peach and various terpenes are shared with citrus. Syntenic regions and orthologous genes could be exploited for flavor improvement in those species. Additional insights were gained into the strawberry gene regulatory landscape, SV diversity, the complex interplay among cis- and trans-regulatory elements, and subgenome dominance. Previously, Hardigan et al. and Pincot et al. showed that large genetic diversity exists in breeding populations of Fragaria × ananassa, challenging previous assumptions that cultivated strawberry lacked nucleotide variation owing to the nature of its interspecific origin and short history of domestication.
Our work corroborated their findings and showed that even highly domesticated populations harbor substantial expression regulatory elements and structural variants. Over half of the expressed genes in fruit harbored at least one eQTL, and 22 731 eGenes had impactful cis-eQTL. The distribution of trans-eQTL is not random, but rather is concentrated at a few hotspots controlled by putative master regulators. The aggregation of trans-eQTL also was observed in plant species such as Lactuca sativa and Zea mays. Furthermore, we observed a substantial number of trans-eQTL among homoeologous chromosomes, similar to observations in other allopolyploid plant species. In cotton, physical interactions among chromatins from different subgenomes have been identified via Hi-C sequencing, supporting a potential regulatory mechanism among homoeologous chromosomes. However, owing to the high similarity among four subgenomes and limited length of Illumina reads, false alignment to incorrect homoeologous chromosomes could arise, leading to ‘ghost’ trans-eQTL signals. Future studies are needed to scrutinize the homoeologous trans-eQTL and investigate the mechanism behind this genome-wide phenomenon. Higher numbers of trans-eQTL in the Fragaria vesca-like subgenome are consistent with its dominance in octoploid strawberry. By contrast, the highly mixed Fragaria viridis- and Fragaria nipponica-like subgenomes contained much smaller numbers of trans-eQTL. The characterization of naturally-occurring allelic variants underlying volatile abundance has direct breeding applications. First, this will facilitate the selection of desirable alleles via DNA markers. Second, understanding the causal mutations in alleles can guide precision breeding approaches such as gene editing to modify the alleles themselves and/or their level of expression. From a broader perspective, multi-omics resources such as this one will have value for breeding a wide array of fruit traits.
Enhancing consumer satisfaction in fruit ultimately will depend on the improvement of the many traits that together enhance the overall eating experience.
The gastrointestinal tract, especially the large intestine, houses the most abundant and complex microbiota in humans. Most intestinal bacteria belong to the phyla Firmicutes and Bacteroidetes, which make up more than 90% of known phylogenetic categories and dominate the distal gut microbiota. Other, lower-abundance bacteria include Actinobacteria, Fusobacteria, Proteobacteria, and Verrucomicrobia. Diet is one of the important factors contributing to the gut microbial composition that ultimately affects human health. Obesity and associated metabolic diseases, including type 2 diabetes, are intimately linked to diet. A number of recent in vitro, in vivo, and human studies showed that polyphenols or polyphenol-rich dietary sources, particularly tea, wine, cocoa, fruits, and fruit juices, influence the relative abundance of different bacterial groups within the gut microbiota by reducing the numbers of potential pathogens and certain Gram-negative Bacteroides spp. and enhancing beneficial bifidobacteria and lactobacilli. Spices are derived from the bark, fruit, seeds, or leaves of plants and often contain spice-specific phytochemicals. Spices have been used not only for seasoning foods but also for medicinal purposes. They have a number of demonstrated disease-preventive functions, such as antimicrobial, anti-inflammatory, and antimutagenic activities, and are known to reduce the risk of cancer, heart disease, and diabetes. They are best known for their strong antioxidant properties, which exceed those of most foods. It was reported that of the 50 food products highest in antioxidant concentrations among 1113 U.S. food samples, 13 were spices. Among them, oregano, ginger, cinnamon, and turmeric ranked #2, 3, 4, and 5, respectively.