Macadamia genomic resources for South Africa
The genetic diversity and population structure of macadamia cultivars present in South Africa were analysed using a panel of 13 microsatellite markers from the literature. The population consisted of a total of 110 cultivars that were selected in South Africa, Australia, Hawaii, California and Israel, as well as a private breeding population from a local farmer. The data were analysed using GenAlEx, principal component analysis, population structure analysis and a Neighbour-Joining phylogenetic tree.
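As an illustration of the kind of per-locus diversity statistic that microsatellite analyses typically report, the sketch below computes observed and expected heterozygosity for a single diploid locus. It is a minimal sketch and not the workflow used in this study: the function names and the genotype values (allele sizes in base pairs) are hypothetical, and expected heterozygosity is taken as one minus the sum of squared allele frequencies, without a small-sample correction.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Expected heterozygosity (1 - sum of squared allele frequencies).

    genotypes: list of (allele_a, allele_b) tuples for one diploid locus.
    """
    alleles = [allele for pair in genotypes for allele in pair]
    if not alleles:
        raise ValueError("no genotypes supplied")
    total = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return heterozygous / len(genotypes)


# Hypothetical genotypes for four cultivars at one marker (allele sizes in bp).
locus = [(180, 184), (180, 180), (184, 188), (180, 184)]
print(expected_heterozygosity(locus))
print(observed_heterozygosity(locus))
```

In practice these values would be computed for each of the 13 markers and summarised across loci and populations.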
Whole genome sequencing, assembly and annotation were performed for three cultivars of importance to South Africa: Santa Anna (Macadamia tetraphylla), HAES 695/Beaumont (M. integrifolia x M. tetraphylla) and HAES 791 (M. integrifolia x M. tetraphylla x M. ternifolia). We produced between 80x and 120x coverage of long-read sequencing data on the Oxford Nanopore Technologies (ONT) platform, and combined this with between 60x and 120x coverage of short-read sequencing data from the Illumina platform. The genomes were assembled with the Shasta, Flye, MaSuRCA and NECAT assemblers; NECAT produced the best consensus assemblies. The NECAT consensus was further polished and error corrected with three rounds of Racon and one round of Pilon. The final polished NECAT assemblies were then purged of redundant sequences to produce haploid assemblies for all three genomes. The quality of the genomes was analysed using the BUSCO and QUAST pipelines. Genome annotation was performed using the BRAKER3 pipeline. The input data for BRAKER3 included the repeat-masked genome assemblies (masked using RepeatModeler and RepeatMasker), the merged .bam files (from mapping the RNA-Seq data to the masked genome assemblies), and the OrthoDB v11 Viridiplantae protein database. The TSEBRA pipeline was used to combine the most robust gene sets with the best BUSCO completeness scores.
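Assembly contiguity is commonly summarised with the N50 statistic, one of the metrics that tools such as QUAST report. The sketch below shows how N50 can be computed from a list of contig lengths. It is an illustrative sketch only: the function name and the contig lengths are hypothetical and are not results from these assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    """
    if not lengths:
        raise ValueError("no contig lengths supplied")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional totals.
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (in bp), for illustration only.
contigs = [100, 80, 60, 40, 20]
print(n50(contigs))
```

Comparing this value across the candidate assemblers, alongside BUSCO completeness, is one simple way to rank assemblies before polishing.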
Fatty acid biosynthesis genes were analysed to investigate the mechanisms underlying the unique oil profile of macadamia nuts. Four gene families that play key roles in unsaturated fatty acid biosynthesis were analysed: fatty acid desaturase type I (FAD), fatty acid desaturase type II (SAD), beta-ketoacyl-ACP synthase (KAS) and acyl-ACP thioesterase (FAT). OrthoFinder and OrthoVenn were used for the analysis, and the genes we annotated were compared to those of other tree nut and oil-producing crop species. These species included walnut (Juglans regia), pistachio (Pistacia vera), almond (Prunus dulcis), hazelnut (Corylus avellana), sunflower (Helianthus annuus), olive (Olea europaea) and cocoa (Theobroma cacao). The results were analysed using ClustalW alignments and RAxML-NG, and the resulting phylogenetic trees were drawn and viewed on the iTOL online platform.
Funding
Macadamias South Africa
Department/Unit
Biochemistry, Genetics and Microbiology
Sustainable Development Goals
- 9 Industry, Innovation and Infrastructure
- 2 Zero Hunger
- 12 Responsible Consumption and Production