Big data, big differences between Cabernet clones
The massive advances in cost-effective state-of-the-art whole genome sequencing (WGS) technologies have provided a unique opportunity to crack the genetic code of agricultural crops such as the grapevine. In this study, researchers are resolving the fingerprints of ten top Australian Cabernet Sauvignon clones in order to identify genetic markers of their identity.
This information will allow researchers in Australia to further explore the spontaneous mutations that distinguish the clones, and how they relate to variation in berry and wine quality. It will also allow them to study the regional effect on expression of clonal quality traits, potentially linking to other research such as the ‘Clones for Climate’ study.
To develop the fingerprints, the team applied genome resequencing of the ten clones, with some replicate vines included (Illumina HiSeq1500, paired-end libraries). The clones were SA125, SA126, CW44, ENTAV338, PDFs, Reynella (mass selection), WA Cape Selection (mass selection) and a selection of ‘Houghton’ clones from two locations.
SA126 was chosen for further replication to ensure that the differences identified truly reflected clonal fingerprints. The sequence data were mapped to the Cabernet Sauvignon reference genome, which was published only this year (Chin et al., 2016). The data represented an approximate 30X depth of sequencing and 90% coverage of the genome. So far so good.
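For readers unfamiliar with the two figures quoted above, the sketch below shows one common way of computing them: mean depth is the average number of reads over each reference position, and coverage (breadth) is the fraction of positions with at least one read. The function name and the toy depth values are invented for illustration; they are not the study's data, and the study's own definitions may differ.

```python
def depth_and_breadth(per_base_depth):
    """Return (mean depth, fraction of positions covered by at least one read).

    per_base_depth: list of read counts, one per reference position.
    This is a hypothetical helper, not part of the study's pipeline.
    """
    n = len(per_base_depth)
    if n == 0:
        raise ValueError("no positions supplied")
    mean_depth = sum(per_base_depth) / n
    breadth = sum(1 for d in per_base_depth if d > 0) / n
    return mean_depth, breadth


# Toy example: ten reference positions, one of them with no reads at all.
toy_depths = [0, 30, 35, 28, 40, 32, 31, 36, 38, 30]
mean_depth, breadth = depth_and_breadth(toy_depths)
print(f"mean depth: {mean_depth:.1f}X, breadth: {breadth:.0%}")
```

In practice these numbers would be derived from an alignment file with a dedicated tool rather than from a Python list; the sketch only illustrates what the two quantities mean.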
Nevertheless, identifying and validating the fingerprints has proven to be a significant challenge. A number of analytical approaches were applied in order to yield robust fingerprints. Following this pipeline, a total of 33 362 single-nucleotide polymorphisms (SNPs) was found to discriminate among the full set of clones. When researchers compared clones one-to-one, they found they could discriminate each clone using between 1780 and 5100 SNPs. These are considerable differences considering they are all Cabernet Sauvignon!
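The one-to-one comparison can be pictured as counting, for each pair of clones, the sites at which their genotype calls differ. The sketch below is a minimal illustration under stated assumptions, not the researchers' pipeline: the clone names, positions and genotypes are made up, and a site is counted only when both clones have a call and the calls differ.

```python
from itertools import combinations


def discriminating_sites(calls_a, calls_b):
    """Sites where both clones have a genotype call and the calls differ."""
    shared = calls_a.keys() & calls_b.keys()
    return {site for site in shared if calls_a[site] != calls_b[site]}


def pairwise_counts(genotypes):
    """Map each pair of clones to its number of discriminating sites."""
    return {
        (a, b): len(discriminating_sites(genotypes[a], genotypes[b]))
        for a, b in combinations(sorted(genotypes), 2)
    }


# Hypothetical genotype calls keyed by (chromosome, position).
toy_genotypes = {
    "clone_A": {("chr1", 100): "A/A", ("chr1", 250): "C/T", ("chr2", 40): "G/G"},
    "clone_B": {("chr1", 100): "A/G", ("chr1", 250): "C/T", ("chr2", 40): "G/G"},
    "clone_C": {("chr1", 100): "A/A", ("chr1", 250): "C/C", ("chr2", 40): "G/T"},
}

counts = pairwise_counts(toy_genotypes)
for pair, n in counts.items():
    print(pair, n)
```

A real analysis would also need to filter out sequencing and mapping errors, which is part of why replicate vines matter: a difference that does not hold up across replicates of the same clone is unlikely to be a true clonal marker.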
The data analysis is not yet finished as there are other elements to explore, including the effect of viral infection on the variation in DNA, as well as mobile genetic elements, which may explain some of the larger sequence differences. Importantly, the researchers hope this set of data can be made available to industry and other researchers so that it can be mined for more information as new technologies and understanding emerge.
This research is funded through an Australian Research Council Linkage Project based at the University of Western Australia (UWA), with financial support from the Department of Agriculture and Food WA (DAFWA) and the WA wine industry through the Western Australian Vine Improvement Association (WAVIA), and in-kind contributions from the Yalumba Nursery and the Australian Wine Research Institute (AWRI).
Chin CS, Peluso P, Sedlazeck FJ, Nattestad M, Concepcion GT, Clum A, Dunn C, O'Malley R, Figueroa-Balderas R, Morales-Cruz A, Cramer GR, Delledonne M, Luo C, Ecker JR, Cantu D, Rank DR, Schatz MC. 2016, ‘Phased diploid genome assembly with single-molecule real-time sequencing’. Nat Methods, vol. 13(12), pp. 1050-1054. doi: 10.1038/nmeth.4035