Technical Specifications for OARv2.0

OARv2.0 was built using next-generation sequence data derived from one female and one male Texel. The primary de novo assembly was performed using 75-fold Illumina GA sequence coverage from the female. The mate-pair characteristics of the paired-end reads were then used to produce scaffolds spanning 2.71 Gb, or approximately 91% of the sheep genome (scaffold build v1.2).
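The two figures quoted above (2.71 Gb of scaffolds, approximately 91% of the genome) imply a total genome size, which the short calculation below makes explicit. It is only an arithmetic sketch: the function name is hypothetical, and the result is an estimate derived from the rounded values in this document, not a measured genome size.

```python
def implied_genome_size_gb(scaffold_span_gb: float, fraction_covered: float) -> float:
    """Return the genome size implied by a scaffold span and the fraction it covers."""
    if not 0 < fraction_covered <= 1:
        raise ValueError("fraction_covered must be in the interval (0, 1]")
    return scaffold_span_gb / fraction_covered


# Values taken from the text above; both are rounded, so the result is approximate.
total_gb = implied_genome_size_gb(2.71, 0.91)
unscaffolded_gb = total_gb - 2.71
print(f"implied genome size: {total_gb:.2f} Gb")
print(f"not yet in scaffolds: {unscaffolded_gb:.2f} Gb")
```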

A further 45-fold coverage of the male Texel was used for gap filling, before the scaffolds were anchored onto the 27 sheep chromosomes. Scaffolds that were clearly chimeric were identified by comparison with the bovine UMD3 assembly and manually split in the gap between adjacent contigs that mapped to two different bovine chromosomes. Superscaffolds were built from the set of scaffolds and split scaffolds >2 kb in length, using the end sequences derived from the male Texel BAC library, CHORI-243, and the predicted locations on OARv1.0 of the SNPs included on the Illumina Ovine SNP50 BeadChip. This was undertaken as a single integrated process that minimised the number of non-congruent BACs and out-of-position SNPs.
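The chimera-splitting rule described above can be illustrated with a minimal sketch. It assumes each scaffold is available as an ordered list of contigs, each labelled with the bovine chromosome it maps to (or None if unmapped). The function name, the data layout, and the choice to keep unmapped contigs with the piece on their left are all assumptions made for illustration. In the actual build the splits were made manually, so this is not the procedure that was run.

```python
from typing import List, Optional, Tuple

# (contig id, bovine chromosome the contig maps to, or None if unmapped)
Contig = Tuple[str, Optional[str]]


def split_chimeric(contigs: List[Contig]) -> List[List[Contig]]:
    """Split an ordered scaffold wherever the mapped bovine chromosome changes.

    Unmapped contigs never trigger a split; they stay with the piece that is
    open when they are reached (an assumption of this sketch).
    """
    pieces: List[List[Contig]] = []
    current: List[Contig] = []
    last_chrom: Optional[str] = None
    for contig in contigs:
        chrom = contig[1]
        # A change between two mapped contigs marks the gap to split in.
        if chrom is not None and last_chrom is not None and chrom != last_chrom:
            pieces.append(current)
            current = []
        current.append(contig)
        if chrom is not None:
            last_chrom = chrom
    if current:
        pieces.append(current)
    return pieces


# Hypothetical scaffold: the first part maps to BTA1, the second to BTA7.
scaffold = [("c1", "BTA1"), ("c2", "BTA1"), ("c3", None), ("c4", "BTA7"), ("c5", "BTA7")]
for piece in split_chimeric(scaffold):
    print([name for name, _ in piece])
```

A non-chimeric scaffold comes back as a single piece, so the same function can be applied to every scaffold without first deciding which ones are chimeric.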

Several rounds of manual checking and final error correction were carried out using the end sequences of the BACs in the bovine CHORI-240 library and 454 mate-pair sequence data derived from 8 kb and 20 kb insert libraries of the male Texel. Ambiguous positions were resolved using the predicted locations of the SNPs based on OARv1.0 and conserved synteny with the UMD3 bovine genome assembly. Superscaffolds were initially ordered and oriented into chromosomes using the locations of the SNPs in OARv1.0. The positions of the SNPs in the sheep linkage map and the sheep RH map were used to identify remaining errors and to refine the assembly.
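The initial ordering and orientation of superscaffolds from SNP locations can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline used for OARv2.0: the function name and input layout are hypothetical, each superscaffold is assigned to the chromosome carrying most of its SNPs, superscaffolds are ordered by the median OARv1.0 position of those SNPs, and the strand is chosen by a majority vote over consecutive SNP pairs.

```python
from collections import Counter
from statistics import median
from typing import Dict, List, Tuple

# (offset on the superscaffold, OARv1.0 chromosome, OARv1.0 position)
Snp = Tuple[int, str, int]


def place_superscaffolds(
    snps_by_scaffold: Dict[str, List[Snp]],
) -> Dict[str, List[Tuple[str, str]]]:
    """Return, per chromosome, an ordered list of (superscaffold, strand)."""
    placements = []
    for name, snps in snps_by_scaffold.items():
        if not snps:
            continue  # no SNP evidence, so the superscaffold stays unplaced
        # Majority vote picks the chromosome; minority SNPs are ignored.
        chrom = Counter(s[1] for s in snps).most_common(1)[0][0]
        on_chrom = sorted((s for s in snps if s[1] == chrom), key=lambda s: s[0])
        anchor = median(s[2] for s in on_chrom)
        # Compare reference positions of SNPs that are adjacent on the scaffold.
        steps = [b[2] - a[2] for a, b in zip(on_chrom, on_chrom[1:])]
        forward = sum(1 for d in steps if d > 0)
        reverse = sum(1 for d in steps if d < 0)
        strand = "-" if reverse > forward else "+"
        placements.append((chrom, anchor, name, strand))

    layout: Dict[str, List[Tuple[str, str]]] = {}
    for chrom, _, name, strand in sorted(placements):
        layout.setdefault(chrom, []).append((name, strand))
    return layout


# Hypothetical input: ss2 runs against the reference order, ss1 has one stray SNP.
example = {
    "ss2": [(100, "OAR1", 9000), (500, "OAR1", 8000), (900, "OAR1", 7000)],
    "ss1": [(100, "OAR1", 1000), (500, "OAR1", 2000), (900, "OAR2", 50)],
}
print(place_superscaffolds(example))
```

In this sketch the SNPs that disagree with the majority chromosome are simply dropped. In practice, such out-of-position SNPs are the kind of signal that the linkage map and RH map comparisons described above would be used to investigate.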