In AGOUEB and its sister project BarleyCAP (http://barleycap.coafes.umn.edu/), we have developed two Illumina Oligo Pool Assay (OPA) based SNP genotyping platforms for genetic diversity and LD analysis. We first developed three 1536 SNP Pilot OPAs, from which we selected the best 3072 SNPs to construct Barley OPA1 and Barley OPA2 (BOPA1 and BOPA2). The OPA platform uses Illumina's GoldenGate assay in conjunction with their BeadArray technology. A glossy web-based overview of the Illumina technology can be found here (http://www.illumina.com/).
The GoldenGate Assay
The Illumina GoldenGate Assay queries genomic DNA with three oligonucleotide probes for each locus and creates DNA fragments that can be amplified by standard PCR methods using universal primers. For each of the 1536 loci interrogated in each assay, the oligo mix contains two allele specific probes and one locus specific probe (i.e. 3 x 1536 oligos). The 3′ ends of the two alternative allele specific probes are complementary to two universal primers, U1 and U2, with the 5′ end complementary to the 3′ end of the locus. Each allele specific probe terminates, with an allele specific base, at the SNP that is to be assayed. The third probe, the locus specific probe, is complementary to the genomic DNA that starts 5 to 20 bases 3′ of the locus in question. As well as a locus specific sequence, this probe also contains a specific Illumicode sequence that is used to identify the locus on the BeadArray, and the sequence for universal primer U3.
The probes for all 1536 loci are annealed to the genomic DNA at the same time. DNA polymerase is added to close the gap between the allele specific probe (carrying either U1 or U2) and the locus specific probe (carrying U3), and the paired fragments are ligated together. The ligated probe fragments are then separated from the genomic DNA and used as the template for a PCR reaction. The primer mix for this PCR reaction consists of primers U1 and U2, labeled with Cy3 and Cy5 respectively, and biotinylated primer U3. As a result of this labeling scheme, the PCR product consists of double stranded DNA in which one strand, containing the complement to the Illumicode, is labeled with either Cy3 or Cy5 in an allele specific manner, and the complementary strand is labeled with biotin. The biotinylated strand is removed and the single, fluorescently labeled strand is hybridized to the BeadArray.
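As a rough bookkeeping aid, the probe set described above can be modelled as three oligos per locus. The sketch below is illustrative only: the `Oligo` class, the function names, and the locus and Illumicode labels are hypothetical, and it records only which universal primer and Illumicode each probe carries, not any sequence or orientation.

```python
from dataclasses import dataclass
from typing import Optional

LOCI_PER_POOL = 1536  # loci interrogated per assay (figure taken from the text)


@dataclass(frozen=True)
class Oligo:
    """One probe in the oligo mix (hypothetical record layout)."""
    locus: str
    role: str                          # "allele_specific" or "locus_specific"
    universal_primer: str              # "U1", "U2" or "U3"
    illumicode: Optional[str] = None   # only the locus specific probe carries one


def oligos_for_locus(locus, illumicode):
    """Return the three probes the text describes for a single locus."""
    return [
        Oligo(locus, "allele_specific", "U1"),  # product amplified with the Cy3 primer
        Oligo(locus, "allele_specific", "U2"),  # product amplified with the Cy5 primer
        Oligo(locus, "locus_specific", "U3", illumicode),
    ]


def build_pool(locus_to_illumicode):
    """Build the full oligo mix for up to LOCI_PER_POOL loci."""
    if len(locus_to_illumicode) > LOCI_PER_POOL:
        raise ValueError("one oligo pool interrogates at most 1536 loci")
    pool = []
    for locus, code in locus_to_illumicode.items():
        pool.extend(oligos_for_locus(locus, code))
    return pool
```

A full pool built this way contains 3 x 1536 = 4608 oligos, matching the count given in the text.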
The Sentrix Array Matrix (SAM) is an array of fiber optic bundles in 96 well plate format. Each bundle contains 50,000 separate optical fibers, with the end of each fiber etched to create a pocket. Beads 3 µm in diameter are individually coated with hundreds of thousands of copies of a single stranded oligonucleotide (an Illumicode address), with only one sequence on any given bead. A mixture of the coated beads is laid across the fiber bundle such that a single bead becomes lodged in the end of each fiber. This process results in a randomly ordered array of capture Illumicodes. The manufacturer performs a quality control process to verify that beads have been deposited, to map the Illumicode that is represented in each fiber, and to verify that each Illumicode is represented a minimum of 5 times within each of the 96 arrays within a SAM. On average there is 30X redundancy within a given array. Each SAM is shipped with a corresponding information file denoting which Illumicode is represented at each location in the array.
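The redundancy figure can be checked with simple arithmetic: 50,000 fibers shared among 1536 Illumicodes gives about 32.6 beads per Illumicode if every fiber held a usable bead, which is consistent with the quoted average of roughly 30X. The sketch below is a toy simulation, not the manufacturer's quality control procedure; it assumes beads land uniformly at random and that every fiber is filled, and all function names are hypothetical.

```python
import random
from collections import Counter

FIBERS_PER_BUNDLE = 50_000  # fibers per bundle (from the text)
ILLUMICODES = 1536          # one Illumicode per locus in the pool
MIN_COPIES = 5              # minimum copies per Illumicode (from the text)


def expected_redundancy(fibers=FIBERS_PER_BUNDLE, codes=ILLUMICODES):
    """Average beads per Illumicode, assuming every fiber holds a bead."""
    return fibers / codes


def simulate_bundle(fibers=FIBERS_PER_BUNDLE, codes=ILLUMICODES, seed=0):
    """Assign a random Illumicode index to each fiber and count copies of each."""
    rng = random.Random(seed)
    return Counter(rng.randrange(codes) for _ in range(fibers))


def passes_qc(counts, codes=ILLUMICODES, min_copies=MIN_COPIES):
    """True if every Illumicode is present at least min_copies times."""
    return all(counts.get(code, 0) >= min_copies for code in range(codes))


if __name__ == "__main__":
    counts = simulate_bundle()
    print(f"expected redundancy: {expected_redundancy():.1f}")
    print(f"least represented Illumicode: {min(counts.values())} copies")
    print(f"passes minimum-copy check: {passes_qc(counts)}")
```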
The Oligo Pool (OPA)
Any specific GoldenGate assay requires a particular pool of oligonucleotides corresponding to the allele and locus specific probes for the loci that will be interrogated. Any given oligo pool can interrogate up to 1536 different SNP loci. Multiple oligo pools can be run on sample sets to increase the number of loci queried. Hence AGOUEB and BarleyCAP use two oligo pools to assay 3072 different SNPs. Oligo pools are shipped with their own information file designating which Illumicode is used to interrogate each locus, as well as which allele is labeled with Cy3 and which with Cy5.
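To illustrate how the two information files fit together, the sketch below computes how many pools a SNP set needs and then chains the SAM file (bead location to Illumicode) with the oligo pool file (Illumicode to locus and dye assignment). The CSV layout and the column names (`location`, `illumicode`, `locus`, `cy3_allele`, `cy5_allele`) are assumptions made for the example; the real Illumina files use their own formats.

```python
import csv
import io
import math

LOCI_PER_POOL = 1536  # maximum loci per oligo pool (from the text)


def pools_needed(n_snps, per_pool=LOCI_PER_POOL):
    """Number of oligo pools required to assay n_snps loci."""
    return math.ceil(n_snps / per_pool)


def read_sam_map(text):
    """Parse the (hypothetical) SAM file: bead location -> Illumicode."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["location"]: row["illumicode"] for row in reader}


def read_opa_manifest(text):
    """Parse the (hypothetical) pool file: Illumicode -> (locus, Cy3 allele, Cy5 allele)."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["illumicode"]: (row["locus"], row["cy3_allele"], row["cy5_allele"])
        for row in reader
    }


def locus_for_location(location, sam_map, manifest):
    """Resolve a bead location to its locus and dye-to-allele assignment."""
    return manifest[sam_map[location]]


if __name__ == "__main__":
    sam_text = "location,illumicode\n17,IC0001\n42,IC0002\n"
    opa_text = (
        "illumicode,locus,cy3_allele,cy5_allele\n"
        "IC0001,snp_a,A,G\n"
        "IC0002,snp_b,C,T\n"
    )
    sam_map = read_sam_map(sam_text)
    manifest = read_opa_manifest(opa_text)
    print(pools_needed(3072))
    print(locus_for_location("42", sam_map, manifest))
```

Keeping the two lookups separate mirrors the way the files are shipped: the bead layout is specific to each SAM, while the Illumicode to locus assignment is specific to each oligo pool.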
In AGOUEB we use the Genotyping facility at the University of California Los Angeles to conduct the practical aspects of genotyping. There, a BeadArray reader scans the hybridized SAM and determines the signal intensities for each dye at each bead location. Custom software from Illumina uses the information files from the SAM and the oligo pool to map the known location of each Illumicode on the Sentrix Array Matrix back to the locus being interrogated by that Illumicode, and to match the dye intensities to the specific alleles. At SCRI we receive the information as data files that we interrogate and analyse using Illumina’s BeadStudio software.
The dye intensities are examined by the software to determine the genotype of each sample for each locus. A locus returning predominantly Cy3 signal is called AA, one returning predominantly Cy5 signal is called BB, and an even ratio represents a heterozygous individual. Data are returned with the allele call for each locus as well as a GenTrain score, a measure of the reliability of that genotyping call. We examine each locus independently to make sure that the assigned genotypes are robust, though we have found that loci with high GenTrain scores usually require no manual intervention. However, some assays do not perform quite as well. There are a number of reasons for this (e.g. a null allele) and some of them have been previously described. For barley, the AA, AB or BB scores are exported in the form of an Excel spreadsheet into a Germinate database (http://bioinf.scri.sari.ac.uk/germinate/) for archiving. In Germinate the data can be converted into various formats for data analysis in third party and specific analytical software (e.g. TASSEL http://www.maizegenetics.net/index.php?page=bioinformatics/tassel/index.html).
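The dye-ratio rule described above (mostly Cy3 is AA, mostly Cy5 is BB, an even ratio is AB) can be sketched as a simple threshold on the angle between the two intensities. This is a deliberately naive illustration, not BeadStudio's calling algorithm and not how the GenTrain score is computed; the function names, the `NC` (no call) label, and all three thresholds are assumptions chosen for the example.

```python
import math


def theta(cy3, cy5):
    """Normalised angle of the (Cy3, Cy5) point: 0 is pure Cy3, 1 is pure Cy5."""
    return (2 / math.pi) * math.atan2(cy5, cy3)


def call_genotype(cy3, cy5, min_signal=0.2, aa_max=0.25, bb_min=0.75):
    """Call AA, AB or BB from two non-negative dye intensities.

    min_signal, aa_max and bb_min are illustrative thresholds only.
    """
    if cy3 + cy5 < min_signal:
        return "NC"  # too little signal to call, e.g. a failed assay or null allele
    t = theta(cy3, cy5)
    if t <= aa_max:
        return "AA"
    if t >= bb_min:
        return "BB"
    return "AB"


if __name__ == "__main__":
    samples = {"s1": (1.0, 0.05), "s2": (0.9, 1.0), "s3": (0.04, 1.1), "s4": (0.01, 0.02)}
    for name, (cy3, cy5) in samples.items():
        print(name, call_genotype(cy3, cy5))
```

Fixed thresholds like these ignore locus-to-locus differences in how the clusters sit, which is one reason each locus is still inspected individually before the calls are accepted.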
For further information on this project please contact Bill Thomas (firstname.lastname@example.org) from the James Hutton Institute.