WO2016189288A1 - Nucleic acid sample enrichment - Google Patents

Nucleic acid sample enrichment

Info

Publication number
WO2016189288A1
Authority
WO
WIPO (PCT)
Prior art keywords
adaptor
sample
dna
population
nucleic acid
Prior art date
Application number
PCT/GB2016/051475
Other languages
French (fr)
Inventor
Neil Matthew BELL
Andreas Claas
Tobias William Barr Ost
Floriana MANODORO
Malwina PRATER
Shirong YU
Original Assignee
Cambridge Epigenetix Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GBGB1508859.4A (GB201508859D0)
Priority claimed from GBGB1514532.9A (GB201514532D0)
Priority claimed from GBGB1522270.6A (GB201522270D0)
Application filed by Cambridge Epigenetix Ltd
Publication of WO2016189288A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6816: Hybridisation assays characterised by the detection means
    • C12Q1/6818: Hybridisation assays characterised by the detection means involving interaction of two or more labels, e.g. resonant energy transfer
    • C12Q1/6844: Nucleic acid amplification reactions

Definitions

  • This invention relates to the preparation of nucleic acid samples for analysis and certain methods and tools for the selection of specific regions of interest from a nucleic acid sample.
  • Single stranded sample preparation is commonly required following bisulfite conversion of DNA molecules.
  • the bisulfite conversion process necessarily results in the formation of single stranded DNA, and therefore involves either i) pre-bisulfite sample preparation or ii) post-bisulfite sample preparation employing random priming for downstream analysis.
  • Drawbacks to these methods include the potential to generate nicked or fragmented libraries incapable of subsequent amplification, the loss of sequence information from the parent DNA molecules, generation of artefacts that contaminate the sample of interest or induce significant representation bias of reads in the final dataset.
  • a direct method of ligating the termini of single stranded DNA post-bisulfite treatment in quantitative yield is of significant interest.
  • Targeted enrichment is a method used commonly in genomic and epigenomic analysis to reduce the complexity of the genome being studied and to home-in on specific regions of interest (e.g. exomes, CpGs, specific genes etc). This allows the cost of sequencing to be decreased dramatically and the complexity of analysis simplified.
  • Targeted enrichment has been shown to be an effective and reliable alternative method to whole genome analysis in situations where only a fraction of the genome needs to be interrogated or where experiments are done at such scale (numbers of samples) that to do sequencing analysis at a whole genome scale becomes cost prohibitive.
  • pairs of primers for each amplicon, or locus are required and this can limit the complexity (number of locus amplicons targeted) of a single enrichment event; also if using primers with a 5'-flap non-complementary to the loci but complementary to the sequencing platform intended for analysis, this can simplify the workflow but decrease primer specificity, increase cost of the targeting primers and decrease the complexity of targeting.
  • hybridisation arrays or in-solution capture methods typically target adapted fragments prepared from double stranded (ds) DNA, which limits their usefulness to native DNA applications. Methods do exist that allow targeted enrichment of bisulfite converted DNA (e.g.
  • Methods described herein can utilise a single primer per locus (not a pair), which should allow improvements in specificity and enable more loci to be targeted within a single reaction. Only 2 primers per region will give information on the top and bottom converted strand of a selected region.
  • hybridisation pull-down methods are inefficient. Working with low mass samples can be a problem simply due to liquid handling losses and multiple transfer steps.
  • a small subset of fragments, e.g. the 1-2% of the genome represented by the exome
  • the vast majority of the sample remains in solution, so the amount of material required as the input sample is high as most of the sample is not captured.
  • Elution of captured DNA from beads is also inefficient. Methods described herein can be used as a one-pot reaction that should offset these issues and will not be reliant on bead purifications and pulldowns.
  • the Illumina™ Human Methylation 450K Array is an example of a targeted enrichment array format that is well utilised by the epigenetics research community, but has several drawbacks. It is limited to a physical array format which yields non-digital data and makes sample multiplexing at high number cumbersome and dependent on sophisticated automation. The method is only compatible with Human samples, limiting its usefulness. The number of probes and their specificity is high (~480,000) but the method relies on whole genome amplification using random priming and as such is potentially biased.
  • Methods described herein can be sequencing based and so would yield digital data, are agnostic to species (target primer pools could be designed to any species) and there is less inertia required to prepare multiple different targeting pools than is the case for array-based techniques.
  • Sample multiplexing can be significantly simpler, and uses approaches common in the art such as molecular barcodes (tags) attached to identify different populations of primers.
  • PCR is inherently biased, particularly when amplifying bisulfite converted (AT-rich) DNA. Data generated by methods dependent on PCR will necessarily be of lower quality and more biased than those which do not employ PCR. Methods described herein would work with or without PCR, so not only can the effect of PCR be evaluated, it can be eliminated as a source of doubt in the targeting experiments.
  • Amplification of regions of greater than around 400bp from bisulfite converted DNA is difficult due to the fragmentation induced by the bisulfite conversion process. This fragmentation is random, but decreases the molarity of fragments in the pool that remain intact in the region of interest. Higher concentration of template is typically required in the amplicon reactions in order to account for this, which decreases the specificity of targeting and increases the sample mass burden required per amplicon reaction (which can be a problem for precious samples). The same is true for methods that depend on targeting a converted adapted NGS library. The conversion process fragments the library, decreasing the molarity of intact fragments (DNA inserts flanked by two universal primer regions) which decreases the diversity (number of unique fragments) in a given sample.
  • Methods described herein allow for the targeted enrichment of converted, single stranded fragments derived directly from genomic DNA, RNA or alternative samples via amplification or selection of mono-adapted fragments.
  • the method relies on the attachment of an adaptor to one end of each of the fragments to make a mono-adapted sample library, which can be targeted using locus specific primers.
  • Hybridisation of locus specific primers with the fragmented sample and extension of the primers hybridised to the sample results in selection of a subset of the monoadapted fragments, which can then be amplified using the locus specific primers and a primer having the same sequence as part or all of the adaptor.
  • Attaching adaptors before fragmentation causes the majority of fragments to lose the adaptor.
  • the methods described herein are highly efficient in accurately selecting the desired regions from a low amount of input material, and are therefore advantageous over prior art methods involving the targeted selection of single stranded samples.
  • a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
  • a. fragmenting the sample to produce a population of single stranded sample fragments; b. attaching an oligonucleotide adaptor to the 3' end of each of the strands in the sample wherein the adaptor is a hairpin containing a cleavage site;
  • a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
  • each extension product includes a copy of the adaptor; e. hybridising a single primer to the copy of the adaptor and extending each of the extension products using the single primer, where steps d and e occur simultaneously or sequentially; and
  • the single stranded sample fragments have an adaptor attached to only one end prior to hybridising the population of locus specific primers.
  • the samples are fragmented prior to attachment of the adaptors, or the sample originates as a single stranded population of nucleic acids such as RNA, or as a degraded sample such as FFPE samples or cell free nucleic acid fragments.
  • amplification step prior to the hybridisation of the locus primers includes a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
  • each extension product includes a copy of the first adaptor
  • a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
  • each extension product includes a copy of the first adaptor or a fragment thereof;
  • the sample can originate as a single stranded sample, or the sample can be fragmented.
  • Single stranded samples may include RNA, including mRNA, siRNA or micro RNA.
  • the sample may be fully or partially in single stranded form due to degradation or due to the harsh processes used in fixing or unfixing the samples. The method works well on samples such as FFPE degraded samples, or on samples of ancient DNA such as Neanderthal DNA.
  • Fragmentation of the sample can be caused by bisulfite treatment.
  • Bisulfite treatment converts non-methylated cytosine bases to uracil bases, thereby reducing the number of C bases in the sample.
  • the locus specific primers can contain a higher level of the three nucleic acid bases A, C and T such that the locus is complementary to a sample treated with bisulfite (i.e. the primers can contain little or no G).
  • the sample can be copied initially and therefore the locus specific primers can contain the three nucleic acid bases A, G and T such that the locus is equivalent to a sample treated with bisulfite (i.e. the primers can contain little or no C).
  • the locus can contain all four bases.
  • the population of locus specific primers can contain both primers with A, C and T and primers with A, C, G and T. Primers having both G and T in equivalent positions can be used to identify specific methylation sites if required.
  • the population of locus specific primers can contain both primers with A, G and T and primers with A, C, G and T. Primers having both C and A in equivalent positions can be used to identify specific methylation sites if required.
  • the methylation sites of interest may be in the extension region adjacent to the 3' end of the primers.
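The base-composition statements in the preceding bullets can be illustrated in silico. The following is a minimal sketch, not the applicant's software: the function names, the toy sequence and the `methylated_positions` argument are hypothetical, and conversion is assumed to be complete. It shows that the complement of a fully converted strand contains no G, while a copy with the same sequence as the converted strand contains no C.

```python
# Hypothetical helpers illustrating bisulfite conversion and the resulting
# three-base primer alphabets. Conversion is assumed to be 100% efficient.

def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return seq with every unmethylated C replaced by U.

    methylated_positions holds 0-based indices of protected (methylated) Cs.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("U")  # unmethylated cytosine is read as uracil
        else:
            converted.append(base)  # methylated C and all other bases are kept
    return "".join(converted)


def reverse_complement(seq):
    """Reverse complement as DNA; U in the template pairs with A."""
    pairing = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}
    return "".join(pairing[base] for base in reversed(seq.upper()))


fragment = "ACGTCCGA"  # toy sequence, not taken from the source
converted = bisulfite_convert(fragment)

# A primer complementary to the converted strand uses only A, C and T.
complementary_primer = reverse_complement(converted)

# A primer with the same sequence as the converted strand (complementary
# to a copy of it) uses only A, G and T once U is written as T.
equivalent_primer = converted.replace("U", "T")
```

Passing `methylated_positions={1}` keeps the C at index 1, which is how a methylated CpG would survive conversion in this toy model.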
  • the method can work robustly with millions of loci targeting probes.
  • the population of locus specific primers can be at least 50 different sequences.
  • the population of locus specific primers can be at least 100 different sequences.
  • the population of locus specific primers can be at least 1000 different sequences.
  • the population of locus specific primers can be at least 10,000 different sequences.
  • the population of locus specific primers can be at least 100,000 different sequences.
  • the population of locus specific primers can be at least 1,000,000 different sequences.
  • the population of primers can contain a universal region common to all primers.
  • the universal region is preferably not complementary to the sample of interest.
  • the universal region can be located at the 5' end of the primer.
  • the non-sample complementary region may include an identifier region or tag to enable sample multiplexing.
  • any method of attaching an oligonucleotide adaptor can be used.
  • the method should be capable of the template independent joining of two single stranded oligonucleotides.
  • the method described herein can add a single stranded oligonucleotide with a 3' hydroxyl to a single stranded oligonucleotide with a 5'-triphosphate moiety.
  • the adaptor can be a hairpin, optionally having a cleavage site.
  • the method of attachment of an oligonucleotide adaptor should add a single oligonucleotide adaptor to the fragment. Polymerisation of the oligonucleotide adaptor is undesirable for subsequent analysis of the samples. Thus the adaptor should not carry an extendable 3' hydroxyl group, thereby ensuring a single adaptor is added.
  • Methods of attaching adaptors include the use of ligases or polymerases.
  • the method may involve the use of splints in order to turn the ends of the single stranded fragments partially double stranded.
  • the splint and/or adaptor may be DNA, RNA or a mixture thereof.
  • a first extension reaction extends the 3' end of the hairpin adaptor to produce a copy of the fragment.
  • the single stranded fragments having a hairpin attached thereto are thus made double stranded.
  • the hairpin may contain a specific cleavage location, cleavage of which allows the strands to be separated, thus producing mono-adapted single strands.
  • An optional amplification can be carried out using a second adaptor prior to the selection using the locus specific primers. Such a method with and without the optional amplification is shown in figure 4.
  • the cleavage or amplification are two alternative ways of arriving at a single stranded sample for selection.
  • the single stranded sample for selection can be a copy of the first single stranded sample absent from the first single strands, or can contain the denatured first single stranded sample and strands which are copies thereof.
  • a second extension, or amplification can be carried out using locus specific primers, thus selecting and optionally amplifying, the desired fragments, each fragment already being mono-adapted or bi-adapted.
  • an extension, or amplification can be carried out using locus specific primers, thus selecting and optionally amplifying, the desired fragments, each fragment already being mono-adapted or bi-adapted.
  • the extended locus specific primers can be copied, and therefore amplified, using a 'fixed' primer common to all fragments, having a sequence in common with a portion of the hairpin adaptor.
  • the extension reactions may be carried out using a nucleic acid polymerase.
  • the extension reactions may be carried out using nucleotide triphosphates.
  • the extensions can be carried out using four nucleotide triphosphates.
  • the hybridisation and extension cycles can be repeated.
  • the primer hybridisation can be carried out in a single cycle, or the locus specific hybridisation and subsequent extension can be repeated, for example using thermocycling before the fixed anchor primer is then added to initiate exponential amplification.
  • the extended locus hybridised strand should terminate in a 3'-hydroxyl group.
  • the extended primers have a newly generated 3' end, the 3' hydroxyl group resulting from an incorporated nucleotide (from the dNTP).
  • the 3' hydroxyl group in the extended strand may terminate at the cleavage point of the hairpin adaptor.
  • the 3' end of the extended strand is thus complementary to a portion of the hairpin adaptor.
  • the 3' end of the extended strand may be at the 5' end of the single stranded adaptor.
  • the adaptor can be attached using a ligase.
  • the adaptor can be attached using a polymerase.
  • the polymerase may be a template independent polymerase.
  • the template independent polymerase can be terminal transferase (TdT).
  • the template independent polymerase can be polyadenylate polymerase (PAP).
  • the adaptor can contain a triphosphate moiety.
  • the adaptor can contain a region of known sequence, allowing amplification via hybridisation to a portion of the adaptor, or a copy thereof.
  • the monoadapted templates can be amplified using a fixed primer at one end, and the locus specific primers at the other end. Thus rather than requiring 2n primers to amplify n loci, the number of primers required is n +1. For example 10,000 loci can be amplified using 10,001 primers rather than 20,000 primers.
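The primer-count comparison above is simple arithmetic and can be checked with a short sketch. The function names are hypothetical and are only used to restate the 2n versus n + 1 relationship given in the text.

```python
# Hypothetical helpers restating the primer counts described above.

def primers_for_paired_pcr(n_loci):
    """Conventional amplicon approach: one forward and one reverse primer per locus."""
    return 2 * n_loci


def primers_for_mono_adapted(n_loci):
    """Mono-adapted approach: one locus specific primer per locus plus one fixed primer."""
    return n_loci + 1


print(primers_for_paired_pcr(10_000))    # 20000
print(primers_for_mono_adapted(10_000))  # 10001
```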
  • the first or second extension and/or amplification reaction may be carried out using a nucleic acid polymerase.
  • the extension reaction may be carried out using nucleotide triphosphates.
  • the extension can be carried out using four nucleotide triphosphates.
  • the four nucleoside triphosphates may be dCTP, dATP, dGTP and dTTP.
  • the use of dTTP allows strands made using dUTP in the first extension to be cleaved, leaving solely the second extension strands intact.
  • the extension strands may be treated to remove the uracil bases, thereby allowing selective strand cleavage.
  • the extension strands may be treated with the enzyme UDG, resulting in abasic sites, which can then be cleaved, for example using an endonuclease, heat treatment or increasing the pH of the buffer.
  • the products from the amplification contain an end having a fixed sequence derived from the adaptor. If the locus specific primers have a fixed as well as a variable region, then the resultant amplified fragments can be further amplified and analysed using a standard pair of fixed primers. Alternatively a further adaptor can be attached to the 'mixed' or locus ends of the amplified fragments. The adaptor can be attached using a ligase.
  • the adaptor can be attached using a polymerase, for example a template independent polymerase.
  • the template independent polymerase can be terminal transferase (TdT).
  • a second (or third) adaptor, or 'fixed sequence' can be attached to the copied fragments.
  • the second (or third) fixed sequence adaptor can therefore either be present as part of each of the variable locus primers, or attached after the amplification/selection steps.
  • the inventors have found it important to use a second adaptor that is different from the 'fixed sequence' adaptor end on the loci primers. If the 'amplification' adaptor and fixed sequence adaptor are the same, cross hybridisation occurs between the loci primer ends and the library adaptors. This causes nonspecific binding of probes and lots of off-target reads (diluting enrichment). Whilst this can be overcome to some extent by supplementing the hyb mix with "suppression oligos" that are complementary to the library adaptor sequences, this increases the complexity of the mix. The inventors herein have developed an alternative solution by using a different adaptor for the second adaption step that is unrelated to the loci-end. This means suppression oligos are not required.
  • the method results in products having known regions at either end and copies of the nucleic acid sample fragments centrally between the known ends.
  • the only fragments having known adaptors are ones selected using the locus hybridisation and extension. Fragments not having the desired locus remain mono-adapted and are thus lost from the analysis.
  • the copied fragments can therefore be amplified using primers complementary to the two adaptors or copies thereof.
  • a sequencing step can be carried out on the amplified mixture.
  • the adaptor may contain a triphosphate moiety.
  • the triphosphate moiety may attach to the 3'-hydroxyl of the fragmented sample strands.
  • the triphosphate moiety may be attached to the 5'-end of the nucleic acid adaptor via a linker.
  • the linker may contain a nucleotide having a ribose or deoxyribose moiety, with the oligonucleotide adaptor attached via the nucleotide base.
  • the adaptor, or oligonucleotide 5'-triphosphate adaptor may be single stranded or double stranded.
  • the double stranded adaptor has at least one overhanging single stranded region, and may have two or three overhanging single stranded regions. At least one of the overhangs serves to act to hybridise to the end of the single stranded fragments to which the adaptor is to be attached, and acts as a site which can undergo polymerase extension to make the attached single stranded fragments double stranded.
  • the adaptors can be 'forked' adaptors having regions which are non-complementary as well as regions which are complementary.
  • the adaptor may take the form of a hairpin.
  • a hairpin is a nucleic acid sequence containing both a region of single stranded sequence (a loop region) and regions of self-complementary sequence such that an intra-molecular duplex can be formed under hybridising conditions (a stem region).
  • the stem may also have a single stranded overhang.
  • the overhang may contain a degenerate sequence or may be a region of known sequence.
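The stem and loop definition of a hairpin given above can be expressed as a simple sequence check. This is a minimal sketch under stated assumptions: the function names and the example sequence are hypothetical, the stem is assumed to sit at the two extreme ends of the oligonucleotide with no single stranded overhang, and only perfect Watson-Crick pairing is considered.

```python
# Hypothetical check that the two ends of an oligonucleotide are
# self-complementary and could therefore form the stem of a hairpin.

def reverse_complement(seq):
    pairing = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairing[base] for base in reversed(seq.upper()))


def can_form_stem(seq, stem_length):
    """True if the first stem_length bases pair with the last stem_length bases."""
    seq = seq.upper()
    if stem_length <= 0 or 2 * stem_length >= len(seq):
        return False  # no room left for a single stranded loop
    five_prime_arm = seq[:stem_length]
    three_prime_arm = seq[-stem_length:]
    return five_prime_arm == reverse_complement(three_prime_arm)


# Toy hairpin: 5 base stem arms (GGACC / GGTCC) around a TTTT loop.
example = "GGACC" + "TTTT" + "GGTCC"
is_hairpin = can_form_stem(example, 5)
```

A real adaptor design would also need to consider the overhang, the cleavage site and thermodynamic stability, none of which is modelled here.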
  • Also disclosed herein is a method comprising attaching a single stranded adaptor to one end of a library of single stranded nucleic acid fragments, amplifying a subset of the mono-adapted fragments using a primer mixture containing a plurality of locus specific primers and a primer complementary to the adaptor.
  • Disclosed herein is a method of joining a first single stranded oligonucleotide and a single stranded oligonucleotide adaptor using a template independent nucleic acid polymerase enzyme and selectively amplifying a subset of the joined products using a primer mixture containing a plurality of locus specific primers and a primer complementary to the adaptor.
  • Also disclosed herein is a method comprising attaching a hairpin to one end of a library of single stranded nucleic acid fragments, amplifying a subset of the mono-adapted fragments using a primer mixture containing a plurality of locus specific primers and a primer complementary to a portion of the hairpin.
  • oligonucleotide adaptor takes the form of a hairpin having a single stranded region and a region of self-complementary double stranded sequence capable of forming a duplex under hybridising conditions.
  • the hairpin adaptor has an extendable 3' end.
  • extension of the 3' end allows copying of the adapted fragments.
  • the method may include a step of using a nucleic acid polymerase to extend the 3'-end of the oligonucleotide adaptor to produce a copy of the fragments having the hairpin adaptor attached thereto.
  • the first single stranded oligonucleotides are fragments derived from a nucleic acid sample.
  • the fragments can be obtained using chemical or enzymatic cleavage of the sample.
  • the fragments can be obtained using bisulfite treatment.
  • the fragments can be obtained as products of the cross-link reversal chemistry employed during DNA extraction from FFPE fixed tissue.
  • the fragments can be obtained as products of aged and heavily degraded samples.
  • the locus and extension may give rise to fragments having, say, 100-200 bases of unknown sequence. Attachment of the adaptor, followed by extension, gives rise to products having, say, 100-200 base pairs of double stranded sequence, linked at one end by a loop of single stranded sequence from the hairpin adaptor.
  • the extension should give rise to a blunt-ended product, including the complement of the locus primer and any universal region attached thereto.
  • a further adaptor can be attached to the end of the extended copy.
  • the universal region of the locus primers, and/or the further adaptor results in products having known regions at either end and copies of the nucleic acid sample fragments centrally between the known ends.
  • the copied fragments can therefore be amplified using primers complementary to the two adaptors/universal regions or copies thereof.
  • a sequencing step can be carried out on the amplified mixture.
  • each member of the population contains a common universal sequence and one of a plurality of locus-specific regions wherein each locus specific region contains only the nucleic acid bases A, G and T such that the locus is complementary to copies of a sample treated with bisulfite.
  • Bisulfite treatment results in a sample having very few residual C bases. Copies of the fragments have very few G bases, and so the primers can be free of the C nucleotide.
  • the primer can contain C bases in the universal region, but the primer regions which are locus specific and which vary between different members of the population can be C-free (i.e. contain only A, G and T).
  • the primers can be designed according to any one or more of the following criteria: approximately 50 bases in length, for example 40-60 bases; melting temperature approximately 55°C, for example 50-60°C; no CpG dinucleotides in the 3'-most 20 bases at the head of the loci primer; 3 or fewer CpG dinucleotides in the 5'-most 30 bases at the tail of the loci primer.
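The design criteria above lend themselves to a per-criterion filter. The sketch below is illustrative only and makes two assumptions that are not stated in the source: the CpG counts are taken on the unconverted target sequence to which the primer corresponds, and the melting temperature is supplied by the caller (no particular melting temperature formula is implied). The function names and dictionary keys are hypothetical.

```python
# Hypothetical screen of one candidate locus primer against the criteria
# listed above. Each criterion is reported separately because the text
# says "any one or more" of them may be applied.

def count_cpg(seq):
    """Count CpG dinucleotides (a C immediately followed by a G, read 5' to 3')."""
    return seq.upper().count("CG")


def check_design_criteria(target_seq, melting_temp_c):
    """Return a dict mapping each criterion to True or False.

    target_seq     -- unconverted target sequence, written 5' to 3' (assumption)
    melting_temp_c -- melting temperature in degrees C, computed elsewhere
    """
    seq = target_seq.upper()
    head = seq[-20:]  # 3'-most 20 bases
    tail = seq[:30]   # 5'-most 30 bases
    return {
        "length_40_to_60": 40 <= len(seq) <= 60,
        "tm_50_to_60": 50 <= melting_temp_c <= 60,
        "no_cpg_in_head": count_cpg(head) == 0,
        "at_most_3_cpg_in_tail": count_cpg(tail) <= 3,
    }


# Toy 50-mer: one CpG in the 30 base tail, none in the 20 base head.
candidate = "ACG" + "A" * 27 + "AT" * 10
report = check_design_criteria(candidate, melting_temp_c=55.0)
passes_all = all(report.values())
```

A CpG that straddles the boundary between the tail and head windows is not counted by either window in this sketch; a production filter would need to decide how to treat that case.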
  • kits for use in selecting fragments from a nucleic acid sample comprising a plurality of locus specific primers, a hairpin adaptor and a single primer complementary to a portion of the hairpin adaptor or a copy thereof.
  • the hairpin polynucleotide may have a triphosphate moiety at the 5'-end.
  • the kit may further include a terminal transferase.
  • Other components, including instructions, can be added to the kit as described herein.
  • the kit may contain the plurality of locus specific primers having only A, G and T bases, as described above.
  • the kit may be suitable for labelling both DNA and RNA fragments.
  • the kit may contain two template independent polymerases.
  • the kit may contain both terminal transferase and polyadenylate polymerase (PAP) or polyU polymerase (PUP).
  • the method herein describes a number of features different to prior art methods of nucleic acid selection. These include:
  • the nucleic acid sample is either fully or partially single stranded or is fragmented as a first part of the process.
  • the sample may be the native biological sample (for example raw genomic DNA, RNA, micro-RNA).
  • the sample has not undergone any amplification or adaptor attachment steps prior to fragmentation, so the potential for selection or amplification bias is reduced.
  • the nucleic acid molecules or sample fragments are single stranded, and undergo a step of attaching an adaptor selectively at one end to produce mono-adapted fragments.
  • Such mono-adapted fragments have one end of fixed, known sequence from the adaptor, and one variable, unknown end from the sample mixture.
  • Bisulfite treatment of a sample which already contains the adaptors means that the majority of the samples which contain adaptors at both ends are lost. The majority of the sample fragments will not contain two known ends, and therefore cannot be subsequently amplified. It is therefore advantageous that the adaptors are attached after the fragmentation step.
  • the locus specific primers are hybridised directly to the mono-adapted fragmented sample. There are no amplification steps such as whole genome amplification prior to selection. There is therefore no possibility of amplification bias, or mis-priming to portions of any amplification primers, as no amplification primers are present prior to the point the primers are hybridised.
  • sample can be amplified prior to or as part of the selection in order to obtain sufficient material to select.
  • a method comprising attaching a single stranded adaptor to one end of a library of single stranded nucleic acid fragments, amplifying a subset of the mono-adapted fragments using a primer mixture containing a plurality of locus specific primers and a primer complementary to the adaptor.
  • a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
  • amplification step prior to the hybridisation of the locus primers includes a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
  • each extension product includes a copy of the first adaptor
  • a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising the steps of; a. taking a population of single stranded nucleic acid molecules;
  • each extension product includes a copy of the adaptor; e. hybridising a single primer to the copy of the adaptor and extending each of the extension products using the single primer, where steps d and e occur simultaneously or sequentially; and
  • a method comprising attaching a hairpin to one end of a library of single stranded nucleic acid fragments, amplifying a subset of the mono-adapted fragments using a primer mixture containing a plurality of locus specific primers and a primer complementary to a portion of the hairpin.
  • the method may include the steps of
  • a. fragmenting the sample to produce a population of single stranded sample fragments; b. attaching an oligonucleotide adaptor to the 3' end of each of the strands in the sample, wherein the adaptor is a hairpin containing a cleavage site;
  • steps a-f or a-i should be carried out in the order shown.
  • the primers are loci specific, meaning that they hybridise to a single location in the nucleic acid sample of interest.
  • the primers can be of different lengths in order to normalise melting temperatures.
  • the loci specific portion can be a length to ensure accurate hybridisation to a single location in the sample.
  • the loci specific portion can be for example 30-60 bases in length.
  • the adaptor region can be a further 10-40 bases in length.
  • the primers can be between 40-100 bases in length.
  • the loci region can be around 50 bases in length, and the adaptor can be for example 20 bases in length
  • Fragmentation of the sample can be caused by bisulfite treatment.
  • Bisulfite treatment converts non-methylated cytosine bases to uracil bases.
  • the locus specific primers can be complementary to a copy of these fragments (thus have the same sequence as regions of the fragments).
  • the locus specific region can contain only the nucleic acid bases A, G and T such that the locus is complementary to copies of (i.e. the same as region of) a sample treated with bisulfite.
  • the locus can contain all four bases.
  • the population of locus specific primers can contain both primers with A, G and T and primers with A, C, G and T. Primers having both C and A in equivalent positions can be used to identify specific methylation sites if required. Alternatively the methylation sites of interest may be in the extension region adjacent to the 3' end of the primer.
  • the locus specific primers can be chosen bioinformatically to cover regions of interest. For bisulfite treated samples, the primers can be chosen to locate near to CpG islands or potential methylation sites of interest. Extension of the primers can be chosen to read through one or more CpG, CpH, CpA or CpN locations.
  • each primer can consist of A, G and T bases in the locus specific region, and A, C, G and T bases in the universal region.
  • the universal region should be to the 5' side of the locus region such that the 3' end of the primer is hybridised and suitable for extension.
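The primer layout described above (a universal region on the 5' side, followed by a locus region restricted to A, G and T) can be illustrated with a minimal Python sketch. The function names `bisulfite_convert` and `build_primer`, the toy sequences, and the simplification of writing converted uracil as T are assumptions made for illustration only; they are not taken from the disclosure.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Model bisulfite conversion of one strand.

    Unmethylated C is converted (written here as T, which is how the
    resulting U is read after copying). Positions listed in `methylated`
    are protected and stay as C. Hypothetical helper for illustration.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def build_primer(universal, converted_fragment, start, locus_len):
    """Join a universal 5' region to a locus region taken from the fragment.

    The locus region has the same sequence as the converted fragment, so
    that it is complementary to a copy of that fragment. It is checked to
    contain only A, G and T; the universal region may use all four bases.
    """
    locus = converted_fragment[start:start + locus_len]
    if set(locus) - set("AGT"):
        raise ValueError("locus region contains bases other than A, G and T")
    return universal + locus  # 5' universal region, then 3' locus region


fragment = "ACGTCCGATC"                      # toy sample strand
converted = bisulfite_convert(fragment)      # all C assumed unmethylated
primer = build_primer("ACGT", converted, 0, 6)
```

A primer intended to discriminate a protected (methylated) cytosine would need C in the locus region, which this sketch deliberately rejects; relaxing the check would model the mixed primer populations mentioned above.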
  • Each locus can be close to a CpG location in the sample.
  • the phosphate groups can be removed, for example using a phosphatase or a kinase such as polynucleotide kinase (PNK).
  • the extended hairpin, upon extension with NTPs, is the only oligonucleotide with a 3'-hydroxyl moiety.
  • the hairpin can contain a cleavage site, cleavage of which can be used to make the sample single stranded upon denaturation of the cleaved samples.
  • the cleavage site may be, for example, one or more uracil bases.
  • the extended strand of the hairpin is the correct orientation (i.e. adapted at the 5' end) for a locus specific primer to hybridise and extend, the original sample fragments having the remainder of the hairpin at the 3' end are lost from the mixture as they are not amplified.
  • the length of the extension products is generally determined by the length of the fragments and the position of hybridisation. The extension will continue either until the end of the adaptor or hairpin adaptor is reached or a site is reached which does not permit incorporation, either for example because the nucleoside is abasic, or a limited selection of nucleoside triphosphates (less than 4) is used.
  • the length of the extension is not particularly significant, and can be for example on average 10-200 fragment bases per molecule plus the length of the hairpin adaptor.
  • the extension reaction may be carried out using a nucleic acid polymerase.
  • the extension reaction may be carried out using nucleotide triphosphates.
  • the extension can be carried out using four nucleotide triphosphates.
  • the extension process can be carried out via thermocycling such that multiple extension products are derived from each fragment.
  • Alternative isothermal processes of denaturation and re-annealing can also be carried out such that the extended products are removed, and fresh primers are hybridised.
  • the location of the base(s) being analysed is determined by the identity of a specific primer. More than one base can be analysed per extended primer.
  • the extension products can be used to analyse nucleotide changes, for example single nucleotide polymorphisms (SNPs) or methylation status, for example whether C bases have been converted to U upon bisulfite treatment.
  • the multiple base extensions can give information in relation to deletions or insertions of one or more bases.
  • loci specific or locus specific means that the primer hybridises selectively to a single location, or locus, in the nucleic acid sample.
  • the method can be carried out using a large number of different primers which can be pooled prior to hybridisation with the sample. There is no particular limit to the number of loci analysed. The method can work robustly with millions of loci targeting probes.
  • the population of locus specific primers can be at least 50 different sequences.
  • the population of locus specific primers can be at least 100 different sequences.
  • the population of locus specific primers can be at least 1000 different sequences.
  • the population of locus specific primers can be at least 10,000 different sequences.
  • the method can be carried out such that at least 100,000 locations (primers) are analysed per sample.
  • the method can be carried out such that at least 200,000 locations (primers) are analysed per sample.
  • the method can be carried out such that at least 300,000 locations (primers) are analysed per sample.
  • the method can be carried out such that at least 400,000 locations (primers) are analysed per sample.
  • the population of locus specific primers can be at least 1,000,000 different sequences.
  • the location(s) in the sample to be identified is/are in the vicinity of a unique primer.
  • the base(s) to be interrogated should be at the 3' side of the primer such that nucleotides can be incorporated complementary to the base(s) being analysed.
  • the base(s) to be interrogated may be immediately 3' of the primer such that the first incorporation is being studied, or may be within 2-30 bases of the end of the primer.
  • the interrogated bases can be in different locations for different primers.
  • the primers can be of different lengths in order to normalise melting temperatures.
  • the primers can be between 15-100 bases in length. Primers having higher levels of A and T bases can be longer than primers having higher levels of C and G bases.
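The idea that A/T-rich primers can be longer than G/C-rich primers in order to normalise melting temperatures can be sketched as follows. This is only an illustration: the Wallace rule used here (2 °C per A/T, 4 °C per G/C) is a rough textbook approximation intended for short oligonucleotides, and the function names, the 60 °C target and the toy regions are assumptions rather than values from the disclosure.

```python
def wallace_tm(seq):
    """Rough melting temperature estimate in degrees C (Wallace rule)."""
    at = sum(seq.count(base) for base in "AT")
    gc = sum(seq.count(base) for base in "GC")
    return 2 * at + 4 * gc


def pick_primer(region, target_tm, min_len=15, max_len=100):
    """Return the shortest primer ending at the 3' end of `region`
    whose estimated Tm reaches `target_tm`.

    The 3' end is held fixed and the primer grows towards the 5' side.
    If the target is never reached, the longest allowed primer is returned.
    """
    upper = min(max_len, len(region))
    for length in range(min_len, upper + 1):
        candidate = region[-length:]
        if wallace_tm(candidate) >= target_tm:
            return candidate
    return region[-upper:]


at_rich = "AT" * 30   # toy A/T-rich target region
gc_rich = "GC" * 30   # toy G/C-rich target region
long_primer = pick_primer(at_rich, target_tm=60)
short_primer = pick_primer(gc_rich, target_tm=60)
```

In practice a nearest-neighbour thermodynamic model would normally replace the Wallace rule, particularly for primers in the 40-100 base range discussed here.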
  • the primers should be specifically hybridised at the temperature required for the polymerase extension.
  • the primers can be extended using a suitable nucleic acid polymerase.
  • the polymerase may be a DNA polymerase.
  • the polymerase may be active at room temperature, or may be a thermophilic polymerase.
  • the temperature of the extension reaction can be chosen based on the desired specificity of the primer hybridisation reactions and the length of the primer sequences.
  • the temperature of the extension reaction can be for example between 30-72 °C.
  • the temperature of the extension reaction can be for example between 50-72 °C.
  • the nucleic acid samples are prepared as single stranded, which are then hybridised with the primers.
  • the sample can contain a single adaptor at the 5' end prior to primer hybridisation.
  • the hybridisation can be carried out by heating a population of double stranded fragments, thus melting them to be single strands, and allowing the mixture to cool.
  • the sample can be prepared as a single stranded sample without heat denaturing.
  • the nucleic acid fragments in the sample will be single stranded.
  • the fragments can be made single stranded by other chemical treatments, for example exposure to hydroxide.
  • the double stranded fragments may be made single stranded by heating or chemical treatment, either in the presence of the locus specific primers or prior to addition of the primers.
  • universal adaptors may be attached to the ends.
  • the universal adaptors allow amplification using a single pair of primers complementary to the adaptor.
  • Many methods exist for the preparation of samples of double-stranded DNA, for example for sequencing (e.g. Illumina TruSeq and Nextera, 454, NEBNext, Life Technologies etc.).
  • the sample may be processed in double stranded form or the sample may be treated (for example using heat denaturation) to give rise to a single stranded form.
  • the adaptor, or oligonucleotide 5'-triphosphate adaptor, may be single stranded or double stranded.
  • the double stranded adaptor has at least one overhanging single stranded region, and may have two or three overhanging single stranded regions.
  • the overhang serves to act to hybridise to the end of the extended locus primers to which the adaptor is to be attached, and acts as a site which can undergo polymerase extension to make the attached single stranded extended locus primers double stranded.
  • the adaptors can be 'forked' adaptors having regions which are non-complementary as well as regions which are complementary.
  • the attachment of the adaptor can be carried out using a template dependent polymerase. Any polymerase suitable for the incorporation of a nucleotide triphosphate can be used.
  • the adaptor can be thought of as a nucleotide triphosphate attached to an oligonucleotide duplex. Thus the adaptor carries its own template.
  • the adaptor or oligonucleotide 5'-triphosphate adaptor may have a region of self-complementarity such that the second oligonucleotide may take the form of a hairpin.
  • the hairpin may have a 3'-overhang suitable for polymerase extension.
  • the term single stranded therefore includes a single strand which is in part single stranded, and in part double stranded at certain temperatures, but which can be made single stranded by increasing the temperature.
  • the adaptor may have one or more regions for indexing such that different oligonucleotides can be attached to different samples, thereby allowing sample pooling.
  • the adaptor may have one or more modifications which allow site specific strand cleavage.
  • the adaptor may have one or more uracil bases, thereby allowing site specific cleavage using enzyme treatment.
  • the locus specific primers or the adaptor may be attached to a solid surface, or may contain a modification allowing for subsequent immobilisation or capture.
  • the adaptor attachment may be carried out on a solid support, or the joined products may be captured onto a surface after joining.
  • the locus specific primers or the adaptors may carry a moiety for surface capture, for example a biotin moiety.
  • the attachment may be covalent.
  • the primers may be immobilised on a solid support, and used to capture the single stranded mono-adapted oligonucleotide fragments.
  • the adaptors may be immobilised on a solid support, and used to capture the fragments as the first stage of the process.
  • the locus specific primers or the adaptor may be DNA, RNA or a mixture thereof. Where the adaptor contains two strands, one strand may be DNA and one strand may be RNA.
  • An aspect of the invention described herein provides a method of joining two oligonucleotides using a template independent nucleic acid polymerase enzyme such as terminal deoxynucleotidyl transferase (TdT).
  • Terminal deoxynucleotidyl transferase also known as DNA nucleotidylexotransferase (DNTT) or terminal transferase
  • TdT typically adds nucleotide 5'-triphosphates onto the 3'-hydroxyl of a single stranded first oligonucleotide sequence.
  • the invention as described herein uses a second oligonucleotide carrying a 5'-triphosphate which can be attached to the first oligonucleotide sequence, thus enabling two single stranded oligonucleotides to be joined together, catalysed by TdT.
  • the enzyme can be used to link two oligonucleotide strands, rather than simply adding individual nucleotides.
  • Polyadenylate polymerase is an enzyme involved in the formation of the polyadenylate tail of the 3' end of mRNA.
  • PAP uses adenosine triphosphate (ATP) to add adenosine nucleotides to the 3' end of an RNA strand.
  • the enzyme works in a template independent manner.
  • a further aspect of the invention involves the use of PAP to join two oligonucleotides. In the use of PAP, one or more of the oligonucleotides may be RNA rather than DNA.
  • poly(U) polymerase catalyzes the template independent addition of UMP from UTP or AMP from ATP to the 3' end of RNA.
  • template independent nucleic acid polymerase enzyme includes any polymerase which acts without requiring a nucleic acid template.
  • template independent nucleic acid polymerase enzyme includes terminal deoxynucleotidyl transferase (TdT), Polyadenylate polymerase (PAP) and poly(U) polymerase (PUP).
  • the template independent nucleic acid polymerase enzyme can be PAP.
  • the template independent nucleic acid polymerase enzyme can be terminal transferase.
  • the oligonucleotides joined can be RNA or DNA, or a combination of both RNA and DNA.
  • the oligonucleotides can contain one or more modified backbone residues, modified sugar residues or modified nucleotide bases.
  • Template independent nucleic acid polymerase enzymes may be sensitive to the bulk of substituents attached to the ribose 3'-position.
  • the standard substrates for these enzymes are nucleotide triphosphates in which the ribose 3'-position is a hydroxyl group.
  • the enzyme may be engineered using suitable amino acid substitutions to accommodate any increase in steric bulk.
  • the term template independent nucleic acid polymerase enzyme therefore includes non-naturally occurring (engineered) enzymes.
  • template independent nucleic acid polymerase enzyme includes modified versions of terminal transferase or PAP. Terminal transferase, PUP or PAP may be obtained from commercial sources (e.g. New England Biolabs).
  • the method described herein adds a single stranded oligonucleotide with a 3'-hydroxyl to a single stranded oligonucleotide with a 5'-triphosphate moiety.
  • the triphosphate moiety can be attached directly to the 5'-hydroxyl of the second oligonucleotide.
  • the oligonucleotide 5'-triphosphate can react directly with the 3'-hydroxyl group of the first oligonucleotide to form a single stranded oligonucleotide containing the first and second sequences linked together via a standard 'natural' phosphodiester moiety.
  • Such an oligonucleotide can be copied using a polymerase as there are no unnatural linking groups between the first and second oligonucleotides.
  • the use of engineered template independent polymerase enzymes may increase the tolerance for steric bulk at the 3'-position of the triphosphate nucleotide, and hence allow the use of oligonucleotide strands attached directly to the 3'-hydroxyl of a nucleotide triphosphate.
  • the triphosphate can be attached through a linker moiety.
  • Linker moieties can be any functionality attached to the terminal 5'-hydroxyl of the oligonucleotide strand.
  • the linker moiety can include one or more phosphate groups.
  • the linker may contain a ribose or deoxyribose moiety.
  • the linker may contain one or more further nucleotides.
  • the nucleotides, or the ribose or deoxyribose moieties may be further substituted.
  • the linker may contain a ribose or deoxyribose moiety in which the oligonucleotide is attached to the 2'-position of the ribose.
  • the linker may contain a nucleotide in which the remainder of the oligonucleotide is attached via the nucleotide base.
  • the linker may employ one or more carbon, oxygen, nitrogen or phosphorus atoms.
  • the linker acts merely to attach the functional triphosphate moiety to the remainder of the oligonucleotide.
  • the joined oligonucleotides may be copied using a nucleic acid polymerase.
  • the linker should permit a nucleotide polymerase to bridge through the linker in order to copy the strands after joining.
  • the action of the polymerase may be enhanced by using a hybridised primer which can bridge across the linker region.
  • the primer can be designed with a suitable length of sequence to space across the linker region.
  • the sequence can be degenerate/random or simply be a suitable length of known sequence in order to bridge across any gap caused by the linker region.
  • the length of sequence used to bridge the gap can be designed depending on the choice of linker.
  • the sequence can be used as a tag for individual fragments.
  • the tag can be used to assess the level of bias introduced by any amplification reactions. If the tags are, say, 6-mers of random sequence, there are 4^6 (4096) different tag variants. From a population of fragments from a biological sample, it is highly unlikely that two fragments of the same 'biological' sequence will be joined to a tag with the same 'tag' sequence. Therefore any cases where a particular fragment and tag combination is over-represented in the sequencing reaction occur because that individual fragment was over-amplified during the PCR reaction when compared to other fragments in the population. Thus 'tags' of variable sequence can be used to help normalise the effects of amplification variability.
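The arithmetic behind the tag argument, and the way tags can be used to collapse amplification duplicates, can be sketched in Python. The function names, the uniform-random tag assumption and the toy reads are illustrative assumptions; the disclosure does not specify a particular analysis procedure.

```python
from collections import Counter


def tag_space(length, alphabet_size=4):
    """Number of distinct random tags of the given length (4**6 == 4096)."""
    return alphabet_size ** length


def collision_probability(n_copies, length):
    """Probability that at least two of `n_copies` fragments with the same
    biological sequence receive the same tag, assuming tags are drawn
    uniformly at random (an idealisation)."""
    space = tag_space(length)
    p_all_unique = 1.0
    for i in range(n_copies):
        p_all_unique *= (space - i) / space
    return 1.0 - p_all_unique


def collapse_duplicates(reads):
    """Count original molecules per fragment sequence.

    `reads` is an iterable of (fragment_sequence, tag) pairs. Reads sharing
    both fragment and tag are treated as PCR copies of one molecule.
    """
    unique_molecules = set(reads)
    return Counter(fragment for fragment, _tag in unique_molecules)


reads = [
    ("ACGT", "AAAAAA"),
    ("ACGT", "AAAAAA"),  # same fragment and tag: counted once
    ("ACGT", "CCCCCC"),  # same fragment, different tag: a second molecule
    ("GGTT", "AAAAAA"),
]
molecule_counts = collapse_duplicates(reads)
```

With 6-mer tags the chance that two identical fragments share a tag is 1/4096, which is why an over-represented fragment and tag pair is attributed to over-amplification rather than to coincidence.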
  • the tags can also be used to help identify sequences from different sources. If adaptors are used with different sequences for different sources of biological materials, then the different sources can be pooled but still identified via the tag when the tags are sequenced. Thus the disclosure herein includes the use of two or more different populations of adaptors for the multiplexing of the analysis of different samples. Disclosed herein therefore are kits containing two or more adaptors of different sequence.
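Pooling samples and separating them again by adaptor sequence can be sketched as a simple lookup. The sketch below assumes, purely for illustration, that the index occupies the first few bases of each read and that the index sequences and sample names are as shown; none of these specifics come from the disclosure.

```python
def demultiplex(reads, index_to_sample, index_len):
    """Assign each read to a sample using the index at the start of the read.

    Reads whose index is not recognised are collected under "unassigned".
    The index is trimmed from the returned insert sequence.
    """
    bins = {sample: [] for sample in index_to_sample.values()}
    bins["unassigned"] = []
    for read in reads:
        index, insert = read[:index_len], read[index_len:]
        sample = index_to_sample.get(index, "unassigned")
        bins[sample].append(insert)
    return bins


# Hypothetical index sequences and sample names.
index_to_sample = {"AAGG": "sample_1", "TTCC": "sample_2"}
pooled_reads = ["AAGGACGTACGT", "TTCCGGGTTT", "AAGGTTTT", "CCCCAAAA"]
by_sample = demultiplex(pooled_reads, index_to_sample, index_len=4)
```

A production pipeline would usually tolerate sequencing errors in the index (for example by allowing one mismatch), which this exact-match sketch does not attempt.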
  • the oligonucleotide with the 5'-triphosphate may be blocked at the 3' end to prevent self joining.
  • the blocking moiety may be a phosphate group or a similar moiety.
  • the 3' end may be a dideoxy nucleotide with no 3'-OH group.
  • any blocking group can be removed.
  • the methods can include the step of treatment with a suitable kinase to remove a 3'-phosphate moiety.
  • the kinases may be PNK or any suitable kinase.
  • the oligonucleotide with the 5'-triphosphate may be produced chemically or enzymatically.
  • a suitable nucleotide 5'-triphosphate may be chemically coupled to a suitable oligonucleotide using suitable chemical couplings.
  • the nucleotide triphosphate may contain an azido (N3) group and the oligonucleotide may contain an alkyne group such as DBCO.
  • a suitable oligonucleotide monophosphate may be turned into a triphosphate either chemically or enzymatically.
  • the sequence of the 5'-triphosphate adaptor oligonucleotide depends on the specific application and suitable adaptor oligonucleotides may be designed using known techniques.
  • a suitable adaptor oligonucleotide may, for example, consist of 20 to 100 nucleotides.
  • the sequence of the adaptor may be selected to be complementary to a suitable amplification/extension primer.
  • the adaptor may contain a region of monobase sequence such as for example poly A or poly T.
  • the adaptor, or oligonucleotide 5'-triphosphate adaptor, may be single stranded or double stranded or a combination thereof.
  • the double stranded adaptor has at least one overhanging single stranded region, and may have two or three overhanging single stranded regions.
  • the overhang serves to act to hybridise to the end of the extended locus primers to which the adaptor is to be attached, and acts as a site which can undergo polymerase extension to make the attached single stranded extended locus primers double stranded.
  • the adaptors can be 'forked' adaptors having regions which are non-complementary as well as regions which are complementary.
  • Attachment of the 5'-triphosphate oligonucleotide may give rise to a join which is not a natural phosphodiester linkage. Such joins may not be substrates for nucleic acid polymerases.
  • the use of 3'-overhangs, either as hairpins or double stranded adaptors, is advantageous as the linking region can be 'bridged' using an oligonucleotide primer sequence which is internal or part of the adaptor.
  • Hybridisation of a primer suitable for extension would also require such an internal spacer, and this lowers the affinity and specificity of the primer hybridisation, whereas no such issues arise where the adaptor has an 'internal' primer which is already hybridised (or in the case of hairpins integral).
  • the attachment of a single 'hairpin' which can be used as both the known end and the extendable primer when preparing a library is therefore advantageous over the attachment of a single known end followed by the hybridisation of a second primer.
  • the pre-formed, or intramolecular hybridisation spans the unnatural join, and allows efficient extension.
  • the oligonucleotide adaptor, or oligonucleotide 5'-triphosphate adaptor, may be single stranded in portions, and have a region of self-complementarity such that the second oligonucleotide may take the form of a hairpin.
  • the adaptor may take the form of a hairpin having a single stranded region and a region of self-complementary double stranded sequence.
  • the hairpin may have a 3'-overhang suitable for polymerase extension. The overhang may stretch across the triphosphate 'linker' region at the 5' end, thus avoiding any issues relating to the presence of the 5'-modification required for TdT incorporation.
  • the self-complementary double stranded portion may be from 5-20 base pairs in length.
  • the overhang may be from 1-10 bases in length.
  • the overhang may contain one or more degenerate bases.
  • the sequence may contain a mixture of bases A, C and T at each position (symbolised as H (not G)). H may be used in cases where the sample is bisulfite treated, and thus does not contain any C bases to which the G would be complementary.
  • the overhang may consist of 1-10 H bases.
  • the overhang may be 2-8 bases, which may be H.
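The degenerate H overhang (A, C or T at each position, never G) can be made concrete by enumerating the sequences it represents. The helper name `expand_h_overhang` is hypothetical, and the enumeration is only a counting aid: a synthesised degenerate overhang is a mixture of these sequences rather than a list.

```python
from itertools import product

# IUPAC code H means "not G": any of A, C or T.
IUPAC_H = "ACT"


def expand_h_overhang(length):
    """List every concrete sequence represented by an overhang of
    `length` H positions (3**length sequences in total)."""
    return ["".join(bases) for bases in product(IUPAC_H, repeat=length)]


two_base_overhangs = expand_h_overhang(2)
n_eight_base = 3 ** 8  # size of the mixture for an 8-base H overhang
```

For the 2-8 base range mentioned above this gives between 9 and 6561 distinct overhang sequences, none of which contains G.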
  • the overhang may have a 3'-phosphate.
  • the overhang may have a 3'-OH.
  • the overhang could be a known, standard sequence.
  • the oligonucleotide adaptor may have one or more regions for indexing such that different oligonucleotides can be attached to different samples, thereby allowing sample pooling. Alternatively the region for indexing may be located as part of the locus primer oligonucleotides.
  • the adaptor oligonucleotide may have one or more modifications which allow site specific strand cleavage.
  • the adaptor oligonucleotide may have one or more uracil bases, thereby allowing site specific cleavage using enzyme treatment with UDG and an endonuclease.
  • the term adaptor fragment is used to refer to the portion of the adaptor remaining after the cleavage step.
  • the fragment remaining may be only one base shorter than the original adaptor.
  • the cleavage site may be more central such that a portion of the adaptor remains on the original fragment strand after cleavage, as well as a portion of the adaptor being attached to the desired copy of the fragment.
  • Copies of fragments may be produced by extending the 3'-end of the attached double stranded adaptor or hairpin.
  • the extension of the hairpin produces an extended hairpin.
  • the extended hairpin can also be described as a double stranded nucleic acid having one end joined. Upon denaturation, the extended hairpin becomes a single stranded molecule, but the length of the double stranded portion (for example at least 100 base pairs) means that the sample rapidly hybridises to form the extended hairpin.
  • Alternative methods of attaching adaptors include the use of ligases.
  • the sample may be left double stranded.
  • the sample may be treated to give an overhanging base complementary to an overhang on an adaptor.
  • the blunt ended duplex may be treated to add, for example, a single 3' nucleotide which can then act as an end complementary to an adaptor.
  • the ligation may be blunt ended or cohesive.
  • the method needs to be chosen to avoid the formation of adaptor-adaptor ligation.
  • the ends of the strands are phosphates, and may thus be amenable to adaptors having both 3' and 5' phosphates.
  • a hairpin adaptor having a 3' phosphate and a 5' phosphate could ligate to the 3'-hydroxyl of a double stranded nucleic acid, but not the complementary 5' phosphate.
  • the adaptor, having no free hydroxyl groups could not ligate to itself.
  • the desired product can be formed where the extended strand is attached to the adaptor, but the strand derived from the biological sample (having a 5'-phosphate) remains unligated.
  • the 3' end of the hairpin may require deblocking. If the hairpin contains, for example, a 3' phosphate, the 3' phosphate can be removed (for example using a kinase such as PNK) and the released 3'-OH can be used in an extension to copy the fragment.
  • a double stranded product linked at one end via the adaptor hairpin is produced.
  • the method may be used in order to prepare samples for nucleic acid sequencing.
  • the method may be used to sequence a population of synthetic oligonucleotides, for example for the purposes of quality control.
  • the first oligonucleotides may come from a population of nucleic acid molecules from a biological sample.
  • the population may be fragments of between 100-10000 nucleotides in length.
  • the fragments may be 200-1000 nucleotides in length.
  • the fragments may be of random variable sequence.
  • the order of bases in the sequence may be known, unknown, or partly known.
  • the fragments may come from treating a biological sample to obtain fragments of shorter length than exist in the naturally occurring sample.
  • the fragments may come from a random cleavage of longer strands.
  • the fragments may be derived from treating a nucleic acid sample with a chemical reagent (for example sodium bisulfite, acid or alkali, chemical denaturants such as formamide and urea) or enzyme (for example with a restriction endonuclease or other nuclease).
  • Methods of the invention may be useful in preparing a targeted population of nucleic acid strands for sequencing, for example a population of bisulfite-treated single-stranded nucleic acid fragments.
  • Bisulfite treatment produces single-stranded nucleic acid fragments, typically of about 250-1000 nucleotides in length.
  • the sample may be fragmented using treatment with bisulfite by incubation with bisulfite ions (HSO₃⁻) or metabisulfite ions (S₂O₅²⁻).
  • the use of bisulfite ions or metabisulfite ions to convert unmethylated cytosines in nucleic acids into uracil is standard in the art and suitable reagents and conditions are well known.
  • a population of DNA strands having one or more abasic sites may be produced, and thereby the sample fragmented, by subjecting a population of nucleic acid molecules to acid hydrolysis.
  • the population may be subjected to acid hydrolysis by incubation at an acidic pH (for example, pH 5) and elevated temperature (for example, greater than 70 °C).
  • a proportion of the purine bases in the nucleic acid strands will be lost, to generate abasic sites.
  • the number of abasic sites formed depends on the pH, concentration of buffer, temperature and length of incubation.
  • a population of DNA strands having one or more abasic sites may be produced by treating the population of nucleic acid strands with uracil-DNA glycosylase (UDG).
  • the population may be treated with UDG by incubation with UDG at 37°C.
  • UDG excises uracil residues in the nucleic acid strands leaving abasic sites.
  • UDG may be obtained from commercial sources.
  • any sample containing uracil bases may be cleaved using uracil-DNA glycosylase (UDG).
  • UDG may be obtained from commercial sources. Abasic sites can be cleaved using an endonuclease mixture. Mixtures of enzymes suitable for cleaving nucleic acid strands containing uracil bases are commercially available and standard in the art.
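The combined effect of UDG treatment followed by endonuclease cleavage of the resulting abasic sites can be modelled very crudely as splitting a strand at each uracil. The function name and the toy sequence are assumptions for illustration; the model ignores the end-group chemistry left at each cut and assumes every uracil is excised and every abasic site is cleaved.

```python
def cleave_at_uracil(strand):
    """Return the fragments left after removing each U and cutting there.

    Simplified model: each uracil is excised (UDG) and the backbone is cut
    at the abasic site (endonuclease), so the U itself is absent from the
    products. Empty pieces from adjacent or terminal uracils are dropped.
    """
    return [piece for piece in strand.split("U") if piece]


fragments = cleave_at_uracil("ACGUTTGUA")
```

The same model applies to an adaptor carrying a single uracil cleavage site, where the cut separates the adaptor into two parts.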
  • the population of nucleic acid molecules may be a sample of DNA or RNA, for example a genomic DNA sample.
  • Suitable DNA and RNA samples may be obtained or isolated from a sample of cells, for example, mammalian cells such as human cells or tissue samples, such as biopsies.
  • the sample may be obtained from a formalin fixed paraffin embedded (FFPE) tissue sample.
  • Suitable cells include somatic and germ-line cells.
  • the targeting methods described herein are particularly advantageous where the amount of sample is limited.
  • the sample may be an ancient nucleic acid sample.
  • the sample may be an isolate from a cell free nucleic acid such as circulating cell free DNA.
  • the sample may be a maternal sample in which the presence of fetal nucleic acids is detected.
  • the sample may be analysed to detect the presence of circulating tumour cells.
  • the sample may be derived from blood or other biological source such as cerebrospinal fluids.
  • the method may be advantageous in the simultaneous analysis of both DNA molecules and RNA molecules from a sample.
  • the addition of the adaptor to a sample of single stranded DNA means that adaptors can be added to both DNA and RNA whilst the two species remain in the same sample.
  • An adaptor can be added to the end of both RNA and DNA in a mix of the two nucleic acids (for example in cell extract where neither the RNA nor the DNA has been purified away from its partner).
  • Such a method allows mapping of the epigenome alongside the transcriptome, and having a method that allows us to do this in parallel in the same reaction would be advantageous.
  • the adaptors can be of partially different sequence such that the identity of the molecule can be determined as being DNA or RNA.
  • for example, an RNA adaptor is labelled with index Y and a DNA adaptor with index Z, and both are added in a mix with TdT and PAP
  • the DNA adaptor with index Z specifically adds to the DNA fragments
  • the RNA adaptor with index Y adds specifically to the RNA fragments at the same time in a single reaction.
  • the method of the invention allows the adaptation and selection of both DNA and RNA in a single composition.
  • the population may be a diverse population of nucleic acid molecules, for example a library, such as a whole genome library or a loci specific library.
  • Methods of the invention may be useful in producing populations of mono-adapted single stranded nucleic acid fragments, i.e. nucleic acid strands having an adaptor oligonucleotide attached to their 3' termini.
  • populations of 3' adapted single stranded nucleic acid fragments may be used directly for sequencing and/or amplification.
  • the sequence of the adaptor oligonucleotide may be entirely known, or may include a variable region.
  • the sequence may include a universal sequence such that each joined sequence has a common 'adaptor' sequence attached to one end.
  • the attachment of an adaptor or a fragment thereof to one end of a pool of fragments of variable sequence means that copies of the variable sequences can be produced using a single 'extension' primer.
  • the targeted fragments may be analysed by hybridising the fragments onto a solid support carrying an array of primers complementary to the adaptor oligonucleotide sequence.
  • the joined fragments may have an adaptor modification at the 3'- end which allows attachment to a solid support.
  • the methods disclosed may further include the step of producing one or more copies of the locus primer extended oligonucleotides.
  • the methods may include producing multiple copies of each of the targeted sequences.
  • the copies may be made by hybridising a primer sequence opposite a universal sequence on the oligonucleotide adaptor fragment sequence, and using a nucleic acid polymerase to synthesise a complementary copy of the first single stranded sequences.
  • the production of the complementary copy provides a double stranded polynucleotide.
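The copying step described above, in which one primer anneals opposite the universal adaptor sequence and is extended along every fragment, can be modelled in a few lines of Python. This is a sketch under stated assumptions: the universal sequence `UNIVERSAL` is an arbitrary placeholder, and `adapt` and `extend` are hypothetical helper names, not part of the disclosed method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Placeholder universal adaptor sequence (assumption, not from the source).
UNIVERSAL = "GGTCTCAAGCTTACG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def adapt(fragment: str) -> str:
    """Model attachment of the universal adaptor to the 3' end of a fragment."""
    return fragment.upper() + UNIVERSAL


def extend(template: str, primer: str) -> str:
    """Anneal the primer to the 3' end of the template and return the full copy (5'->3')."""
    if not template.endswith(reverse_complement(primer)):
        raise ValueError("primer does not anneal to the 3' end of the template")
    return reverse_complement(template)


primer = reverse_complement(UNIVERSAL)  # the single extension primer
templates = [adapt(f) for f in ("ACCGTTAG", "TTGACCA")]
copies = [extend(t, primer) for t in templates]
```

The same primer copies both fragments even though their variable regions differ, which is the point of the common adaptor.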
  • the hairpin can contain a cleavage site (for example a uracil nucleotide). Upon cleavage, the hairpin becomes two strands rather than one, and is no longer a hairpin. Thus the sample becomes an adapted known end (from the adaptor fragment), and unknown region from the sample of interest (the sequence of which can be determined).
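The effect of cleaving a hairpin at a uracil can be illustrated with a simple string model. This is a deliberately simplified sketch: it treats each U as the cut position and discards it, whereas in practice uracil removal leaves an abasic site and strand scission needs a further step. The example sequence and the function name `cleave_at_uracil` are hypothetical.

```python
from typing import List


def cleave_at_uracil(strand: str) -> List[str]:
    """Split a strand at every U (dropping the U) and return the non-empty pieces."""
    return [piece for piece in strand.upper().split("U") if piece]


# Toy single-stranded hairpin product with one uracil in the loop (assumed sequence).
hairpin_product = "AAGGCCTTUTTGGCCTT"
pieces = cleave_at_uracil(hairpin_product)
```

After cleavage the single hairpin strand is represented as two separate strands.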
  • the fragments are selected using the locus specific primers, which can contain a universal sequence. After extension of the locus specific primers, the double stranded polynucleotides can be amplified, either using further copies of the locus primers, or using primers complementary to both known ends.
  • Double stranded polynucleotides may be made circular by attaching the ends together. This may be useful in the generation of circular nucleic acid constructs and plasmids or in the preparation of samples for sequencing using platforms that employ circular templates (e.g. PacBio SMRT sequencing).
  • populations of circularised 3' adapted nucleic acid fragments produced as described herein may be denatured and subjected to rolling circle or whole genome amplification. Amplification of circular fragments can be carried out using primers complementary to two regions of the single adaptor sequence.
  • a second adaptor may be attached to a product after a second extension.
  • the second adaptor may comprise a self-complementary double stranded region (i.e. a hairpin).
  • kits and components for carrying out the invention.
  • a kit for use in preparing a nucleic acid sample comprising a hairpin adaptor polynucleotide having a triphosphate moiety at the 5'-end, a population of locus specific primers and a single primer complementary to a portion of the hairpin adaptor or a copy thereof.
  • the kit may contain a nucleotide 5'-triphosphate adaptor having any of the features described herein.
  • the two or more different sequences may include a fixed sequence capable of hybridising to an extension primer, and a variable sequence which acts as a tag to identify the adaptor (and hence the identity of the sample to which the adaptors are attached).
  • the kit may be suitable for labelling both DNA and RNA fragments.
  • the kit may contain two template independent polymerases.
  • the kit may contain both terminal transferase and polyadenylate polymerase (PAP) or polyU polymerase (PUP).
  • each member of the population contains a common universal sequence and one of a plurality of locus-specific regions, wherein each locus-specific region contains only the nucleic acid bases A, G and T such that the locus is complementary to copies of a sample treated with bisulfite.
  • Such primers may be included in a kit with a polynucleotide having a triphosphate moiety at the 5'-end.
  • the primers may comprise 50 or more locus specific sequences in the population.
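The constraints on the primer population stated above (a common universal sequence, locus-specific regions built only from A, G and T, and 50 or more locus-specific sequences) can be checked programmatically. The sketch below is illustrative only: it assumes the universal sequence sits at the 5' end of each primer, and the universal sequence and the generated loci are placeholders rather than sequences from the source.

```python
from itertools import islice, product

ALLOWED_BASES = set("AGT")


def is_bisulfite_compatible(locus_region: str) -> bool:
    """True if the locus-specific region contains only A, G and T."""
    return bool(locus_region) and set(locus_region.upper()) <= ALLOWED_BASES


def validate_population(primers, universal: str, min_loci: int = 50) -> bool:
    """Check the shared universal prefix, the A/G/T-only loci and the number of distinct loci."""
    loci = set()
    for primer in primers:
        if not primer.startswith(universal):
            return False
        locus = primer[len(universal):]
        if not is_bisulfite_compatible(locus):
            return False
        loci.add(locus)
    return len(loci) >= min_loci


UNIVERSAL = "CCATCCGA"  # placeholder universal sequence (assumption)
toy_loci = ["".join(p) for p in islice(product("AGT", repeat=4), 50)]
population = [UNIVERSAL + locus for locus in toy_loci]
```

A population containing any cytosine in a locus-specific region fails the check.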
  • Suitable sequencing methods are well known in the art, and include Illumina sequencing, pyrosequencing (for example 454 sequencing) or Ion Torrent sequencing (from Life Technologies™).
  • Populations of nucleic acid molecules with a 3' adaptor oligonucleotide and a 5' adaptor oligonucleotide may be sequenced directly.
  • the sequences of the first and second adaptor oligonucleotides may be specific for a sequencing platform.
  • they may be complementary to the flowcell or device on which sequencing is to be performed. This may allow the sequencing of the population of nucleic acid fragments without the need for further amplification and/or adaptation.
  • the first and second adaptor sequences are different.
  • the adaptor sequences are not found within the human genome, or other sample genome of interest.
  • the nucleic acid strands in the population to be sequenced may have the same first adaptor sequence at their 3' ends and the same second adaptor sequence at their 5' ends i.e. all of the fragments in the population may be flanked by the same pair of adaptor sequences.
  • Suitable adaptor oligonucleotides for the production of nucleic acid strands for sequencing may include a region that is complementary to the universal primers on the solid support (e.g. a flowcell or bead) and a region that is complementary to universal sequencing primers (i.e. which when annealed to the adaptor oligonucleotide and extended allows the sequence of the nucleic acid molecule to be read).
  • Suitable nucleotide sequences for these interactions are well known in the art and depend on the sequencing platform to be employed. Suitable sequencing platforms include Illumina TruSeq, LifeTech IonTorrent, Roche 454 and PacBio RS.
  • the sequences of the first and second adaptor oligonucleotides may comprise a sequence that hybridises to complementary primers immobilised on the solid support (e.g. 20-30 nucleotides); a sequence that hybridises to a sequencing primer (e.g. 30-40 nucleotides); and a unique index sequence (e.g. 6-10 nucleotides).
  • Suitable first and second adaptor oligonucleotides may be 56-80 nucleotides in length.
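The stated overall adaptor length follows directly from the component ranges given above, as this short arithmetic check shows. The dictionary keys are descriptive labels chosen for this sketch.

```python
# Component length ranges (nucleotides) as stated in the text.
COMPONENT_RANGES = {
    "support-hybridising sequence": (20, 30),
    "sequencing-primer-hybridising sequence": (30, 40),
    "index sequence": (6, 10),
}

min_length = sum(low for low, _ in COMPONENT_RANGES.values())
max_length = sum(high for _, high in COMPONENT_RANGES.values())
```

The sums give 56 and 80 nucleotides, matching the 56-80 nucleotide range quoted for suitable adaptor oligonucleotides.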
  • the adaptor may be for example 5-20 bases of a first complementary sequence, a single stranded loop comprising a sequence that hybridises to the solid support and the sequencing primer (e.g.
  • the hairpin constructs may be 60 to 100 nucleotides or more in length.
  • the nucleic acid molecules may be purified by any convenient technique. Following preparation, the population of nucleic acid molecules may be provided in a suitable form for further treatment as described herein. For example, the population of nucleic acid molecules may be in aqueous solution in the absence of buffers before treatment as described herein. In other embodiments, populations of nucleic acid molecules with a 3' adaptor oligonucleotide and optionally a 5' adaptor oligonucleotide may be further adapted and/or amplified as required, for example for a specific application or sequencing platform.
  • the nucleic acid strands in the population may have the same first adaptor sequence at their 3' ends and the same second adaptor sequence at their 5' ends i.e. all of the fragments in the population may be flanked by the same pair of adaptors, as described above.
  • This allows the same pair of amplification primers to amplify all of the strands in the population and avoids the need for multiplex amplification reactions using complex sets of primer pairs, which are susceptible to mis-priming and the amplification of artefacts.
  • Suitable first and second amplification primers may be 20-25 nucleotides in length and may be designed and synthesised using standard techniques.
  • a first amplification primer may hybridise to the first adaptor sequence i.e. the first amplification primer may comprise a nucleotide sequence complementary to the first adaptor oligonucleotide; and a second amplification primer may hybridise to the complement of the second adaptor sequence i.e. the second amplification primer may comprise the nucleotide sequence of the second adaptor oligonucleotide or of the universal sequence on the locus specific primers.
  • a first amplification primer may hybridise to the complement of the first adaptor sequence i.e. the first amplification primer may comprise a nucleotide sequence of the first adaptor oligonucleotide; and a second amplification primer may hybridise to the second adaptor sequence i.e. the second amplification primer may comprise the nucleotide sequence of the second adaptor oligonucleotide or the universal sequence on the locus specific primers.
  • the first and second amplification primers may incorporate additional sequences.
  • Additional sequences may include index sequences to allow identification of the amplification products during multiplex sequencing, or further adaptor sequences to allow sequencing of the strands using a specific sequencing platform.
  • Figure 1 is an IGV screenshot showing read alignment and read coverage from a 24 kb region of E. coli (coords 1,907,892-1,932,808) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions.
  • the bottom track shown in blue indicates the expected loci primer annealing positions.
  • Figures 2 and 3 are IGV screenshots showing read alignment and read coverage from a 24 kb region of E. coli (coords 1,907,892-1,932,808) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions.
  • the bottom track shown in blue indicates the expected loci primer annealing positions.
  • Figure 4 shows a schematic of the method. An optional amplification can be carried out using a second adaptor prior to the selection using the locus specific primers. Methods with and without the optional amplification are shown.
  • Figures 5 and 6 are IGV screenshots showing read alignment and read coverage from two regions of the human genome generated using the method of example 3 (GAPDH region: Chr12:6,53,000-6,541,000 and NANOG region: Chr12:7,786,763-7,800,358) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions.
  • the bottom track shown in blue indicates the expected loci primer annealing positions.
  • Figures 7 and 8 are IGV screenshots showing read alignment and read coverage from two regions of the human genome generated using the method of example 4 (EGFR region: Chr7:55,130,000-55,133,987 and CLEC12A region: Chr12:9,969,868-9,972,145) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions.
  • the bottom track shown in blue indicates the expected loci primer annealing positions.
  • EXAMPLE 1 Preparation of loci targeting-extension libraries from native E.coli genomic DNA using post-hybridisation capture and cleanup prior to amplification
  • Step 1 Probe design
  • a pool of 2000 ssDNA probes was designed to be complementary to the top and bottom strands of the native E. coli genome (MG1655). Probes were 64-84 nucleotides long, with 30-50 nucleotides being complementary to target loci distributed randomly in the E. coli genome. All probes had a Tm of 55 °C ± 5 °C and were labelled with Biotin at the 5' end.
  • the final probes include a 5' adaptor sequence to enable compatibility with Illumina sequencing technology (5'-GTGACTGGAGTTCAGACGTGTGCTCTTCCTATCT).
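The probe dimensions above are internally consistent: the 34-nucleotide 5' adaptor sequence plus a 30-50 nucleotide target-complementary region gives the stated 64-84 nucleotide probes. The Python sketch below assembles a probe and applies a melting temperature window. The source does not say how Tm was calculated, so the basic GC-content formula used in `approx_tm` is an assumption made purely for illustration, as are the function names and the choice to evaluate Tm on the target-complementary region alone.

```python
ADAPTOR_5P = "GTGACTGGAGTTCAGACGTGTGCTCTTCCTATCT"  # 5' adaptor sequence quoted in the text


def approx_tm(seq: str) -> float:
    """Rough GC-content Tm estimate (assumed formula): 64.9 + 41 * (GC - 16.4) / N."""
    seq = seq.upper()
    gc_count = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc_count - 16.4) / len(seq)


def build_probe(target_complement: str) -> str:
    """Prefix a 30-50 nt target-complementary region with the 5' adaptor."""
    if not 30 <= len(target_complement) <= 50:
        raise ValueError("target-complementary region must be 30-50 nt")
    return ADAPTOR_5P + target_complement.upper()


def tm_in_window(target_complement: str, centre: float = 55.0, tolerance: float = 5.0) -> bool:
    """True if the estimated Tm lies within centre +/- tolerance degrees C."""
    return abs(approx_tm(target_complement) - centre) <= tolerance


shortest = build_probe("A" * 30)
longest = build_probe("A" * 50)
```

A production design would normally use a nearest-neighbour Tm model with explicit salt and oligo concentrations; the simple estimate here only demonstrates the filtering step.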
  • Step 3 End repair
  • Fragmented DNA 500 - 1000 ng was heat denatured at 95 °C for 3 minutes, and snap cooled on ice.
  • the DNA ends were repaired with 20 units of T4 Polynucleotide Kinase (PNK, Enzymatics Y9040L) in 20 µL of 1x Addition Buffer (100 mM Tris-acetate, 1.25 mM CoAc2, 0.125 mg/mL BSA, pH 6.6). The reaction was incubated at 37 °C for 20 minutes, and then heat denatured at 95 °C for 3 minutes.
  • the triphosphate hairpin adaptor was prepared as follows: the universal short hairpin oligo (1250 pmol, Biomers GmbH) with a 5'-DBCO modification was reacted with an azido-3'-deoxyadenosine-5'-triphosphate (1000 pmol, Jena Bioscience) for 2 hours at 10 °C in 10 mM Tris-HCl (pH 7.0).
  • the sequence of the universal short hairpin adaptor is listed in Table 1. An aliquot of this hairpin adaptor (40 pmol) and 20 units of Terminal Transferase (TdT, Enzymatics, P7070L) were then added to incorporate the hairpin adaptor at the 3'-end of the DNA (37 °C for 30 minutes). The reaction was stopped with 50 mM EDTA.
  • Step 5 Magnetic bead purification
  • the products obtained were SPRI bead purified with a 3:1 bead solution:DNA ratio (18 % PEG-8000, 1 M NaCl, 1 mM EDTA, 10 mM Tris-HCl (pH 8.0), and 0.1 % w/v Carboxy magnetic beads). Binding time was 10 minutes and washes were performed with 80 % ethanol. DNA was eluted from the beads with ultra pure water.
  • Step 6 Complementary strand extension
  • the tagged DNA eluted from step 5 was mixed with the PNK/Klenow cocktail (10 units PNK, 5 units Klenow exo- (P7010L)) and 30 nmol dNTP mix in 1x Blue buffer (50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT pH 7.9) to dephosphorylate the 3' end of the attached hairpin adaptor, and to synthesise the full length, complementary bottom strand by extension from the annealed hairpin adaptor by Klenow (37 °C for 30 minutes, then 55 °C for 20 minutes).
  • the extended products were further SPRI bead purified with a 2:1 bead solution:DNA ratio. DNA was eluted from the beads with ultra pure water.
  • the extended DNA was incubated with Thermolabile UDG (2.5 units, Enzymatics, G5020L) at 37 °C for 30 min in 1x Reaction Buffer (70 mM Tris-HCl, 10 mM NaCl, 1 mM EDTA, 0.1 mg/mL BSA pH 8.0 @25 °C) to remove the original strand and linearize the hairpin, and then heat denatured at 95 °C for 5 minutes.
  • DNA was purified using 3x SPRI beads as described in step 5, eluted from the beads with 10 µL ultra pure water and quantified by Qubit ssDNA assay kit. The linearized DNA was used for hybridization capture.
  • hybridization buffer (10x SSPE, 10x Denhardt's, 10 mM EDTA, and 0.2 % SDS) was pre-warmed at 48 °C in a water bath.
  • Library pools containing ~500 ng DNA and blocking oligo (Table 2) were heated at 95 °C for 5 minutes and then held at 48 °C for 5 minutes in a thermal cycler before they were added to pre-warmed hybridization buffer, followed by addition of probe solution preheated to 48 °C for 2 minutes. All hybridization reactions were incubated at 48 °C for ~64 hours in a PCR block.
  • Step 10 Post-capture on-beads extension
  • Beads from step 9 were resuspended in ultra pure H2O, and extension from the hybridised probe by Klenow (5 units Klenow exo-, Enzymatics) was carried out (37 °C for 30 minutes) in 1x Blue Buffer (50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT pH 7.9) in the presence of 30 nmol dNTP to synthesise the full length, complementary strand.
  • Step 11 Post-extension enrichment (IDX PCR)
  • PCR primers used were NEBNext Multiplex Oligos for Illumina (New England BioLabs, E7335L and E7500L); see Table 6 for sequences.
  • Step 12 Next generation sequencing and data processing
  • Samples were pooled, denatured and loaded onto an Illumina MiSeq flowcell at 10 pM concentration.
  • the samples were paired-end sequenced (2 x 75 cycles) using MiSeq v3 SBS chemistry according to the manufacturer's standard protocol.
  • Raw read fastq files were automatically demultiplexed into sample-specific bins by the MSC software. Each sample specific fastq file was analysed in the following manner:
  • Figure 1 is an IGV screenshot showing read alignment and read coverage from a 24 kb region of E. coli (coords 1,907,892-1,932,808) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions.
  • the bottom track shown in blue indicates the expected loci primer annealing positions.
  • EXAMPLE 2 Preparation of loci targeting libraries from native E.coli genomic DNA without post-hybridisation capture and cleanup prior to PCR amplification
  • Step 1 Probe design
  • a pool of 2000 ssDNA probes was designed to be complementary to the top and bottom strands of the native E. coli genome (MG1655). Probes were 64-84 nucleotides long, with 30-50 nucleotides being complementary to target loci distributed randomly in the E. coli genome. All probes had a Tm of 55 °C ± 5 °C and were labelled with Biotin at the 5' end.
  • the final probes include a 5' adaptor sequence to enable compatibility with Illumina sequencing technology (5'-GTGACTGGAGTTCAGACGTGTGCTCTTCCTATCT).
  • Step 2 DNA fragmentation: native E. coli genomic DNA (strain K-12 sub-strain MG1655) was sheared to a length of approximately 350 bp using an E-220 Covaris sonicator following the manufacturer's instructions.
  • Step 3 End repair
  • Fragmented DNA 500 - 1000 ng was heat denatured at 95 °C for 3 minutes, and snap cooled on ice.
  • the DNA ends were repaired with 20 units of T4 Polynucleotide Kinase (PNK, Enzymatics Y9040L) in 20 µL of 1x Addition Buffer (100 mM Tris-acetate, 1.25 mM CoAc2, 0.125 mg/mL BSA, pH 6.6). The reaction was incubated at 37 °C for 20 minutes, and then heat denatured at 95 °C for 3 minutes.
  • the triphosphate hairpin adaptor was prepared as follows: the universal short hairpin oligo (1250 pmol, Biomers GmbH) with a 5'-DBCO modification was reacted with an azido-3'-deoxyadenosine-5'-triphosphate (1000 pmol, Jena Bioscience) for 2 hours at 10 °C in 10 mM Tris-HCl (pH 7.0).
  • the sequence of the universal short hairpin adaptor is listed in Table 1. An aliquot of this hairpin adaptor (40 pmol) and 20 units of Terminal Transferase (TdT, Enzymatics, P7070L) were then added to incorporate the hairpin adaptor at the 3'-end of the DNA (37 °C for 30 minutes). The reaction was stopped with 50 mM EDTA.
  • Step 5 Magnetic bead purification
  • the products obtained were SPRI bead purified with a 3:1 bead solution:DNA ratio (18 % PEG-8000, 1 M NaCl, 1 mM EDTA, 10 mM Tris-HCl (pH 8.0), and 0.1 % w/v Carboxy magnetic beads). Binding time was 10 minutes and washes were performed with 80 % ethanol. DNA was eluted from the beads with ultra pure water.
  • Step 6 Complementary strand extension
  • the tagged DNA eluted from step 5 was mixed with the PNK/Klenow cocktail (10 units PNK, 5 units Klenow exo- (P7010L)) and 30 nmol dNTP mix in 1x Blue buffer (50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT pH 7.9) to dephosphorylate the 3' end of the attached hairpin adaptor, and to synthesise the full length, complementary bottom strand by extension from the annealed hairpin adaptor by Klenow (37 °C for 30 minutes, then 55 °C for 20 minutes).
  • the extended products were further SPRI bead purified with a 2:1 bead solution:DNA ratio. DNA was eluted from the beads with ultra pure water.
  • the extended DNA was incubated with Thermolabile UDG (2.5 units, Enzymatics, G5020L) at 37 °C for 30 min in 1x Reaction Buffer (70 mM Tris-HCl, 10 mM NaCl, 1 mM EDTA, 0.1 mg/mL BSA pH 8.0 @25 °C) to remove the original strand and linearize the hairpin, and then heat denatured at 95 °C for 5 minutes.
  • DNA was purified using 3x SPRI beads as described in step 5, eluted from the beads with 10 µL ultra pure water and quantified by Qubit ssDNA assay kit.
  • the linearized DNA was used for probe hybridization capture and linear amplification.
  • Step 8 Probe hybridization and linear extension
  • Step 9 PCR on extended probes
  • PCR products obtained were purified with 1x 18 % PEG Ampure XP beads according to the manufacturer's instructions. Samples were eluted from the beads in ultra pure water.
  • Step 11 Next generation sequencing and data processing
  • Samples were pooled, denatured and loaded onto an Illumina MiSeq flowcell at 10 pM concentration.
  • the samples were paired-end sequenced (2 x 75 cycles) using MiSeq v3 SBS chemistry according to the manufacturer's standard protocol.
  • Raw read fastq files were automatically demultiplexed into sample-specific bins by the MSC software. Each sample specific fastq file was analysed in the following manner:
  • Key alignment metrics from two successful exemplifications of the method, namely CEG27_75_1 to CEG27_75_8, are shown in Table 6. Visualisation of the binary alignment files was performed using either the SeqMonk analysis tool (www.bioinformatics.babraham.ac.uk/projects/seqmonk) or the IGV genome browser (www.broadinstitute.org/igv).
  • Figures 2 and 3 are IGV screenshots showing read alignment and read coverage from a 24 kb region of E. coli (coords 1,907,892-1,932,808) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions.
  • the bottom track shown in blue indicates the expected loci primer annealing positions.
  • Example 3 Use of pre-amplification to increase the amount of available material. Probe design and plex:
  • the panel of probes used is a high complexity pool, typically between 12,000 and 440,000 plex.
  • Each probe in the pool is a single stranded 86mer, composed of a 36nt "5' tail" of universal sequence compatible with the Illumina NGS sequencing technology platform and a 50nt "3' head" designed to complement target regions in the human genome (designed to hg38 build).
  • the head region of each probe was designed to be the identity of the bisulfite converted original sequence.
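The probe layout described above (a 36 nt universal 5' tail followed by a 50 nt head carrying the bisulfite-converted identity of the original sequence) can be sketched as follows. The sketch assumes complete conversion of every cytosine to thymine and ignores methylated cytosines, which would be protected in a real sample. The tail is a placeholder of 36 N characters because the actual tail sequence is not given in this passage, and the function names are hypothetical.

```python
TAIL_LENGTH = 36
HEAD_LENGTH = 50
PLACEHOLDER_TAIL = "N" * TAIL_LENGTH  # stands in for the universal 5' tail (not given here)


def bisulfite_convert(seq: str) -> str:
    """In-silico full conversion: every C is read as T (methylation ignored)."""
    return seq.upper().replace("C", "T")


def design_probe(target: str, tail: str = PLACEHOLDER_TAIL) -> str:
    """Return tail + converted head for a 50 nt target sequence."""
    if len(tail) != TAIL_LENGTH or len(target) != HEAD_LENGTH:
        raise ValueError("expected a 36 nt tail and a 50 nt target")
    return tail + bisulfite_convert(target)


target = "ACGTTGCA" * 6 + "CG"  # arbitrary 50 nt example sequence
probe = design_probe(target)
```

The resulting head contains only A, G and T, in line with the primer composition described earlier in the specification.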
  • Promega human male gDNA (catalogue number: G141).
  • This protocol describes the process of performing targeted enrichment using the CEGX TrueMethyl OnTarget technology using commercially available reagents as labeled according to the CEGX commercial kits.
  • CEGX recommends that steps 7, 8 and 9 (incubations and purifications) of the Library Module protocol are to be done in 1.5 mL tubes to minimise sample loss.
  • steps 7, 8 and 9 are to be done in heat blocks capable of holding 1.5 mL tubes.
  • IMPORTANT Prepare a FRESH stock of 70% Ethanol for the experiment as follows:
  • combine 1.4 mL of 100% Ethanol and 0.6 mL of Ultra Pure Water.
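The 70% ethanol recipe follows from a simple dilution calculation, sketched here in Python. The function name `dilution_volumes` is a hypothetical helper, and volume contraction on mixing ethanol with water is ignored.

```python
def dilution_volumes(stock_percent: float, target_percent: float, final_volume_ml: float):
    """Return (stock volume, diluent volume) in mL for a v/v dilution."""
    stock_ml = final_volume_ml * target_percent / stock_percent
    return stock_ml, final_volume_ml - stock_ml


ethanol_ml, water_ml = dilution_volumes(100.0, 70.0, 2.0)
```

For a 2.0 mL final volume this reproduces the 1.4 mL of ethanol and 0.6 mL of water given in the protocol.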
  • IMPORTANT Buffer 1 and Enzyme A should not be master mixed prior to use.
  • Step 7.1 Thaw Buffer 1 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
  • Step 7.2 Add 2 µL of Buffer 1 to the 11 µL of DNA (from Step 6.16) recovered from the Conversion Module, mix by pipetting.
  • Step 7.3 Add 2 µL of Enzyme A to the DNA (from Step 7.2.) and mix by pipetting.
  • Step 7.4. Incubate DNA at 37 °C for 20 min.
  • Step 7.5 Heat denature the reaction mix at 95 °C for 3 min.
  • Step 7.6 Cool DNA immediately on ice for 5 min.
  • IMPORTANT The Adaptor 3 aliquot is intended as a single use only. Repeated freeze/thaw cycles should be avoided. Adaptor 3, Adaptor 1 Additive and Enzyme B should not be master mixed prior to use.
  • Step 7.7 Thaw Adaptor 3 and Adaptor 1 Additive on ice.
  • Step 7.8 To the 15 µL of DNA (from Step 7.6.), add 2 µL of Adaptor 3, and 1 µL of
  • Step 7.9. Add 2 µL of Enzyme B and mix by pipetting.
  • Step 7.10 Incubate at 37 °C for 30 min.
  • Step 7.11 To the 20 µL of DNA (from Step 7.10.), add 2 µL of the Stop Solution.
  • Step 7.12. Add 66 µL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
  • Step 7.13 Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
  • Step 7.14 Carefully remove supernatant and wash the beads twice with 200 µL of 70%
  • Step 7.15 Resuspend beads in 23.5 µL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
  • Step 7.16 Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 22 µL of DNA in a new 1.5 mL tube.
  • the purified DNA can be stored at -20 °C overnight.
  • IMPORTANT Enzyme A and Enzyme C may be master mixed before addition.
  • Step 8.1 Thaw Buffer 2 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
  • Step 8.2. Add 6 µL of the Buffer 2 to the 22 µL of DNA (from Step 7.16.), mix by pipetting.
  • Step 8.3. Add 1 µL of Enzyme A and 1 µL of Enzyme C to the buffered DNA (from
  • Step 8.4 Incubate DNA at 37 °C for 30 min.
  • Step 8b Bead Purification
  • Step 8.5 To the 30 µL of DNA (from Step 8.4.), add 3 µL of the Stop Solution.
  • Step 8.6 Add 66 µL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
  • Step 8.7 Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
  • Step 8.8 Carefully remove supernatant and wash the beads twice with 200 µL of 70%
  • Step 8.9. Resuspend beads in 19.5 µL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
  • Step 8.10 Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 18 µL of DNA in a new 1.5 mL tube.
  • the purified DNA can be stored at -20 °C overnight.
  • IMPORTANT Buffer 3 and Adaptor 4 may be master mixed before addition.
  • Step 9b Bead Purification
  • Step 9.5 To the 45 µL of DNA (from Step 9.4.), add 5 µL of the Stop Solution.
  • Step 9.6 Add 50 µL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
  • Step 9.7 Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
  • Step 9.8 Carefully remove supernatant and wash the beads twice with 200 µL of 70%
  • Step 9.9. Resuspend beads in 20.0 µL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
  • Step 9.10 Precipitate beads using a magnetic rack for 2 min at room temperature, and transfer 18.75 µL of DNA to a new 0.2 mL PCR tube.
  • the purified DNA can be stored at -20°C overnight.
  • Step 10.1 Thaw Buffer 4 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
  • Step 10.2. Add 5 µL of the Buffer 4 to the 18.75 µL of DNA (from Step 9.10.) and mix by pipetting.
  • Step 10.3. Add 1.25 µL of Enzyme E to the buffered DNA (from Step 10.2.) and mix by pipetting.
  • Step 10.4 Incubate DNA at 37 °C for 30 min.
  • Step 10.5. Heat denature Enzyme E at 95 °C for 5 min.
  • Step 10.6 Cool DNA to 4 °C for 5 min.
  • Step 10b Sample Amplification
  • Step 10.7 Thaw the OT PCR Mix on ice.
  • Step 10.8 Add 37 µL of the OT PCR Mix to the 12 µL of DNA (from Step 10.6.).
  • Step 10.9. Add 1 µL of Enzyme F to the buffered DNA (from Step 10.8.).
  • Step 10.11 To the 50 µL of DNA (from Step 10.10.), add 5 µL of Stop Solution and 49.5 µL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
  • Step 10.13 Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
  • Step 10.14 Carefully remove supernatant and wash the beads twice with 200 µL of 70%
  • Step 10.15 Resuspend the beads in 10 µL of Ultra Pure Water and incubate for 2 min at room temperature.
  • Step 10.16 Centrifuge briefly and precipitate beads using a magnetic rack for 2 min at room temperature, and collect 9 µL of DNA into a new tube.
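The bead purifications in Steps 7 to 10 use progressively lower ratios of Magnetic Bead Binding Solution to sample. The short Python calculation below derives each ratio from the volumes stated in the protocol (DNA plus Stop Solution as the sample volume). Treating the Stop Solution as part of the sample volume is an assumption of this sketch, as are the variable names.

```python
# (step, DNA volume, Stop Solution volume, bead binding solution volume), all in microlitres
PURIFICATIONS = [
    ("Step 7", 20.0, 2.0, 66.0),
    ("Step 8", 30.0, 3.0, 66.0),
    ("Step 9", 45.0, 5.0, 50.0),
    ("Step 10", 50.0, 5.0, 49.5),
]

bead_ratios = {
    step: round(beads / (dna + stop), 2)
    for step, dna, stop, beads in PURIFICATIONS
}
```

Under that assumption the ratios fall from 3:1 in Step 7 to 2:1, 1:1 and 0.9:1 in Steps 8, 9 and 10 respectively.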
  • Step 11.1 Pre-warm Hybridisation Buffer (HB) to 50 °C in a thermal block.
  • Step 11.2 Mix 400 ng DNA (from Step 10.16.) with Ultra Pure Water to make up 7
  • Step 11.3. Add pre-warmed HB to the DNA/HA mix, mix by pipetting, and hold at 50 °C for 5 min.
  • Step 11.4 Aliquot 2.5 µL of probe into an individual tube and pre-warm to 50 °C (set heated lid temperature at 70 °C if possible).
  • Step 11.5. Transfer the mix from Step 11.3. to the pre-warmed probe tube (Step 11.4.), mix by pipetting.
  • Step 11.5. Incubate the hybridization reactions at 50 °C for approximately 64h in a PCR block.
  • Step 11b Hybridisation clean-up bead pre-wash
  • Step 11.6 Use 15 µL M-280 Streptavidin Beads (ThermoFisher, 11205D; user supplied) per reaction, in fresh 1.5 mL tubes.
  • Step 11.7 Wash M-280 Streptavidin Beads three times with 500 µL of 1x BW buffer
  • Step 11.8 Resuspend the beads in 35 µL of 1 2x BW buffer.
  • Step 11.9. Transfer 25 µL of the hybridization mixture to the beads (total volume 60 µL) and incubate at room temperature for 30 minutes in a thermal mixer (at 1000 rpm).
  • Step 11.10. Perform 2 washes with 200 µL of Wash Buffer 1 (WB1) at room temperature,
  • Step 11.11. Perform 3 washes with 200 µL of Wash Buffer 2 (WB2) at 50 °C, 5 min each time in a thermal mixer (at 1000 rpm). Centrifuge briefly and precipitate beads using a magnetic rack for 1 min, carefully remove the supernatant.
  • Step 11.12. Perform final wash with 50 µL of Wash Buffer 3 (WB3) at room temperature.
  • Step 11.13. Remove the supernatant and resuspend the beads in 10.5 µL Ultra Pure Water.
  • Step 11d Probe extension (Finishing Targeting)
  • Step 11.14. To the purified DNA on beads add 1 % ⁇ , Buffer 5 and 1.5 µL Enzyme C.
  • Step 11.15. Mix reactions gently and incubate for 30 minutes at 37 °C.
  • Step 11.16 Collect the beads from Step 11.15. on a magnetic rack and remove the supernatant.
  • Step 11.17. Wash the beads twice with 1x BWT buffer.
  • Step 11.18. Resuspend the beads in 30 µL of BW4 and incubate for 10 minutes at room temperature in a thermal mixer.
  • Step 11.19. Centrifuge briefly and precipitate beads using a magnetic rack for 1 min, carefully remove the supernatant.
  • Step 11.20. Wash the beads twice with 200 µL of BWT buffer, then once with 50 µL of
  • Step 11.21. Resuspend the beads in 25 µL of Ultra Pure Water, and use 12.5 µL for the next step (store the other 12.5 µL at 4 °C).
  • Step 12a Sample Indexing
  • Step 12.1. Thaw the OT Index PCR Mix on ice (use index of choice, Appendix E).
  • Step 12.2. Transfer 12.5 µL of DNA (from Step 11.21.) to a fresh 0.2 mL PCR tube and add 12 µL of the OT Index PCR Mix.
  • Step 12.3. Add 0.5 µL of Enzyme F to the buffered DNA (from Step 12.2.).
  • Step 12b Bead Purification
  • Step 12.5. Centrifuge briefly and precipitate beads using a magnetic rack for 1 min at room temperature.
  • Step 12.6 Carefully remove the supernatant from step 12.5. (this now contains the
  • Step 12.7 To the 25 µL of DNA (from Step 12.6.), add 22.5 µL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
  • Step 12.8 Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
  • Step 12.9. Carefully remove supernatant and wash the beads twice with 200 µL of 70%
  • Step 12.10 Resuspend the beads in 10 µL of Ultra Pure Water and incubate for 2 min at room temperature.
  • Step 12.11 Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 9 µL of DNA into a new tube. The DNA is now ready for Illumina NGS sequencing.
  • CEGX recommends that the indexed libraries are kept at -20 °C for long term storage.
  • the library so prepared can be sequenced.
  • a summary of the library preparation conditions is shown in Table 7.
  • the panel of probes used is a high complexity pool, typically between 12,000 and 440,000 plex.
  • Each probe in the pool is a single stranded 86mer, composed of a 36nt "5' tail" of universal sequence compatible with the Illumina NGS sequencing technology platform and a 50nt "3' head" designed to complement target regions in the human genome (designed to hg38 build).
  • the head region of each probe was designed to be the identity of the bisulfite converted original sequence.
  • Promega human male gDNA (catalogue number: G141).
  • This protocol describes the process of performing targeted enrichment using the CEGX TrueMethyl OnTarget technology using commercially available reagents as labeled according to the CEGX commercial kits.
  • CEGX recommends that steps 7, 8 and 9 (incubations and purifications) of the Library Module protocol are to be done in 1.5 mL tubes to minimise sample loss.
  • steps 7, 8 and 9 are to be done in heat blocks capable of holding 1.5 mL tubes.
  • IMPORTANT Prepare a FRESH stock of 70% Ethanol for the experiment as follows:
  • 70% Ethanol: combine 1.4 mL of 100% Ethanol and 0.6 mL Ultra Pure
  • IMPORTANT Buffer 1 and Enzyme A should not be master mixed prior to use
  • Step 7.1 Thaw Buffer 1 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
  • Step 7.2 Add 2 µL of Buffer 1 to the 11 µL of DNA (from Step 6.16) recovered from the Conversion Module, mix by pipetting.
  • Step 7.3 Add 2 µL of Enzyme A to the DNA (from Step 7.2.) and mix by pipetting.
  • Step 7.4. Incubate DNA at 37 °C for 20 min.
  • Step 7.5 Heat denature the reaction mix at 95 °C for 3 min.
  • Step 7.6 Cool DNA immediately on ice for 5 min.
  • IMPORTANT The Adaptor 3 aliquot is intended as a single use only. Repeated freeze/thaw cycles should be avoided. Adaptor 3, Adaptor 1 Additive and Enzyme B should not be master mixed prior to use.
  • Step 7.7 Thaw Adaptor 3 and Adaptor 1 Additive on ice.
  • Step 7.8 To the 15 µL of DNA (from Step 7.6.), add 2 µL of Adaptor 3, and 1 µL of
  • Step 7.9. Add 2 µL of Enzyme B and mix by pipetting.
  • Step 7.10 Incubate at 37 °C for 30 min.
  • Step 7.11 To the 20 µL of DNA (from Step 7.10.), add 2 µL of the Stop Solution.
  • Step 7.12. Add 66 µL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
  • Step 7.13 Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
  • Step 7.14 Carefully remove supernatant and wash the beads twice with 200 µL of 70%
  • Step 7.15 Resuspend beads in 23.5 µL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
  • Step 7.16 Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 22 µL of DNA in a new 1.5 mL tube.
  • the purified DNA can be stored at -20°C overnight.
  • IMPORTANT Enzyme A and Enzyme C may be master mixed before addition.
  • Step 8.1 Thaw Buffer 2 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
  • Step 8.2 Add 6 µL of the Buffer 2 to the 22 µL of DNA (from Step 7.16), mix by pipetting.
  • Step 8.3. Add 1 µL of Enzyme A and 1 µL of Enzyme C to the buffered DNA (from
  • Step 8.4 Incubate DNA at 37 °C for 30 min.
  • Step 8b Bead Purification
  • Step 8.9. Resuspend beads in 19.5 µL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
  • Step 8.10 Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 18 µL of DNA in a new 1.5 mL tube.
  • the purified DNA can be stored at -20°C overnight.
  • IMPORTANT Buffer 3 and Adaptor 4 may be master mixed before addition.
  • Step 9.1 Thaw Buffer 3 and Adaptor 4 on ice and equilibrate the Magnetic Bead
  • Step 9.2 Add 22.5 µL of the Buffer 3 and 3.5 µL of Adaptor 4 master mix to the 18 µL of DNA (from Step 8.10.), mix by pipetting.
  • Step 9.3 Add 1 µL of Enzyme D to the buffered DNA (from Step 9.2) and mix by pipetting.
  • Step 9.4 Incubate DNA at 25 °C for 15 min.
  • Step 9b Bead Purification
  • Step 9.5 To the 45 µL of DNA (from Step 9.4.), add 5 µL of the Stop Solution.
  • Step 9.6 Add 50 µL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
  • Step 9.7 Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
  • Step 9.8 Carefully remove supernatant and wash the beads twice with 200 µL of 70%
  • Step 9.9. Resuspend beads in 20.0 µL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
  • Step 9.10 Precipitate beads using a magnetic rack for 2 min at room temperature, and transfer 18.75 µL of DNA to a new 0.2 mL PCR tube.
  • the purified DNA can be stored at -20°C overnight.
  • IMPORTANT Buffer 4 and Enzyme E should not be master mixed prior to use.
  • Step 10.1 Thaw Buffer 4 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
  • Step 10.2. Add 5 µL of the Buffer 4 to the 18.75 µL of DNA (from Step 9.10.) and mix by pipetting.
  • Step 10.3. Add 1.25 µL of Enzyme E to the buffered DNA (from Step 10.2.) and mix by pipetting.
  • Step 10.4 Incubate DNA at 37 °C for 30 min.
  • Step 10.5. Heat denature Enzyme E at 95 °C for 5 min.
  • Step 10.6. Cool DNA to 4 °C for 5 min.
  • Step 10b Sample Amplification
  • Step 10.7 Thaw the OT PCR Mix on ice.
  • Step 10.8 Add 37 µL of the OT PCR Mix to the 12
  • Step 10.9. Add 1 µL of Enzyme F to the buffered DNA (from Step 10.8.).
  • Step 10.11 To the 50 µL of DNA (from Step 10.10.), add 5 µL of Stop Solution and 44 µL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
  • Step 10.13 Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
  • Step 10.14 Carefully remove supernatant and wash the beads twice with 200 µL of 70%
  • Step 10.15 Resuspend the beads in 10 µL of Ultra Pure Water and incubate for 2 min at room temperature.
  • Step 10.16 Centrifuge briefly and precipitate beads using a magnetic rack for 2 min at room temperature, and collect 9 µL of DNA into a new tube.
  • the purified DNA can be stored at -20°C overnight.
  • Step 11.1. Pre-warm Hybridisation Buffer (HB) to 50 °C in a thermal block.
  • HB Hybridisation Buffer
  • Step 11.3. Add 2 µL of Hybridisation Additive (HA) and 12.5 µL of Hybridisation Buffer to 8 µL of DNA (from Step 11.2), denature at 95 °C for 3 minutes and then hold at 50 °C in a thermal cycler.
  • HA Hybridisation Additive
  • Step 11.4. Aliquot 2.5 µL of probe into an individual tube and pre-warm to 50 °C (set heated lid temperature at 70 °C if at all possible).
  • Step 11.5 Transfer the mix from Step 11.3. to the pre-warmed probe tube (Step 11.4.), mix by pipetting.
  • Step 11.5 Incubate the hybridisation reactions at 50 °C for approximately 64 h in a PCR block.
  • Step 11b Hybridisation clean-up bead pre-wash
  • Step 11.6. Use 15 µL M-280 Streptavidin Beads (ThermoFisher, 11205D; user supplied) per reaction, in fresh 1.5 mL tubes.
  • Step 11.7. Wash M-280 Streptavidin Beads three times with 500 µL of 1x BW buffer
  • Step 11.8. Resuspend the beads in 35 µL of 1 2x BW buffer.
  • Step 11.9. Transfer 25 µL of the hybridisation mixture to the beads (Total volume 60 µL) and incubate at room temperature for 30 minutes in a thermal mixer (at 1000 rpm).
  • Step 11.10. Perform 2 washes with 200 µL of Wash Buffer 1 (WB1) at room temperature,
  • Step 11.11. Perform 3 washes with 200 µL of Wash Buffer 2 (WB2) at 50 °C, 5 min each time in a thermal mixer (at 1000 rpm). Centrifuge briefly and precipitate beads using a magnetic rack for 1 min, carefully remove the supernatant.
  • WB2 Wash Buffer 2
  • Step 11.12. Perform final wash with 50 µL of Wash Buffer 3 (WB3) at room temperature, remove the supernatant.
  • Step 11d Probe extension (Finishing Targeting)
  • Step 11.14. Add 22.5 µL Extension Buffer, 6 µL 5x Extension Additive and 1.5 µL Enzyme C to the beads containing targeted DNA from Step 11.12.
  • Step 11.15. Mix reactions gently and incubate for 30 minutes at 37 °C in a thermomixer at
  • Step 11.16 Collect the beads from Step 11.15. on a magnetic rack and remove the supernatant.
  • Step 11.17. Wash the beads twice with 1x BWT buffer.
  • Step 11.18. Resuspend the beads in 30 µL of BW4 and incubate for 10 minutes at room temperature in a thermomixer at 1300 rpm.
  • Step 11.19. Centrifuge briefly and precipitate beads using a magnetic rack for 1 min, carefully remove the supernatant.
  • Step 11.20. Wash the beads twice with 200 µL of 1x BWT buffer, then once with 50 µL of
  • Step 11.21. Resuspend the beads in 25 µL of Resuspension Buffer and proceed to sample indexing.
  • STEP 12 SAMPLE INDEXING
  • Step 12a Sample Indexing
  • Step 12.1. Thaw the Index PCR Mix on ice (use index of choice, Appendix E).
  • Step 12.2. Transfer 25 µL of DNA (from Step 11.21.) to a fresh 0.2 mL PCR tube
  • Step 12.3. Add 1 µL of Enzyme F to the buffered DNA (from Step 12.2.).
  • Step 12b Bead Purification
  • Step 12.5. Centrifuge briefly and precipitate beads using a magnetic rack for 1 min at room temperature.
  • Step 12.6 Carefully transfer the supernatant from step 12.5. to a fresh 1.5 ml tube (this now contains the DNA).
  • Step 12.7 To the 50 µL of DNA (from Step 12.6.), add 5 µL of the Stop Solution and 44 µL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
  • Step 12.8 Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
  • Step 12.9. Carefully remove supernatant and wash the beads twice with 200 µL of 70%
  • Step 12.10 Resuspend the beads in 10 µL of Ultra Pure Water and incubate for 2 min at room temperature.
  • Step 12.11 Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 9 µL of DNA into a new tube. The DNA is now ready for Illumina NGS sequencing.
  • CEGX recommends that the indexed libraries are kept at -20 °C for long term storage
  • the library so prepared can be sequenced.
  • a summary of the library preparation conditions is shown in Table 9.
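The bead purification steps listed above differ mainly in the volume of Magnetic Bead Binding Solution 3 added relative to the volume of stopped reaction. As a minimal sketch, assuming all volumes are in microlitres and using the figures quoted in Steps 7.11 to 7.12, 9.5 to 9.6, 10.11 and 12.7, the bead-to-sample ratio of each clean-up and the ethanol fraction of the wash stock can be checked as follows. The function names and dictionary keys are illustrative only and do not form part of the protocol.

```python
# Volumes in microlitres, copied from the bead purification steps above.
# The dictionary keys are labels chosen for this illustration only.
CLEANUPS = {
    "Step 7.11-7.12": {"dna": 20.0, "stop": 2.0, "beads": 66.0},
    "Step 9.5-9.6": {"dna": 45.0, "stop": 5.0, "beads": 50.0},
    "Step 10.11": {"dna": 50.0, "stop": 5.0, "beads": 44.0},
    "Step 12.7": {"dna": 50.0, "stop": 5.0, "beads": 44.0},
}


def bead_ratio(dna, stop, beads):
    """Volume of bead binding solution per unit volume of stopped reaction."""
    return beads / (dna + stop)


def ethanol_fraction(ethanol_ml, water_ml):
    """Volume fraction of ethanol, ignoring volume contraction on mixing."""
    return ethanol_ml / (ethanol_ml + water_ml)


if __name__ == "__main__":
    for step, volumes in CLEANUPS.items():
        print(f"{step}: {bead_ratio(**volumes):.2f}x beads")
    # 1.4 mL of 100% ethanol plus 0.6 mL of water, as in the stock recipe above.
    print(f"Ethanol stock: {ethanol_fraction(1.4, 0.6):.0%}")
```

Under these figures the ratio falls from the first clean-up to the later ones. The sketch only restates the arithmetic of the protocol and makes no claim about the fragment sizes retained at each ratio.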

Abstract

This invention relates to the preparation of nucleic acid samples for analysis and certain methods and tools for the selection of specific regions of interest from a nucleic acid sample. The method is particularly advantageous for multiplexed amplification reactions, whereby one fixed primer can be used. The method can be used in the preparation of enriched libraries from single stranded, highly fragmented samples.

Description

Nucleic Acid Sample Enrichment
This invention relates to the preparation of nucleic acid samples for analysis and certain methods and tools for the selection of specific regions of interest from a nucleic acid sample.
Many methods exist for the preparation of samples of double-stranded DNA, for example for sequencing (e.g. Illumina TruSeq and NextEra, 454, NEBnext, Life Technologies etc).
However, the preparation of single-stranded DNA samples or RNA samples is more challenging because single stranded DNA or RNA molecules cannot be efficiently ligated together enzymatically. Reported workflows for the preparation of single-stranded DNA rely on the use of primers with degenerate sequences that "randomly prime" the single-stranded DNA and allow a truncated version of the parent DNA molecule to be adapted (for example, Epigenome™ Methyl-Seq kit, Epicentre Technologies WI USA). Methods using RNA ligase or CircLigase to join ends of single stranded DNA together have been reported but suffer from poor efficiency or are limited to the size of DNA fragments that can be ligated together.
Single stranded sample preparation is commonly required following bisulfite conversion of DNA molecules. The bisulfite conversion process necessarily results in the formation of single stranded DNA, and therefore involves either i) pre-bisulfite sample preparation or ii) post-bisulfite sample preparation employing random priming for downstream analysis. Drawbacks to these methods include the potential to generate nicked or fragmented libraries incapable of subsequent amplification, the loss of sequence information from the parent DNA molecules, generation of artefacts that contaminate the sample of interest or induce significant representation bias of reads in the final dataset. A direct method of ligating the termini of single stranded DNA post-bisulfite treatment in quantitative yield is of significant interest.
Rather than analyse the entire nucleic acid content of a sample, it is often of interest to reduce the complexity of the sample by targeting certain regions. Targeted enrichment is a method used commonly in genomic and epigenomic analysis to reduce the complexity of the genome being studied and to home in on specific regions of interest (e.g. exomes, CpGs, specific genes etc). This allows the cost of sequencing to be decreased dramatically and the complexity of analysis simplified. Methods exist in the art that utilise (including, but not limited to) PCR, multiplex PCR, hybridisation arrays and in-solution hybridisation capture to achieve targeted enrichment. Targeted enrichment has been shown to be an effective and reliable alternative method to whole genome analysis in situations where only a fraction of the genome needs to be interrogated or where experiments are done at such scale (numbers of samples) that to do sequencing analysis at a whole genome scale becomes cost prohibitive.
Conventional targeted enrichment methods are limited in certain ways, depending on the method. All prior art enrichment methods for genomic DNA samples operate on double stranded samples. Bisulfite treatment results in single stranded nucleic acid samples. It is of interest to develop an improved targeted enrichment method which works on single stranded samples, such as those resulting from treating with bisulfite.
If using PCR, pairs of primers for each amplicon, or locus, are required and this can limit the complexity (number of locus amplicons targeted) of a single enrichment event; also if using primers with a 5'-flap non-complementary to the loci but complementary to the sequencing platform intended for analysis, this can simplify the workflow but decrease primer specificity, increase cost of the targeting primers and decrease the complexity of targeting. If using hybridisation arrays or in-solution capture methods, typically these target adapted fragments prepared from double stranded (ds) DNA and as such limit their usefulness to native DNA applications. Methods do exist that allow targeted enrichment of bisulfite converted DNA (e.g. Agilent Methyl-Seq and Roche Nimblegen SeqCapEpi product lines) but both of these require the pre-conversion adaptation of dsDNA and PCR amplification which results in decreased library diversity, bias and a sub-optimal quality result.
Multiplex PCR is limited to ca. 6000-plex per reaction, limiting the multiplexing capacity. Using too many primer pairs introduces the potential for non-specific hybridisation leading to off target enrichment. Sophisticated variations of multiplex PCR exist (Padlock probes, Agilent Haloplex etc) that address this point, but still only make modest improvements in specificity (<100,000-plex). In-solution target capture (hybridisation pull-down) methods utilise 200-300,000 probes and work robustly for target isolation. However, as bisulfite converts C to U, the top and bottom strands of bisulfite converted DNA are no longer complementary, meaning twice as many probes are required in order to compare bisulfite and non-bisulfite treated samples. This means that in order to get methylation information from the same region from both the converted top and converted bottom strands, hybridisation methods need two pairs of primers to get full information from the same region.
Methods described herein can utilise a single primer per locus (not a pair) which should allow improvements in specificity and enable more loci to be targeted within a single reaction. Only 2 primers per region will give information on the top and bottom converted strand of a selected region.
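The loss of strand complementarity on bisulfite conversion can be illustrated in silico. The following Python sketch is a toy model only: the sequence is invented, every cytosine is assumed to be unmethylated (so every C is converted and read as T), and the function names are illustrative.

```python
# Toy model: DNA strands as 5'->3' strings over A, C, G, T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(seq):
    """Assume no methylation: every C becomes U, which is read as T."""
    return seq.replace("C", "T")


# Hypothetical duplex: top strand and its exact reverse complement.
top = "ACGTTCGGCAATCG"
bottom = reverse_complement(top)

# Each strand is converted independently once the duplex is denatured.
top_bs = bisulfite_convert(top)
bottom_bs = bisulfite_convert(bottom)

# Before conversion the strands pair perfectly; after conversion they do not,
# so a probe or primer designed against one converted strand cannot serve
# for the other.
still_complementary = reverse_complement(top_bs) == bottom_bs

if __name__ == "__main__":
    print("top (converted):   ", top_bs)
    print("bottom (converted):", bottom_bs)
    print("still complementary:", still_complementary)
```

In this model each converted strand needs its own targeting sequence, which is consistent with the two primers per region referred to above.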
Furthermore, hybridisation pull-down methods are inefficient. Working with low mass samples can be a problem simply due to liquid handling losses and multiple transfer steps. In order to analyse a small subset of fragments (e.g. the 1-2% of the genome represented by the exome), the vast majority of the sample remains in solution, so the amount of material required as the input sample is high as most of the sample is not captured. Elution of captured DNA from beads is also inefficient. Methods described herein can be used as a one-pot reaction that should offset these issues and will not be reliant on bead purifications and pulldowns.
Assays requiring probe arrays have limited flexibility regarding the choice of sample, or the choices of locus to be studied. The Illumina™ Human Methylation 450K Array is an example of a targeted enrichment array format that is well utilised by the epigenetics research community, but has several drawbacks. It is limited to a physical array format which yields non-digital data and makes sample multiplexing at high number cumbersome and dependent on sophisticated automation. The method is only compatible with Human samples, limiting its usefulness. The number of probes and their specificity is high (~480,000) but the method relies on whole genome amplification using random priming and as such is potentially biased.
Methods described herein can be sequencing based and so would yield digital data, are agnostic to species (target primer pools could be designed to any species) and there is less inertia required to prepare multiple different targeting pools than is the case for array-based techniques. Sample multiplexing can be significantly simpler, and uses approaches common in the art such as molecular barcodes (tags) attached to identify different populations of primers. PCR is inherently biased, particularly when amplifying bisulfite converted (AT-rich) DNA. Data generated by methods dependent on PCR will necessarily be of lower quality and more biased than those which do not employ PCR. Methods described herein would work with or without PCR, so not only can the effect of PCR be evaluated, it can be eliminated as a source of doubt in the targeting experiments.
Amplification of regions of greater than around 400bp from bisulfite converted DNA is difficult due to the fragmentation induced by the bisulfite conversion process. This fragmentation is random, but decreases the molarity of fragments in the pool that remain intact in the region of interest. Higher concentration of template is typically required in the amplicon reactions in order to account for this, which decreases the specificity of targeting and increases the sample mass burden required per amplicon reaction (which can be a problem for precious samples). The same is true for methods that depend on targeting a converted adapted NGS library. The conversion process fragments the library, decreasing the molarity of intact fragments (DNA inserts flanked by two universal primer regions) which decreases the diversity (number of unique fragments) in a given sample. This is a potential source of bias when such samples are subjected to PCR as this can lead to the generation of duplicates and the amplification of a low-diversity sub-set of starting fragments. This would have an adverse effect on modification quantitation. Methods described herein solve these problems as the fragmented samples are used directly in the hybridisation, and the adaptors are attached after the fragmentation step.
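The effect of random fragmentation on the amount of intact template can be illustrated with a simple model. The sketch below assumes, purely for illustration, that each internucleotide bond breaks independently with the same probability. Neither this model nor the example break probability is taken from the experiments described herein, and the function name is hypothetical.

```python
def intact_fraction(region_length, break_probability):
    """Fraction of template molecules with no break inside a region.

    Assumes each of the (region_length - 1) bonds inside the region breaks
    independently with probability break_probability. This is a modelling
    assumption for illustration, not a measured property of bisulfite
    treatment.
    """
    if region_length < 1:
        raise ValueError("region_length must be at least 1 nucleotide")
    if not 0.0 <= break_probability <= 1.0:
        raise ValueError("break_probability must lie between 0 and 1")
    return (1.0 - break_probability) ** (region_length - 1)


if __name__ == "__main__":
    # Hypothetical per-bond break probability, chosen only for illustration.
    p = 0.005
    for length in (100, 200, 400, 800):
        print(f"{length} nt region: {intact_fraction(length, p):.3f} intact")
```

Under this assumption the intact fraction falls geometrically with the length of the region, which is in line with the difficulty of amplifying longer regions and with the loss of fragments flanked by two adaptors when conversion follows adaptation.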
Summary of the invention
Methods described herein allow for the targeted enrichment of converted, single stranded fragments derived directly from genomic DNA, RNA or alternative samples via amplification or selection of mono-adapted fragments. The method relies on the attachment of an adaptor to one end of each of the fragments to make a mono-adapted sample library, which can be targeted using locus specific primers. Hybridisation of locus specific primers with the fragmented sample and extension of the primers hybridised to the sample results in selection of a subset of the mono-adapted fragments, which can then be amplified using the locus specific primers and a primer having the same sequence as part or all of the adaptor. Only the fragments arising from the successful hybridisation and extension of the locus specific primers are then analysed further, improving the specificity of targeting and simplifying the complexity of the targeting selection and isolation. Further benefits include an increase in the number of locus specific probe oligos that can be used in a single targeting pool compared to alternative methods, allowing more features to be targeted, generating a richer dataset with improved resolution and quality. Methods described herein will yield a significant competitive advantage in data quality and cost per data point in the current epigenomic marketplace. The method can work robustly with a large number of loci targeting probes, and results in a high level of selection. The method improves the recovery of samples as the fragmentation is carried out before attachment of the adaptor. Attaching adaptors before fragmentation causes the majority of fragments to lose the adaptor. The methods described herein are highly efficient in accurately selecting the desired regions from a low amount of input material, and are therefore advantageous over prior art methods involving the targeted selection of single stranded samples.
Disclosed is a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
a. fragmenting the sample to produce a population of single stranded sample fragments;
b. attaching an oligonucleotide adaptor to the 3' end of each of the strands in the sample wherein the adaptor is a hairpin containing a cleavage site;
c. extending the 3' end of the hairpin to make the fragments double stranded;
d. cleaving the hairpin to produce single stranded copies of the fragments having an adaptor fragment at one end, where the adaptor fragment is a portion of the hairpin;
e. hybridising a population of locus specific primers to a subset of the fragment copies;
f. extending the hybridised locus specific primers to produce extension products which are a subset of the fragments where each extension product includes a copy of the adaptor fragment;
g. hybridising a single primer to the copy of the adaptor fragment and extending each of the extension products using the single primer, where steps f and g occur simultaneously or sequentially; and
h. repeating steps f and g one or more times.
Disclosed is a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
a. taking a population of single stranded nucleic acid molecules;
b. attaching an oligonucleotide adaptor to the 3' end of each of the strands in the sample;
c. hybridising a population of locus specific primers to a subset of the fragments or a copy thereof;
d. extending the hybridised locus specific primers to produce extension products which are a subset of the fragments where each extension product includes a copy of the adaptor;
e. hybridising a single primer to the copy of the adaptor and extending each of the extension products using the single primer, where steps d and e occur simultaneously or sequentially; and
f. repeating steps d and e one or more times.
The single stranded sample fragments have an adaptor attached to only one end prior to hybridising the population of locus specific primers. The samples are fragmented prior to attachment of the adaptors, or the sample originates as a single stranded population of nucleic acids such as RNA or degraded samples such as FFPE samples or cell free nucleic acid fragments.
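The selection achieved by steps a to f above can be illustrated with a toy string model. The following Python sketch is not an implementation of the method: the adaptor, fragment and primer sequences are invented, annealing is modelled as an exact match, and the locus specific primers are hybridised to a copy of each mono-adapted fragment. In this linear-adaptor model the single fixed primer is taken to be the reverse complement of the adaptor, which is a simplification made for the sketch.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# All sequences below are hypothetical and chosen only for illustration.
ADAPTOR = "GGATCCGATTAC"
FRAGMENTS = ["TTGACCATGCAAGT", "CCGTATAGGCTTAA", "AAGGTCTCAGTACC"]
LOCUS_PRIMERS = ["CATGC", "TCTCAG"]


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# In this toy model the single fixed primer anneals to the adaptor copy.
FIXED_PRIMER = reverse_complement(ADAPTOR)


def attach_adaptor(fragment):
    """Attach the adaptor to the 3' end only, giving a mono-adapted strand."""
    return fragment + ADAPTOR


def extend_locus_primer(primer, template):
    """Anneal a primer to the template and extend it to the template 5' end.

    Returns the extension product, or None if the primer finds no site.
    """
    site = reverse_complement(primer)
    position = template.find(site)
    if position == -1:
        return None
    # Everything 5' of the annealing site on the template is copied.
    return primer + reverse_complement(template[:position])


def enrich(fragments, primers):
    """Return extension products for fragments matched by any locus primer."""
    products = []
    for fragment in fragments:
        # Copy of the mono-adapted fragment; adaptor complement at its 5' end.
        template = reverse_complement(attach_adaptor(fragment))
        for primer in primers:
            product = extend_locus_primer(primer, template)
            if product is not None:
                products.append(product)
    return products


if __name__ == "__main__":
    for product in enrich(FRAGMENTS, LOCUS_PRIMERS):
        can_amplify = reverse_complement(FIXED_PRIMER) in product
        print(product, "| fixed primer site present:", can_amplify)
```

Only fragments containing a primer site yield an extension product, and every product ends in the adaptor sequence, so one fixed primer together with the locus specific primers suffices for the subsequent amplification in this model.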
Without a step of amplification prior to hybridisation and enrichment, the amount of material available for enrichment can be too limited. Therefore also included herein is an optional amplification step prior to the hybridisation of the locus primers. Included herein is a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
a. taking a population of single stranded nucleic acid molecules,
b. attaching a first oligonucleotide adaptor to the 3' end of each of the strands in the sample,
c. making the population of single stranded molecules double stranded,
d. attaching a second adaptor to the double stranded molecules,
e. amplifying the double stranded molecules,
f. denaturing the double stranded molecules,
g. hybridising a population of locus specific primers to a subset of the denatured molecules,
h. extending the hybridised locus specific primers to produce extension products where each extension product includes a copy of the first adaptor; and
i. amplifying the extension products.
Disclosed herein is a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
a. taking a first population of single stranded nucleic acid molecules;
b. attaching a first oligonucleotide adaptor to the 3' end of each of the strands in the sample;
c. making the first population of single stranded molecules double stranded;
d. either,
i) attaching a second adaptor to the double stranded molecules, amplifying the double stranded molecules and denaturing the double stranded molecules to produce a second population of single stranded molecules; or
ii) cleaving the first adaptor and removing the first population of single stranded nucleic acid molecules to produce a second population of single stranded molecules;
e. hybridising a population of locus specific primers to a subset of the second population of single stranded molecules;
f. extending the hybridised locus specific primers to produce extension products which are a subset of the fragments where each extension product includes a copy of the first adaptor or a fragment thereof;
g. hybridising a single primer to the copy of the adaptor and extending each of the extension products using the single primer, where steps e and g occur simultaneously or sequentially; and
h. repeating steps e, f and g one or more times.
The sample can originate as a single stranded sample, or the sample can be fragmented. Single stranded samples may include RNA, including mRNA, siRNA or micro RNA. Alternatively the sample may be fully or partially in single stranded form due to degradation or due to the harsh processes used in fixing or unfixing the samples. The method works well on samples such as FFPE degraded samples, or on samples of ancient DNA such as Neanderthal DNA.
Fragmentation of the sample can be caused by bisulfite treatment. Bisulfite treatment converts non-methylated cytosine bases to uracil bases, thereby reducing the number of C bases in the sample. Where the sample is bisulfite treated, the locus specific primers can contain a higher level of the three nucleic acid bases A, C and T such that the locus is complementary to a sample treated with bisulfite (i.e. the primers can contain little or no G). Alternatively the sample can be copied initially and therefore the locus specific primers can contain the three nucleic acid bases A, G and T such that the locus is equivalent to a sample treated with bisulfite (i.e. the primers can contain little or no C). In order to identify sites of cytosine methylation, the locus can contain all four bases. The population of locus specific primers can contain both primers with A, C and T and primers with A, C, G and T. Primers having both G and T in equivalent positions can be used to identify specific methylation sites if required. The population of locus specific primers can contain both primers with A, G and T and primers with A, C, G and T. Primers having both C and A in equivalent positions can be used to identify specific methylation sites if required. Alternatively the methylation sites of interest may be in the extension region adjacent to the 3' end of the primers.
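The base composition of the two kinds of locus specific primer can be illustrated as follows. This Python sketch assumes a fully unmethylated target, so that every C in the target is converted and read as T; the target sequence and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(seq):
    """Assume no methylation: every C becomes U, which is read as T."""
    return seq.replace("C", "T")


def primer_for_converted_sample(target):
    """Primer complementary to the converted target (A, C and T only)."""
    return reverse_complement(bisulfite_convert(target))


def primer_for_copied_sample(target):
    """Primer equivalent to the converted target (A, G and T only).

    Such a primer would anneal to a copy of the converted strand.
    """
    return bisulfite_convert(target)


# Hypothetical unconverted target region, written 5'->3'.
TARGET = "GATCGGCATTCGA"

if __name__ == "__main__":
    print("complementary primer:", primer_for_converted_sample(TARGET))
    print("equivalent primer:   ", primer_for_copied_sample(TARGET))
```

Where a cytosine in the target is methylated it would be retained after conversion, so a primer spanning that position would need the fourth base at that site. That case is deliberately left out of the sketch.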
There is no particular limit to the number of loci analysed. The method can work robustly with millions of loci targeting probes. The population of locus specific primers can be at least 50 different sequences. The population of locus specific primers can be at least 100 different sequences. The population of locus specific primers can be at least 1000 different sequences. The population of locus specific primers can be at least 10,000 different sequences. The population of locus specific primers can be at least 100,000 different sequences. The population of locus specific primers can be at least 1,000,000 different sequences.
The population of primers can contain a universal region common to all primers. The universal region is preferably not complementary to the sample of interest. The universal region can be located at the 5' end of the primer. The non-sample complementary region may include an identifier region or tag to enable sample multiplexing.
Any method of attaching an oligonucleotide adaptor can be used. The method should be capable of the template independent joining of two single stranded oligonucleotides. The method described herein can add a single stranded oligonucleotide with a 3' hydroxyl to a single stranded oligonucleotide with a 5'-triphosphate moiety. In one embodiment, the adaptor can be a hairpin, optionally having a cleavage site.
The method of attachment of an oligonucleotide adaptor should add a single oligonucleotide adaptor to the fragment. Polymerisation of the oligonucleotide adaptor is undesirable for subsequent analysis of the samples. Thus the adaptor should not carry an extendable 3' hydroxyl group, thereby ensuring a single adaptor is added. Methods of attaching adaptors include the use of ligases or polymerases. The method may involve the use of splints in order to turn the ends of the single stranded fragments partially double stranded. The splint and/or adaptor may be DNA, RNA or a mixture thereof.
Where the adaptor is a hairpin, a first extension reaction extends the 3' end of the hairpin adaptor to produce a copy of the fragment. The single stranded fragments having a hairpin attached thereto are thus made double stranded.
The hairpin may contain a specific cleavage location, cleavage of which allows the strands to be separated, thus producing mono-adapted single strands.
An optional amplification can be carried out using a second adaptor prior to the selection using the locus specific primers. Such a method with and without the optional amplification is shown in figure 4. Cleavage and amplification are two alternative ways of arriving at a single stranded sample for selection. The single stranded sample for selection can be a copy of the first single stranded sample in the absence of the first single strands, or can contain the denatured first single stranded sample and strands which are copies thereof.
A second extension, or amplification, can be carried out using locus specific primers, thus selecting and optionally amplifying, the desired fragments, each fragment already being mono-adapted or bi-adapted.
Where the adaptor is single stranded, an extension, or amplification, can be carried out using locus specific primers, thus selecting and optionally amplifying, the desired fragments, each fragment already being mono-adapted or bi-adapted.
The extended locus specific primers can be copied, and therefore amplified, using a 'fixed' primer common to all fragments, having a sequence in common with a portion of the hairpin adaptor.
The extension reactions may be carried out using a nucleic acid polymerase. The extension reactions may be carried out using nucleotide triphosphates. The extensions can be carried out using four nucleotide triphosphates. The hybridisation and extension cycles can be repeated. The primer hybridisation can be carried out in a single cycle, or the locus specific hybridisation and subsequent extension can be repeated, for example using thermocycling before the fixed anchor primer is then added to initiate exponential amplification.
The extended locus hybridised strand should terminate in a 3'-hydroxyl group. Thus the extended primers have a newly generated 3' end, the 3'-hydroxyl group resulting from an incorporated nucleotide (from the dNTP). The 3'-hydroxyl group in the extended strand may terminate at the cleavage point of the hairpin adaptor. The 3' end of the extended strand is thus complementary to a portion of the hairpin adaptor. The 3' end of the extended strand may be at the 5' end of the single stranded adaptor.
The adaptor can be attached using a ligase. Alternatively, the adaptor can be attached using a polymerase. The polymerase may be a template independent polymerase. The template independent polymerase can be terminal transferase (TdT). The template independent polymerase can be polyadenylate polymerase (PAP).
In order to be attached using a polymerase, the adaptor can contain a triphosphate moiety. The adaptor can contain a region of known sequence, allowing amplification via hybridisation to a portion of the adaptor, or a copy thereof. The mono-adapted templates can be amplified using a fixed primer at one end, and the locus specific primers at the other end. Thus rather than requiring 2n primers to amplify n loci, the number of primers required is n + 1. For example, 10,000 loci can be amplified using 10,001 primers rather than 20,000 primers.
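The primer-count saving can be checked with a few lines of arithmetic. The following is a minimal sketch only; the function name `primers_required` and its `mono_adapted` flag are illustrative names and are not taken from the disclosure.

```python
def primers_required(n_loci: int, mono_adapted: bool = True) -> int:
    """Number of distinct primers needed to amplify n_loci loci.

    mono_adapted=True  : one locus specific primer per locus plus a single
                         fixed primer shared by all fragments (n + 1).
    mono_adapted=False : a conventional primer pair per locus (2n).
    """
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    if mono_adapted:
        return n_loci + 1
    return 2 * n_loci


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, primers_required(n), primers_required(n, mono_adapted=False))
```

For large panels the fixed primer is negligible, so the number of primers is roughly halved.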
The first or second extension and/or amplification reaction may be carried out using a nucleic acid polymerase. The extension reaction may be carried out using nucleotide triphosphates. The extension can be carried out using four nucleotide triphosphates. The four nucleoside triphosphates may be dCTP, dATP, dGTP and dTTP. The use of dTTP allows strands made using dUTP in the first extension to be cleaved, leaving solely the second extension strands intact. The extension strands may be treated to remove the uracil bases, thereby allowing selective strand cleavage. The extension strands may be treated with the enzyme uracil-DNA glycosylase (UDG), resulting in abasic sites, which can then be cleaved, for example using an endonuclease, heat treatment or increasing the pH of the buffer. The products from the amplification contain an end having a fixed sequence derived from the adaptor. If the locus specific primers have a fixed as well as a variable region, then the resultant amplified fragments can be further amplified and analysed using a standard pair of fixed primers. Alternatively a further adaptor can be attached to the 'mixed' or locus ends of the amplified fragments. The adaptor can be attached using a ligase. Alternatively, the adaptor can be attached using a polymerase, for example a template independent polymerase. The template independent polymerase can be terminal transferase (TdT). Thus a second (or third) adaptor, or 'fixed sequence', can be attached to the copied fragments. The second (or third) fixed sequence adaptor can therefore either be present as part of each of the variable locus primers, or attached after the amplification/selection steps.
Where the sample is amplified before targeted selection, the inventors have found it important to use a second adaptor that is different from the 'fixed sequence' adaptor end on the loci primers. If the 'amplification' adaptor and fixed sequence adaptor are the same, cross hybridisation occurs between the loci primer ends and the library adaptors. This causes nonspecific binding of probes and a large number of off-target reads, which dilutes the enrichment. Whilst this can be overcome to some extent by supplementing the hybridisation mix with "suppression oligos" that are complementary to the library adaptor sequences, this increases the complexity of the mix. The inventors herein have developed an alternative solution by using a different adaptor for the second adaption step that is unrelated to the loci-end. This means suppression oligos are not required.
The method results in products having known regions at either end and copies of the nucleic acid sample fragments centrally between the known ends. The only fragments having known adaptors are ones selected using the locus hybridisation and extension. Fragments not having the desired locus remain mono-adapted and are thus lost from the analysis. The copied fragments can therefore be amplified using primers complementary to the two adaptors or copies thereof. A sequencing step can be carried out on the amplified mixture.
In order to use a template independent polymerase, the adaptor may contain a triphosphate moiety. The triphosphate moiety may attach to the 3'-hydroxyl of the fragmented sample strands. The triphosphate moiety may be attached to the 5'-end of the nucleic acid adaptor via a linker. The linker may contain a nucleotide having a ribose or deoxyribose moiety, with the oligonucleotide adaptor attached via the nucleotide base. The adaptor, or oligonucleotide 5'-triphosphate adaptor, may be single stranded or double stranded. The double stranded adaptor has at least one overhanging single stranded region, and may have two or three overhanging single stranded regions. At least one of the overhangs serves to hybridise to the end of the single stranded fragments to which the adaptor is to be attached, and acts as a site which can undergo polymerase extension to make the attached single stranded fragments double stranded. The adaptors can be 'forked' adaptors having regions which are non-complementary as well as regions which are complementary.
The adaptor may take the form of a hairpin. A hairpin is a nucleic acid sequence containing both a region of single stranded sequence (a loop region) and regions of self-complementary sequence such that an intra-molecular duplex can be formed under hybridising conditions (a stem region). The stem may also have a single stranded overhang. The overhang may contain a degenerate sequence or may be a region of known sequence.
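The stem and loop structure described above can be illustrated with a small check for self-complementarity. This is a sketch under stated assumptions: sequences are written 5' to 3', only Watson-Crick pairs are considered, and the function name `stem_length`, the `min_loop` default and the example sequences are all hypothetical rather than taken from the disclosure.

```python
# Watson-Crick partners; U is treated as T before lookup.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def stem_length(hairpin: str, min_loop: int = 3, three_prime_overhang: int = 0) -> int:
    """Count how many bases at the 5' end pair with bases at the 3' end.

    hairpin              : sequence written 5' -> 3'
    min_loop             : unpaired bases that must remain in the loop
    three_prime_overhang : single stranded bases at the 3' end to ignore
    """
    seq = hairpin.upper().replace("U", "T")
    if three_prime_overhang:
        seq = seq[: len(seq) - three_prime_overhang]
    max_stem = max(0, (len(seq) - min_loop) // 2)
    n = 0
    # Walk inwards from both ends while the outer bases are complementary.
    while n < max_stem and PAIR.get(seq[n]) == seq[len(seq) - 1 - n]:
        n += 1
    return n


if __name__ == "__main__":
    # 5-base stem (GCGCA / TGCGC) around a 4-base loop (TTTT).
    print(stem_length("GCGCATTTTTGCGC"))
    # Same hairpin with a 4-base degenerate 3' overhang.
    print(stem_length("GCGCATTTTTGCGCNNNN", three_prime_overhang=4))
```

A return value of zero indicates that no intra-molecular duplex can form from the ends of the sequence as written.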
Also disclosed herein is a method comprising attaching a single stranded adaptor to one end of a library of single stranded nucleic acid fragments, amplifying a subset of the mono-adapted fragments using a primer mixture containing a plurality of locus specific primers and a primer complementary to the adaptor.
Disclosed herein is a method of joining a first single stranded oligonucleotide and a single stranded oligonucleotide adaptor using a template independent nucleic acid polymerase enzyme and selectively amplifying a subset of the joined products using a primer mixture containing a plurality of locus specific primers and a primer complementary to the adaptor.
Also disclosed herein is a method comprising attaching a hairpin to one end of a library of single stranded nucleic acid fragments, amplifying a subset of the mono-adapted fragments using a primer mixture containing a plurality of locus specific primers and a primer complementary to a portion of the hairpin.
Disclosed herein is a method of joining a first single stranded oligonucleotide and an oligonucleotide adaptor using a template independent nucleic acid polymerase enzyme and selectively amplifying a subset of the joined products, wherein the oligonucleotide adaptor takes the form of a hairpin having a single stranded region and a region of self-complementary double stranded sequence capable of forming a duplex under hybridising conditions.
The hairpin adaptor has an extendable 3' end. Thus extension of the 3' end allows copying of the adapted fragments. The method may include a step of using a nucleic acid polymerase to extend the 3'-end of the oligonucleotide adaptor to produce a copy of the fragments having the hairpin adaptor attached thereto. The first single stranded oligonucleotides are fragments derived from a nucleic acid sample. The fragments can be obtained using chemical or enzymatic cleavage of the sample. The fragments can be obtained using bisulfite treatment. The fragments can be obtained as products of the cross-link reversal chemistry employed during DNA extraction from FFPE fixed tissue. The fragments can be obtained as products of aged and heavily degraded samples.
If the fragments of the sample average say 200-400 bases in length, the locus hybridisation and extension may give rise to fragments having say 100-200 bases of unknown sequence. Attachment of the adaptor, followed by extension, gives rise to products having say 100-200 base pairs of double stranded sequence, linked at one end by a loop of single stranded sequence from the hairpin adaptor. The extension should give rise to a blunt-ended product, including the complement of the locus primer and any universal region attached thereto. If desired, a further adaptor can be attached to the end of the extended copy. The universal region of the locus primers, and/or the further adaptor, results in products having known regions at either end and copies of the nucleic acid sample fragments centrally between the known ends. The copied fragments can therefore be amplified using primers complementary to the two adaptors/universal regions or copies thereof. A sequencing step can be carried out on the amplified mixture.
Disclosed herein is also a population of locus specific nucleic acid primers wherein each member of the population contains a common universal sequence and one of a plurality of locus-specific regions wherein each locus specific region contains only the nucleic acid bases A, G and T such that the locus is complementary to copies of a sample treated with bisulfite. Bisulfite treatment results in a sample having very few residual C bases. Copies of the fragments have very few G bases, and so the primers can be free of the C nucleotide. The primer can contain C bases in the universal region, but the primer regions which are locus specific and which vary between different members of the population can be C-free (i.e. contain only A, G and T).
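The C-free property of the locus specific region can be verified computationally for a primer pool. The sketch below assumes primers are written 5' to 3' with the universal region first (the universal region is stated elsewhere in the description to lie on the 5' side of the locus region). The function name and both example sequences are hypothetical and are used for illustration only.

```python
def locus_region_is_c_free(primer: str, universal_len: int) -> bool:
    """Return True if the locus specific region contains only A, G and T.

    primer        : full primer written 5' -> 3'
    universal_len : number of 5' bases that make up the universal region,
                    which is allowed to contain C
    """
    locus = primer[universal_len:].upper()
    if not locus:
        raise ValueError("primer has no locus specific region")
    return set(locus) <= set("AGT")


if __name__ == "__main__":
    universal = "CCGTACCGTACCGTACCGTA"   # hypothetical universal region, may contain C
    locus = "GATTTGAGGTTAAGTTTGAG"       # hypothetical C-free locus region
    print(locus_region_is_c_free(universal + locus, len(universal)))
```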
The primers can be designed according to any one or more of the following criteria: approximately 50 bases in length, for example 40-60 bases; melting temperature approximately 55°C, for example 50-60°C; no CpG dinucleotides in the 3'-most 20 bases at the head of the loci primer; 3 or fewer CpG dinucleotides in the 5'-most 30 bases at the tail of the loci primer.
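The design criteria above lend themselves to an automated filter. The following is a minimal sketch, assuming the primer is written 5' to 3' and that the melting temperature is supplied by the caller, since no melting temperature model is specified here. The function names, the dictionary keys and the example sequence are hypothetical.

```python
def count_cpg(seq: str) -> int:
    """Count CpG dinucleotides (C immediately followed by G) in a 5' -> 3' sequence."""
    return seq.upper().count("CG")


def check_locus_primer(primer: str, tm_celsius: float) -> dict:
    """Evaluate a locus primer against the four example criteria.

    tm_celsius is provided by the caller; how it is calculated is left open.
    """
    primer = primer.upper()
    head = primer[-20:]   # 3'-most 20 bases
    tail = primer[:30]    # 5'-most 30 bases
    return {
        "length_40_to_60": 40 <= len(primer) <= 60,
        "tm_50_to_60": 50.0 <= tm_celsius <= 60.0,
        "no_cpg_in_head": count_cpg(head) == 0,
        "at_most_3_cpg_in_tail": count_cpg(tail) <= 3,
    }


if __name__ == "__main__":
    # Hypothetical 50-base primer: 30-base tail with three CpGs, 20-base CpG-free head.
    example = "ACGTTAGGACGATTAGGTTACGATTGGATT" + "GATTTGAGGTTAAGTTTGAG"
    for name, ok in check_locus_primer(example, tm_celsius=55.0).items():
        print(name, ok)
```

Because the criteria may be applied individually ("any one or more"), the function reports each result separately rather than returning a single pass or fail value.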
Disclosed herein is also a kit for use in selecting fragments from a nucleic acid sample, the kit comprising a plurality of locus specific primers, a hairpin adaptor and a single primer complementary to a portion of the hairpin adaptor or a copy thereof. The hairpin polynucleotide may have a triphosphate moiety at the 5'-end. The kit may further include a terminal transferase. Other components, including instructions, can be added to the kit as described herein. The kit may contain the plurality of locus specific primers having only A, G and T bases, as described above. The kit may be suitable for labelling both DNA and RNA fragments. The kit may contain two template independent polymerases. The kit may contain both terminal transferase and polyadenylate polymerase (PAP) or polyU polymerase (PUP).
Detailed description
The method herein describes a number of features different to prior art methods of nucleic acid selection. These include:
The nucleic acid sample is either fully or partially single stranded or is fragmented as a first part of the process. At this stage the sample may be the native biological sample (for example raw genomic DNA, RNA, micro-RNA). The sample has not undergone any amplification or adaptor attachment steps prior to fragmentation, so the potential for selection or amplification bias is reduced.
The nucleic acid molecules or sample fragments are single stranded, and undergo a step of attaching an adaptor selectively at one end to produce mono-adapted fragments. Such mono-adapted fragments have one end of fixed, known sequence from the adaptor, and one variable, unknown end from the sample mixture. Bisulfite treatment of a sample which already contains the adaptors means that the majority of the samples which contain adaptors at both ends are lost. The majority of the sample fragments will not contain two known ends, and therefore cannot be subsequently amplified. It is therefore advantageous that the adaptors are attached after the fragmentation step.
In one method of the invention, the locus specific primers are hybridised directly to the mono-adapted fragmented sample. There are no amplification steps such as whole genome amplification prior to selection. There is therefore no possibility of amplification bias, or mis-priming to portions of any amplification primers, as no amplification primers are present prior to the point the primers are hybridised.
Alternatively the sample can be amplified prior to or as part of the selection in order to obtain sufficient material to select.
Disclosed herein is a method comprising attaching a single stranded adaptor to one end of a library of single stranded nucleic acid fragments, amplifying a subset of the mono-adapted fragments using a primer mixture containing a plurality of locus specific primers and a primer complementary to the adaptor.
Disclosed herein is a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
a. taking a first population of single stranded nucleic acid molecules;
b. attaching a first oligonucleotide adaptor to the 3' end of each of the strands in the sample;
c. making the first population of single stranded molecules double stranded;
d. either,
i) attaching a second adaptor to the double stranded molecules, amplifying the double stranded molecules and denaturing the double stranded molecules to produce a second population of single stranded molecules; or
ii) cleaving the first adaptor and removing the first population of single stranded nucleic acid molecules to produce a second population of single stranded molecules;
e. hybridising a population of locus specific primers to a subset of the second population of single stranded molecules;
f. extending the hybridised locus specific primers to produce extension products which are a subset of the fragments where each extension product includes a copy of the first adaptor or a fragment thereof;
g. hybridising a single primer to the copy of the adaptor and extending each of the extension products using the single primer, where steps e and g occur simultaneously or sequentially; and
h. repeating steps e, f and g one or more times.
Without a step of amplification prior to hybridisation and enrichment, the amount of material available for enrichment can be too limited. Therefore also included herein is an optional amplification step prior to the hybridisation of the locus primers. Included herein is a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
a. taking a population of single stranded nucleic acid molecules,
b. attaching a first oligonucleotide adaptor to the 3' end of each of the strands in the sample,
c. making the population of single stranded molecules double stranded,
d. attaching a second adaptor to the double stranded molecules,
e. amplifying the double stranded molecules,
f. denaturing the double stranded molecules,
g. hybridising a population of locus specific primers to a subset of the denatured molecules,
h. extending the hybridised locus specific primers to produce extension products where each extension product includes a copy of the first adaptor; and
i. amplifying the extension products.
Disclosed herein is a method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising the steps of;
a. taking a population of single stranded nucleic acid molecules;
b. attaching an oligonucleotide adaptor to the 3' end of each of the strands in the sample;
c. hybridising a population of locus specific primers to a subset of the fragments or a copy thereof;
d. extending the hybridised locus specific primers to produce extension products which are a subset of the fragments where each extension product includes a copy of the adaptor;
e. hybridising a single primer to the copy of the adaptor and extending each of the extension products using the single primer, where steps d and e occur simultaneously or sequentially; and
f. repeating steps d and e one or more times.
Disclosed herein is a method comprising attaching a hairpin to one end of a library of single stranded nucleic acid fragments, amplifying a subset of the mono-adapted fragments using a primer mixture containing a plurality of locus specific primers and a primer complementary to a portion of the hairpin.
The method may include the steps of
a. fragmenting the sample to produce a population of single stranded sample fragments;
b. attaching an oligonucleotide adaptor to the 3' end of each of the strands in the sample wherein the adaptor is a hairpin containing a cleavage site;
c. extending the 3' end of the hairpin to make the fragments double stranded;
d. cleaving the hairpin to produce single stranded copies of the fragments having an adaptor fragment at one end, where the adaptor fragment is a portion of the hairpin;
e. hybridising a population of locus specific primers to a subset of the fragment copies;
f. extending the hybridised locus specific primers to produce extension products which are a subset of the fragments where each extension product includes a copy of the adaptor fragment;
g. hybridising a single primer to the copy of the adaptor fragments and extending each of the extension products using the single primer, where steps f and g occur simultaneously or sequentially; and
h. repeating steps f and g one or more times.
Whilst other steps may additionally be included or interspersed into the method steps shown above, steps a-f or a-i should be carried out in the order shown.
The primers are loci specific, meaning that they hybridise to a single location in the nucleic acid sample of interest. The primers can be of different lengths in order to normalise melting temperatures. The loci specific portion can be of a length sufficient to ensure accurate hybridisation to a single location in the sample. The loci specific portion can be for example 30-60 bases in length. The adaptor region can be a further 10-40 bases in length. The primers can be between 40-100 bases in length. The loci region can be around 50 bases in length, and the adaptor can be for example 20 bases in length.
Fragmentation of the sample can be caused by bisulfite treatment. Bisulfite treatment converts non-methylated cytosine bases to uracil bases. The locus specific primers can be complementary to a copy of these fragments (thus have the same sequence as regions of the fragments). Where the sample is bisulfite treated and copies thereof produced via extension of an attached hairpin, the locus specific region can contain only the nucleic acid bases A, G and T such that the locus is complementary to copies of (i.e. the same as a region of) a sample treated with bisulfite. In order to identify sites of cytosine methylation, the locus can contain all four bases. The population of locus specific primers can contain both primers with A, G and T and primers with A, C, G and T. Primers having both C and A in equivalent positions can be used to identify specific methylation sites if required. Alternatively the methylation sites of interest may be in the extension region adjacent to the 3' end of the primer. The locus specific primers can be chosen bioinformatically to cover regions of interest. For bisulfite treated samples, the primers can be chosen to locate near to CpG islands or potential methylation sites of interest. Extension of the primers can be chosen to read through one or more CpG, CpH, CpA or CpN locations. Thus each primer can consist of A, G and T bases in the locus specific region, and A, C, G and T bases in the universal region. The universal region should be to the 5' side of the locus region such that the 3' end of the primer is hybridised and suitable for extension. Each locus can be close to a CpG location in the sample.
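The effect of bisulfite treatment on base composition can be modelled in silico. The sketch below is illustrative only: it writes the converted uracil as T (as it would read after copying), takes methylated positions as a caller-supplied set of zero-based indices, and uses a hypothetical function name and example sequence.

```python
def bisulfite_convert(seq: str, methylated: frozenset = frozenset()) -> str:
    """Return the sequence of a strand after bisulfite treatment.

    Non-methylated C is converted (C -> U, written here as T).
    C at any index listed in `methylated` is protected and stays C.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


if __name__ == "__main__":
    fragment = "ACGTCCGA"                       # hypothetical sample fragment
    fully_converted = bisulfite_convert(fragment)
    partly_protected = bisulfite_convert(fragment, frozenset({1}))
    print(fully_converted, set(fully_converted) <= set("AGT"))
    print(partly_protected)
```

A locus region having the same sequence as a fully converted stretch therefore contains only A, G and T, whereas a region spanning a protected (methylated) cytosine retains a C at that position.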
Bisulfite treatment causes the strands to be cleaved such that both the 5' and 3' ends of the strands carry phosphate groups; therefore the 3' end of the primers is the only free 3'-OH component in the reaction (other than the NTPs). In order to attach an adaptor to the 3' end, the phosphate groups can be removed, for example using a phosphatase or a kinase such as polynucleotide kinase (PNK).
Where the adaptor is a hairpin, upon extension with NTPs, the extended hairpin is the only oligonucleotide with a 3'-hydroxyl moiety. The hairpin can contain a cleavage site, cleavage of which can be used to make the sample single stranded upon denaturation of the cleaved samples. The cleavage site may be, for example, one or more uracil bases. The extended strand of the hairpin is in the correct orientation (i.e. adapted at the 5' end) for a locus specific primer to hybridise and extend. The original sample fragments, having the remainder of the hairpin at the 3' end, are lost from the mixture as they are not amplified.
The length of the extension products is generally determined by the length of the fragments and the position of hybridisation. The extension will continue either until the end of the adaptor or hairpin adaptor is reached or a site is reached which does not permit incorporation, either for example because the nucleoside is abasic, or a limited selection of nucleoside triphosphates (less than 4) is used. The length of the extension is not particularly significant, and can be for example on average 10-200 fragment bases per molecule plus the length of the hairpin adaptor.
The extension reaction may be carried out using a nucleic acid polymerase. The extension reaction may be carried out using nucleotide triphosphates. The extension can be carried out using four nucleotide triphosphates.
The extension process can be carried out via thermocycling such that multiple extension products are derived from each fragment. Alternative isothermal processes of denaturation and re-annealing can also be carried out such that the extended products are removed, and fresh primers are hybridised.
The location of the base(s) being analysed is determined by the identity of a specific primer. More than one base can be analysed per extended primer. The extension products can be used to analyse nucleotide changes, for example single nucleotide polymorphisms (SNPs) or methylation status, for example whether C bases have been converted to U upon bisulfite treatment. Furthermore the multiple base extensions can give information in relation to deletions or insertions of one or more bases.
The term loci specific or locus specific means that the primer hybridises selectively to a single location, or locus, in the nucleic acid sample. The method can be carried out using a large number of different primers which can be pooled prior to hybridisation with the sample. There is no particular limit to the number of loci analysed. The method can work robustly with millions of loci targeting probes. The population of locus specific primers can be at least 50 different sequences. The population of locus specific primers can be at least 100 different sequences. The population of locus specific primers can be at least 1000 different sequences. The population of locus specific primers can be at least 10,000 different sequences. The method can be carried out such that at least 100,000 locations (primers) are analysed per sample. The method can be carried out such that at least 200,000 locations (primers) are analysed per sample. The method can be carried out such that at least 300,000 locations (primers) are analysed per sample. The method can be carried out such that at least 400,000 locations (primers) are analysed per sample. The population of locus specific primers can be at least 1,000,000 different sequences.
The location(s) in the sample to be identified is/are in the vicinity of a unique primer. The base(s) to be interrogated should be at the 3'-side of the primer such that nucleotides can be incorporated complementary to the base(s) being analysed. The base(s) to be interrogated may be immediately 3' of the primer such that the first incorporation is being studied, or may be within 2-30 bases of the end of the primer. The interrogated bases can be in different locations for different primers.
The primers can be of different lengths in order to normalise melting temperatures. The primers can be between 15-100 bases in length. Primers having higher levels of A and T bases can be longer than primers having higher levels of C and G bases. The primers should be specifically hybridised at the temperature required for the polymerase extension.
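The relationship between base composition and primer length can be illustrated with a simple melting temperature estimate. The sketch below uses the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is a rough approximation intended for short oligonucleotides and is an assumption made purely for illustration; no melting temperature model is specified here, and the function names and example templates are hypothetical.

```python
from typing import Optional


def wallace_tm(seq: str) -> float:
    """Rough melting temperature estimate: 2 degC per A/T plus 4 degC per G/C."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def shortest_primer_reaching(template: str, target_tm: float, min_len: int = 15) -> Optional[str]:
    """Return the shortest 5' prefix of `template` whose estimate reaches target_tm.

    Returns None if even the full template falls short of the target.
    """
    for length in range(min_len, len(template) + 1):
        candidate = template[:length]
        if wallace_tm(candidate) >= target_tm:
            return candidate
    return None


if __name__ == "__main__":
    at_rich = "AT" * 30
    gc_rich = "GC" * 30
    print(len(shortest_primer_reaching(at_rich, 60.0)))
    print(len(shortest_primer_reaching(gc_rich, 60.0)))
```

Under this approximation an A/T-rich primer needs about twice as many bases as a G/C-rich primer to reach the same estimate, consistent with the statement that primers richer in A and T can be longer.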
The primers can be extended using a suitable nucleic acid polymerase. The polymerase may be a DNA polymerase. The polymerase may be active at room temperature, or may be a thermophilic polymerase. The temperature of the extension reaction can be chosen based on the desired specificity of the primer hybridisation reactions and the length of the primer sequences. The temperature of the extension reaction can be for example between 30-72 °C. The temperature of the extension reaction can be for example between 50-72 °C.
The nucleic acid samples are prepared as single stranded, which are then hybridised with the primers. The sample can contain a single adaptor at the 5' end prior to primer hybridisation. The hybridisation can be carried out by heating a population of double stranded fragments, thus melting them to be single strands, and allowing the mixture to cool. Alternatively the sample can be prepared as a single stranded sample without heat denaturing. In cases where the sample is exposed to bisulfite, the nucleic acid fragments in the sample will be single stranded. The fragments can be made single stranded by other chemical treatments, for example exposure to hydroxide. In the case of cleaved hairpins, the double stranded fragments may be made single stranded by heating or chemical treatment, either in the presence of the locus specific primers or prior to addition of the primers.
In order to sequence the extended primers, universal adaptors may be attached to the ends. The universal adaptors allow amplification using a single pair of primers complementary to the adaptor. Many methods exist for the preparation of samples of double-stranded DNA, for example for sequencing (e.g. Illumina TruSeq and NextEra, 454, NEBnext, Life Technologies etc).
After the extension, the sample may be processed in double stranded form or the sample may be treated (for example using heat denaturation) to give rise to a single stranded form.
The adaptor, or oligonucleotide 5'-triphosphate adaptor, may be single stranded or double stranded. The double stranded adaptor has at least one overhanging single stranded region, and may have two or three overhanging single stranded regions. The overhang serves to hybridise to the end of the extended locus primers to which the adaptor is to be attached, and acts as a site which can undergo polymerase extension to make the attached single stranded extended locus primers double stranded. The adaptors can be 'forked' adaptors having regions which are non-complementary as well as regions which are complementary.
Where the adaptor is hybridised to the fragments, the attachment of the adaptor can be carried out using a template dependent polymerase. Any polymerase suitable for the incorporation of a nucleotide triphosphate can be used. The adaptor can be thought of as a nucleotide triphosphate attached to an oligonucleotide duplex. Thus the adaptor carries its own template.
The adaptor or oligonucleotide 5'-triphosphate adaptor may have a region of self-complementarity such that the second oligonucleotide may take the form of a hairpin. The hairpin may have a 3'-overhang suitable for polymerase extension. The term single stranded therefore includes a single strand which is in part single stranded, and in part double stranded at certain temperatures, but which can be made single stranded by increasing the temperature.
The adaptor may have one or more regions for indexing such that different oligonucleotides can be attached to different samples, thereby allowing sample pooling. The adaptor may have one or more modifications which allow site specific strand cleavage. The adaptor may have one or more uracil bases, thereby allowing site specific cleavage using enzyme treatment.
The locus specific primers or the adaptor may be attached to a solid surface, or may contain a modification allowing for subsequent immobilisation or capture. The adaptor attachment may be carried out on a solid support, or the joined products may be captured onto a surface after joining. The locus specific primers or the adaptors may carry a moiety for surface capture, for example a biotin moiety. Alternatively the attachment may be covalent. The primers may be immobilised on a solid support, and used to capture the single stranded mono-adapted oligonucleotide fragments. Alternatively the adaptors may be immobilised on a solid support, and used to capture the fragments as the first stage of the process.
The locus specific primers or the adaptor may be DNA, RNA or a mixture thereof. Where the adaptor contains two strands, one strand may be DNA and one strand may be RNA.
An aspect of the invention described herein provides a method of joining two oligonucleotides using a template independent nucleic acid polymerase enzyme such as terminal deoxynucleotidyl transferase (TdT). Terminal deoxynucleotidyl transferase (TdT), also known as DNA nucleotidylexotransferase (DNTT) or terminal transferase, is a specialized DNA polymerase which catalyses the addition of nucleotides to the 3' terminus of a DNA molecule. Unlike most DNA polymerases, it does not require a template. TdT typically adds nucleotide 5'-triphosphates onto the 3'-hydroxyl of a single stranded first oligonucleotide sequence. The invention as described herein uses a second oligonucleotide carrying a 5'-triphosphate which can be attached to the first oligonucleotide sequence, thus enabling two single stranded oligonucleotides to be joined together, catalysed by TdT. Thus the enzyme can be used to link two oligonucleotide strands, rather than simply adding individual nucleotides.
Polyadenylate polymerase (PAP) is an enzyme involved in the formation of the polyadenylate tail of the 3' end of mRNA. PAP uses adenosine triphosphate (ATP) to add adenosine nucleotides to the 3' end of an RNA strand. The enzyme works in a template independent manner. A further aspect of the invention involves the use of PAP to join two oligonucleotides. In the use of PAP, one or more of the oligonucleotides may be RNA rather than DNA. Similarly poly(U) polymerase catalyzes the template independent addition of UMP from UTP or AMP from ATP to the 3' end of RNA.
The term template independent nucleic acid polymerase enzyme includes any polymerase which acts without requiring a nucleic acid template. The term template independent nucleic acid polymerase enzyme includes terminal deoxynucleotidyl transferase (TdT), Polyadenylate polymerase (PAP) and poly(U) polymerase (PUP). The template independent nucleic acid polymerase enzyme can be PAP. The template independent nucleic acid polymerase enzyme can be terminal transferase. The oligonucleotides joined can be RNA or DNA, or a combination of both RNA and DNA. The oligonucleotides can contain one or more modified backbone residues, modified sugar residues or modified nucleotide bases.
Template independent nucleic acid polymerase enzymes may be sensitive to the bulk of substituents attached to the ribose 3'-position. The standard substrates for these enzymes are nucleotide triphosphates in which the ribose 3'-position is a hydroxyl group. In order to increase the tolerance of the enzyme for sterically larger substituents at this position, the enzyme may be engineered using suitable amino acid substitutions to accommodate any increase in steric bulk. The term template independent nucleic acid polymerase enzyme therefore includes non-naturally occurring (engineered) enzymes. The term template independent nucleic acid polymerase enzyme includes modified versions of terminal transferase or PAP. Terminal transferase, PUP or PAP may be obtained from commercial sources (e.g. New England Biolabs).
The method described herein adds a single stranded oligonucleotide with a 3'-hydroxyl to a single stranded oligonucleotide with a 5'-triphosphate moiety. The triphosphate moiety can be attached directly to the 5'-hydroxyl of the second oligonucleotide. In such cases the 5'-oligonucleotide triphosphate can react directly with the 3'-hydroxyl group of the first oligonucleotide to form a single stranded oligonucleotide containing the first and second sequences linked together via a standard 'natural' phosphomonoester moiety. Such an oligonucleotide can be copied using a polymerase as there are no unnatural linking groups between the first and second oligonucleotides. The use of engineered template independent polymerase enzymes may increase the tolerance for steric bulk at the 3'-position of the triphosphate nucleotide, and hence allow the use of oligonucleotide strands attached directly to the 3'-hydroxyl of a nucleotide triphosphate. Alternatively the triphosphate can be attached through a linker moiety. Linker moieties can be any functionality attached to the terminal 5'-hydroxyl of the oligonucleotide strand. The linker moiety can include one or more phosphate groups. The linker may contain a ribose or deoxyribose moiety. The linker may contain one or more further nucleotides. The nucleotides, or the ribose or deoxyribose moieties, may be further substituted. The linker may contain a ribose or deoxyribose moiety in which the oligonucleotide is attached to the 2'-position of the ribose. The linker may contain a nucleotide in which the remainder of the oligonucleotide is attached via the nucleotide base.
Where the generic description 'linker' is used, the linker may employ one or more carbon, oxygen, nitrogen or phosphorus atoms. The linker acts merely to attach the functional triphosphate moiety to the remainder of the oligonucleotide.
The joined oligonucleotides may be copied using a nucleic acid polymerase. The linker should permit a nucleotide polymerase to bridge through the linker in order to copy the strands after joining. The action of the polymerase may be enhanced by using a hybridised primer which can bridge across the linker region. The primer can be designed with a suitable length of sequence to space across the linker region. The sequence can be degenerate/random or simply be a suitable length of known sequence in order to bridge across any gap caused by the linker region.
The length of sequence used to bridge the gap can be designed depending on the choice of linker. The sequence can be used as a tag for individual fragments. The tag can be used to assess the level of bias introduced by any amplification reactions. If the tags are, say, 6-mers of random sequence, there are 4^6 (4096) different tag sequences. From a population of fragments from a biological sample, it is highly unlikely that two fragments of the same 'biological' sequence will be joined to a tag with the same 'tag' sequence. Therefore any examples where the fragment and tag are over-represented in the sequencing reaction occur because the particular individual fragment is over-amplified during the PCR reaction when compared to other fragments in the population. Thus 'tags' of variable sequence can be used to help normalise the effects of amplification variability.
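The tag-based normalisation described above can be illustrated with a minimal Python sketch. It assumes each sequencing read has already been reduced to a (fragment, tag) pair; the function name `collapse_by_tag` and the example reads are hypothetical and are not part of the disclosed method.

```python
from collections import Counter

TAG_LENGTH = 6
N_TAGS = 4 ** TAG_LENGTH  # number of distinct random 6-mer tags


def collapse_by_tag(reads):
    """Summarise reads given as (fragment, tag) pairs.

    Returns two counters:
      copies  - raw read count for each (fragment, tag) pair
      unique  - number of distinct tags seen for each fragment
    A pair with a high raw count but a single tag suggests over-amplification
    of one original molecule rather than many independent molecules.
    """
    copies = Counter(reads)
    unique = Counter(fragment for fragment, _tag in copies)
    return copies, unique


# Hypothetical reads: fragA with one tag was amplified five-fold.
reads = [("fragA", "ACGTAC")] * 5 + [("fragA", "TTGACA")] + [("fragB", "GGCATA")] * 2
copies, unique = collapse_by_tag(reads)
```

In this sketch `unique` counts original molecules, whereas `copies` exposes which individual tagged fragment was over-represented after PCR.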
The tags can also be used to help identify sequences from different sources. If adaptors are used with different sequences for different sources of biological materials, then the different sources can be pooled but still identified via the tag when the tags are sequenced. Thus the disclosure herein includes the use of two or more different populations of adaptors for the multiplexing of the analysis of different samples. Disclosed herein therefore are kits containing two or more adaptors of different sequence.
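As an illustration of pooling and later separating samples by tag, the following Python sketch assigns reads to samples. It assumes, purely for illustration, that the tag occupies the first six bases of each read; the tag sequences, sample names and the function `demultiplex` are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping of adaptor tag sequence to sample of origin.
SAMPLE_TAGS = {"ACGTAC": "sample_1", "TTGACA": "sample_2"}


def demultiplex(reads, tag_length=6):
    """Group reads by sample, assuming the tag is the first tag_length bases."""
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        # Reads whose tag is not recognised are kept apart rather than discarded.
        bins[SAMPLE_TAGS.get(tag, "unassigned")].append(insert)
    return dict(bins)


pooled = ["ACGTACGGGTTT", "TTGACAAAACCC", "ACGTACTTTGGG", "NNNNNNACGT"]
by_sample = demultiplex(pooled)
```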
The oligonucleotide with the 5'-triphosphate may be blocked at the 3' end to prevent self joining. The blocking moiety may be a phosphate group or a similar moiety. Alternatively the 3' end may be a dideoxy nucleotide with no 3'-OH group. In order to allow subsequent extension of the 3'-end if desired, any blocking group can be removed. The methods can include the step of treatment with a suitable kinase to remove a 3'-phosphate moiety. The kinase may be PNK or any suitable kinase.
The oligonucleotide with the 5'-triphosphate may be produced chemically or enzymatically. A suitable nucleotide 5'-triphosphate may be chemically coupled to a suitable oligonucleotide using suitable chemical couplings. For example, as shown in the examples, the nucleotide triphosphate may contain an azido (N3) group and the oligonucleotide may contain an alkyne group such as DBCO. Alternatively a suitable oligonucleotide monophosphate may be turned into a triphosphate either chemically or enzymatically.
The sequence of the 5'-triphosphate adaptor oligonucleotide depends on the specific application and suitable adaptor oligonucleotides may be designed using known techniques. A suitable adaptor oligonucleotide may, for example, consist of 20 to 100 nucleotides. The sequence of the adaptor may be selected to be complementary to a suitable amplification/extension primer. The adaptor may contain a region of monobase sequence such as for example poly A or poly T.
The adaptor, or oligonucleotide 5'-triphosphate adaptor, may be single stranded or double stranded or a combination thereof. The double stranded adaptor has at least one overhanging single stranded region, and may have two or three overhanging single stranded regions. The overhang serves to hybridise to the end of the extended locus primers to which the adaptor is to be attached, and acts as a site which can undergo polymerase extension to make the attached single stranded extended locus primers double stranded. The adaptors can be 'forked' adaptors having regions which are non-complementary as well as regions which are complementary. Attachment of the 5'-triphosphate oligonucleotide may give rise to a join which is not a natural phosphodiester linkage. Such joins may not be substrates for nucleic acid polymerases. In such cases, the use of 3'-overhangs, either as hairpins or double stranded adaptors, is advantageous as the linking region can be 'bridged' using an oligonucleotide primer sequence which is internal or part of the adaptor. Hybridisation of a primer suitable for extension would also require such an internal spacer, and this lowers the affinity and specificity of the primer hybridisation, whereas no such issues arise where the adaptor has an 'internal' primer which is already hybridised (or in the case of hairpins integral). The attachment of a single 'hairpin' which can be used as both the known end and the extendable primer when preparing a library is therefore advantageous over the attachment of a single known end followed by the hybridisation of a second primer. The pre-formed, or intramolecular, hybridisation spans the unnatural join, and allows efficient extension.
The oligonucleotide adaptor, or oligonucleotide 5'-triphosphate adaptor, may be single stranded in portions, and have a region of self-complementarity such that the second oligonucleotide may take the form of a hairpin. The adaptor may take the form of a hairpin having a single stranded region and a region of self-complementary double stranded sequence. The hairpin may have a 3'-overhang suitable for polymerase extension. The overhang may stretch across the triphosphate 'linker' region at the 5' end, thus avoiding any issues relating to the presence of the 5'-modification required for TdT incorporation. The self-complementary double stranded portion may be from 5-20 base pairs in length. The overhang may be from 1-10 bases in length. The overhang may contain one or more degenerate bases. The sequence may contain a mixture of bases A, C and T at each position (symbolised as H (not G)). H may be used in cases where the sample is bisulfite treated, and thus does not contain any C bases to which the G would be complementary. The overhang may consist of 1-10 H bases. The overhang may be 2-8 bases, which may be H. The overhang may have a 3'-phosphate. The overhang may have a 3'-OH. The overhang could be a known, standard sequence.
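To make the degenerate 'H' overhang concrete, the sketch below expands an overhang written in IUPAC notation into the individual sequences it represents. It is a minimal illustration only; the function name `expand_degenerate` is hypothetical, and only the symbols needed here (A, C, G, T and H) are included.

```python
from itertools import product

# Subset of the IUPAC nucleotide code: H means A, C or T (not G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "H": "ACT"}


def expand_degenerate(overhang):
    """Return every concrete sequence encoded by a degenerate overhang."""
    choices = (IUPAC[base] for base in overhang.upper())
    return ["".join(combo) for combo in product(*choices)]


two_h = expand_degenerate("HH")  # 3 options per position, so 3 * 3 sequences
```

An overhang of n H bases therefore corresponds to a pool of 3^n sequences, none of which contains G.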
The oligonucleotide adaptor may have one or more regions for indexing such that different oligonucleotides can be attached to different samples, thereby allowing sample pooling. Alternatively the region for indexing may be located as part of the locus primer oligonucleotides. The adaptor oligonucleotide may have one or more modifications which allow site specific strand cleavage. The adaptor oligonucleotide may have one or more uracil bases, thereby allowing site specific cleavage using enzyme treatment with UDG and an endonuclease. The term adaptor fragment is used to refer to the portion of the adaptor remaining after the cleavage step. If the specific cleavage site is at the end of the adaptor, the fragment remaining may be only one base shorter than the original adaptor. Alternatively the cleavage site may be more central such that a portion of the adaptor remains on the original fragment strand after cleavage, as well as a portion of the adaptor being attached to the desired copy of the fragment.
Copies of fragments may be produced by extending the 3'-end of the attached double stranded adaptor or hairpin. Where the adaptor is a hairpin, the extension of the hairpin produces an extended hairpin. The extended hairpin can also be described as a double stranded nucleic acid having one end joined. Upon denaturation, the extended hairpin becomes a single stranded molecule, but the length of the double stranded portion (for example at least 100 base pairs) means that the sample rapidly hybridises to form the extended hairpin.
Alternative methods of attaching adaptors include the use of ligases. In order to use ligases, the sample may be left double stranded. The sample may be treated to give an overhanging base complementary to an overhang on an adaptor. Thus the blunt ended duplex may be treated to add for example a single 3' nucleotide which can then act as an end complementary to an adaptor. Thus the ligation may be blunt ended or cohesive.
Where blunt ends are used, the method needs to be chosen to avoid the formation of adaptor-adaptor ligation. In cases where the samples are bisulfite treated, the ends of the strands are phosphates, and may thus be amenable to adaptors having both 3' and 5' phosphates. For example a hairpin adaptor having a 3' phosphate and a 5' phosphate could ligate to the 3'-hydroxyl of a double stranded nucleic acid, but not the complementary 5' phosphate. The adaptor, having no free hydroxyl groups, could not ligate to itself. Thus the desired product can be formed where the extended strand is attached to the adaptor, but the strand derived from the biological sample (having a 5'-phosphate) remains unligated. After attachment of the hairpin, the 3' end of the hairpin may require deblocking. If the hairpin contains, for example, a 3' phosphate, the 3' phosphate can be removed (for example using a kinase such as PNK) and the released 3'-OH can be used in an extension to copy the fragment. Thus a double stranded product linked at one end via the adaptor hairpin is produced.
The method may be used in order to prepare samples for nucleic acid sequencing. The method may be used to sequence a population of synthetic oligonucleotides, for example for the purposes of quality control. Alternatively, the first oligonucleotides may come from a population of nucleic acid molecules from a biological sample. The population may be fragments of between 100-10000 nucleotides in length. The fragments may be 200-1000 nucleotides in length. The fragments may be of random variable sequence. The order of bases in the sequence may be known, unknown, or partly known. The fragments may come from treating a biological sample to obtain fragments of shorter length than exist in the naturally occurring sample. The fragments may come from a random cleavage of longer strands. The fragments may be derived from treating a nucleic acid sample with a chemical reagent (for example sodium bisulfite, acid or alkali, chemical denaturants such as formamide and urea) or enzyme (for example with a restriction endonuclease or other nuclease). The fragments may come from a treatment step that causes double stranded molecules to become single stranded, for example heating.
Methods of the invention may be useful in preparing a targeted population of nucleic acid strands for sequencing, for example a population of bisulfite-treated single-stranded nucleic acid fragments. Bisulfite treatment produces single-stranded nucleic acid fragments, typically of about 250-1000 nucleotides in length. The sample may be fragmented using treatment with bisulfite by incubation with bisulfite ions (HSO3-) or metabisulfite ions (S2O5 2-). The use of bisulfite ions or metabisulfite ions to convert unmethylated cytosines in nucleic acids into uracil is standard in the art and suitable reagents and conditions are well known. Numerous suitable protocols and reagents are also commercially available (for example, EpiTect™, Qiagen NL; EZ DNA Methylation™, Zymo Research Corp CA; CpGenome Turbo Bisulfite Modification Kit, Millipore; TrueMethyl™, Cambridge Epigenetix, UK). Bisulfite treatment converts cytosine and 5-formylcytosine residues in nucleic acid strands into uracil. However, a small proportion of cytosine and 5-formylcytosine residues are eliminated by bisulfite treatment rather than converted to U, leading to the formation of abasic sites in the nucleic acid strands, which tends to cause strand cleavage.
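The effect of bisulfite conversion on sequence can be sketched in silico as follows. This is a simplified illustration, not a model of the chemistry: it assumes complete conversion of every unprotected cytosine to uracil, ignores strand cleavage at abasic sites, and treats positions listed in `protected_positions` (for example methylated cytosines) as unchanged. The function name and arguments are hypothetical.

```python
def bisulfite_convert(strand, protected_positions=frozenset()):
    """Return the strand with each unprotected C replaced by U.

    protected_positions holds 0-based indices of cytosines assumed to resist
    conversion (an assumption of this sketch, e.g. methylated cytosines).
    """
    converted = []
    for index, base in enumerate(strand.upper()):
        if base == "C" and index not in protected_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


treated = bisulfite_convert("ACGTCC", protected_positions={1})
```

After full conversion, the only C bases left in a strand are the protected ones, which is why an H (A, C or T) overhang can pair with copies of treated strands without needing G.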
In other embodiments, a population of DNA strands having one or more abasic sites may be produced, and thereby the sample fragmented, by subjecting a population of nucleic acid molecules to acid hydrolysis. The population may be subjected to acid hydrolysis by incubation at an acidic pH (for example, pH 5) and elevated temperature (for example, greater than 70 °C). A proportion of the purine bases in the nucleic acid strands will be lost, to generate abasic sites. The number of abasic sites formed depends on the pH, concentration of buffer, temperature and length of incubation.
In other embodiments, a population of DNA strands having one or more abasic sites may be produced by treating the population of nucleic acid strands with uracil-DNA glycosylase (UDG). The population may be treated with UDG by incubation with UDG at 37°C. UDG excises uracil residues in the nucleic acid strands leaving abasic sites. UDG may be obtained from commercial sources.
Any sample containing uracil bases, for example the hairpin adaptors or extension products derived therefrom may be cleaved using uracil-DNA glycosylase (UDG). The population may be treated with UDG by incubation with UDG at 37 °C. UDG excises uracil residues in the nucleic acid strands leaving abasic sites. UDG may be obtained from commercial sources. Abasic sites can be cleaved using an endonuclease mixture. Mixtures of enzymes suitable for cleaving nucleic acid strands containing uracil bases are commercially available and standard in the art.
The population of nucleic acid molecules may be a sample of DNA or RNA, for example a genomic DNA sample. Suitable DNA and RNA samples may be obtained or isolated from a sample of cells, for example, mammalian cells such as human cells or tissue samples, such as biopsies. In some embodiments, the sample may be obtained from a formalin fixed paraffin embedded (FFPE) tissue sample. Suitable cells include somatic and germ-line cells. The targeting methods described herein are particularly advantageous where the amount of sample is limited. The sample may be an ancient nucleic acid sample. The sample may be an isolate from a cell free nucleic acid such as circulating cell free DNA. The sample may be a maternal sample in which the presence of fetal nucleic acids is detected. The sample may be analysed to detect the presence of circulating tumour cells. The sample may be derived from blood or another biological source such as cerebrospinal fluid.
The method may be advantageous in the simultaneous analysis of both DNA molecules and RNA molecules from a sample. The addition of the adaptor to a sample of single stranded DNA means that adaptors can be added to both DNA and RNA whilst the two species remain in the same sample. An adaptor can be added to the end of both RNA and DNA in a mix of the two nucleic acids (for example in a cell extract where neither the RNA nor the DNA has been purified away from its partner). Such a method allows mapping of the epigenome alongside the transcriptome, and a method that allows this to be done in parallel in the same reaction would be advantageous. The adaptors can be of partially different sequence such that the identity of the molecule can be determined as being DNA or RNA. For example, where an RNA adaptor is labelled with index Y and a DNA adaptor with index Z, and both are added in a mix with TdT and PAP, the DNA adaptor with index Z adds specifically to the DNA fragments and the RNA adaptor with index Y adds specifically to the RNA fragments at the same time in a single reaction. Thus the method of the invention allows the adaptation and selection of both DNA and RNA in a single composition.
The population may be a diverse population of nucleic acid molecules, for example a library, such as a whole genome library or a loci specific library.
Methods of the invention may be useful in producing populations of mono-adapted single stranded nucleic acid fragments, i.e. nucleic acid strands having an adaptor oligonucleotide attached to their 3' termini. In some embodiments, populations of 3' adapted single stranded nucleic acid fragments may be used directly for sequencing and/or amplification.
The sequence of the adaptor oligonucleotide may be entirely known, or may include a variable region. The sequence may include a universal sequence such that each joined sequence has a common 'adaptor' sequence attached to one end. The attachment of an adaptor or a fragment thereof to one end of a pool of fragments of variable sequence means that copies of the variable sequences can be produced using a single 'extension' primer.
If a single molecule sequencing technique is employed, the targeted fragments may be analysed by hybridising the fragments onto a solid support carrying an array of primers complementary to the adaptor oligonucleotide sequence. Alternatively the joined fragments may have an adaptor modification at the 3'-end which allows attachment to a solid support.
The methods disclosed may further include the step of producing one or more copies of the locus primer extended oligonucleotides. The methods may include producing multiple copies of each of the targeted sequences. The copies may be made by hybridising a primer sequence opposite a universal sequence on the oligonucleotide adaptor fragment sequence, and using a nucleic acid polymerase to synthesise a complementary copy of the first single stranded sequences. The production of the complementary copy provides a double stranded polynucleotide.
The hairpin can contain a cleavage site (for example a uracil nucleotide). Upon cleavage, the hairpin becomes two strands rather than one, and is no longer a hairpin. Thus the sample becomes an adapted known end (from the adaptor fragment), and unknown region from the sample of interest (the sequence of which can be determined). The fragments are selected using the locus specific primers, which can contain a universal sequence. After extension of the locus specific primers, the double stranded polynucleotides can be amplified, either using further copies of the locus primers, or using primers complementary to both known ends.
Double stranded polynucleotides may be made circular by attaching the ends together. This may be useful in the generation of circular nucleic acid constructs and plasmids or in the preparation of samples for sequencing using platforms that employ circular templates (e.g. PacBio SMRT sequencing). In some embodiments, populations of circularised 3' adapted nucleic acid fragments produced as described herein may be denatured and subjected to rolling circle or whole genome amplification. Amplification of circular fragments can be carried out using primers complementary to two regions of the single adaptor sequence.
A second adaptor may be attached to a product after a second extension. The second adaptor may comprise a self-complementary double stranded region (i.e. a hairpin).
Described herein are kits and components for carrying out the invention. Disclosed is a kit for use in preparing a nucleic acid sample, the kit comprising a hairpin adaptor polynucleotide having a triphosphate moiety at the 5'-end, a population of locus specific primers and a single primer complementary to a portion of the hairpin adaptor or a copy thereof. The kit may contain a nucleotide 5'-triphosphate adaptor having any of the features described herein.
Disclosed herein are kits containing two or more oligonucleotide adaptors of different sequence, each having a nucleotide 5'-triphosphate. The two or more different sequences may include a fixed sequence capable of hybridising to an extension primer, and a variable sequence which acts as a tag to identify the adaptor (and hence the identity of the sample to which the adaptors are attached). The kit may be suitable for labelling both DNA and RNA fragments. The kit may contain two template independent polymerases. The kit may contain both terminal transferase and polyadenylate polymerase (PAP) or polyU polymerase (PUP).
Disclosed herein is a population of locus specific nucleic acid primers wherein each member of the population contains a common universal sequence and one of a plurality of locus-specific regions wherein each locus specific region contains only the nucleic acid bases A, G and T such that the locus is complementary to copies of a sample treated with bisulfite. Such primers may be included in a kit with a polynucleotide having a triphosphate moiety at the 5'-end. The primers may comprise 50 or more locus specific sequences in the population.
The products of any method step described herein can undergo parallel sequencing on a solid support. In such cases the attachment of universal adaptors to each end may be beneficial in the amplification of the population of fragments. Suitable sequencing methods are well known in the art, and include Illumina sequencing, pyrosequencing (for example 454 sequencing) or Ion Torrent sequencing (from Life Technologies™).
Populations of nucleic acid molecules with a 3' adaptor oligonucleotide and a 5' adaptor oligonucleotide may be sequenced directly. For example, the sequences of the first and second adaptor oligonucleotides may be specific for a sequencing platform. For example, they may be complementary to the flowcell or device on which sequencing is to be performed. This may allow the sequencing of the population of nucleic acid fragments without the need for further amplification and/or adaptation.
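The base-composition constraint on the locus specific regions can be checked programmatically. The sketch below assembles a primer from a universal sequence and a locus specific region and rejects any region containing a base other than A, G or T. The universal sequence shown, the placement of the universal sequence at the 5' side, and the function name `make_primer` are assumptions made for illustration only.

```python
ALLOWED_LOCUS_BASES = set("AGT")

# Hypothetical universal sequence; not taken from the disclosure.
UNIVERSAL = "GTGACTGGAGTTCAGACG"


def make_primer(locus_region, universal=UNIVERSAL):
    """Join universal and locus specific parts, enforcing the A/G/T-only rule."""
    region = locus_region.upper()
    disallowed = set(region) - ALLOWED_LOCUS_BASES
    if disallowed:
        raise ValueError(f"locus region contains disallowed bases: {sorted(disallowed)}")
    # Assumption: the universal sequence sits 5' of the locus specific region.
    return universal + region


primer = make_primer("agttaggat")
```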
The first and second adaptor sequences are different. Preferably, the adaptor sequences are not found within the human genome, or other sample genome of interest. The nucleic acid strands in the population to be sequenced may have the same first adaptor sequence at their 3' ends and the same second adaptor sequence at their 5' ends i.e. all of the fragments in the population may be flanked by the same pair of adaptor sequences.
Suitable adaptor oligonucleotides for the production of nucleic acid strands for sequencing may include a region that is complementary to the universal primers on the solid support (e.g. a flowcell or bead) and a region that is complementary to universal sequencing primers (i.e. which when annealed to the adaptor oligonucleotide and extended allows the sequence of the nucleic acid molecule to be read). Suitable nucleotide sequences for these interactions are well known in the art and depend on the sequencing platform to be employed. Suitable sequencing platforms include Illumina TruSeq, LifeTech IonTorrent, Roche 454 and PacBio RS.
For example, the sequences of the first and second adaptor oligonucleotides may comprise a sequence that hybridises to complementary primers immobilised on the solid support (e.g. 20-30 nucleotides); a sequence that hybridises to a sequencing primer (e.g. 30-40 nucleotides) and a unique index sequence (e.g. 6-10 nucleotides). Suitable first and second adaptor oligonucleotides may be 56-80 nucleotides in length. In the case where the second adaptor is a hairpin, the adaptor may be for example 5-20 bases of a first complementary sequence, a single stranded loop comprising a sequence that hybridises to the solid support and the sequencing primer (e.g. 50-70 nucleotides), optionally a unique index sequence (e.g. 6-10 nucleotides) and optionally one or more locations such as uracil for site specific cleavage, a second complementary sequence complementary to the first complementary sequence and optionally a 3' overhang (e.g. 1-10 bases). Thus the hairpin constructs may be 60 to 100 nucleotides or more in length.
Following adaptation and/or labelling as described herein, the nucleic acid molecules may be purified by any convenient technique. Following preparation, the population of nucleic acid molecules may be provided in a suitable form for further treatment as described herein. For example, the population of nucleic acid molecules may be in aqueous solution in the absence of buffers before treatment as described herein. In other embodiments, populations of nucleic acid molecules with a 3' adaptor oligonucleotide and optionally a 5' adaptor oligonucleotide may be further adapted and/or amplified as required, for example for a specific application or sequencing platform.
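The 56-80 nucleotide range quoted for the linear adaptors follows from adding the example ranges of the three components. A short Python check, with hypothetical component names, is:

```python
# Example length ranges (minimum, maximum) in nucleotides, as quoted in the text.
ADAPTOR_PARTS = {
    "support_binding": (20, 30),
    "sequencing_primer_site": (30, 40),
    "index": (6, 10),
}


def total_length_range(parts):
    """Sum the minimum and maximum lengths over all adaptor components."""
    shortest = sum(low for low, _high in parts.values())
    longest = sum(high for _low, high in parts.values())
    return shortest, longest


adaptor_range = total_length_range(ADAPTOR_PARTS)
```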
Preferably, the nucleic acid strands in the population may have the same first adaptor sequence at their 3' ends and the same second adaptor sequence at their 5' ends i.e. all of the fragments in the population may be flanked by the same pair of adaptors, as described above. This allows the same pair of amplification primers to amplify all of the strands in the population and avoids the need for multiplex amplification reactions using complex sets of primer pairs, which are susceptible to mis-priming and the amplification of artefacts.
Suitable first and second amplification primers may be 20-25 nucleotides in length and may be designed and synthesised using standard techniques. For example, a first amplification primer may hybridise to the first adaptor sequence i.e. the first amplification primer may comprise a nucleotide sequence complementary to the first adaptor oligonucleotide; and a second amplification primer may hybridise to the complement of the second adaptor sequence i.e. the second amplification primer may comprise the nucleotide sequence of the second adaptor oligonucleotide or of the universal sequence on the locus specific primers. Alternatively, a first amplification primer may hybridise to the complement of the first adaptor sequence i.e. the first amplification primer may comprise a nucleotide sequence of the first adaptor oligonucleotide; and a second amplification primer may hybridise to the second adaptor sequence i.e. the second amplification primer may comprise the nucleotide sequence of the second adaptor oligonucleotide or the universal sequence on the locus specific primers.
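The relationship between an adaptor and a primer that hybridises to it can be sketched with a reverse-complement helper. A primer that hybridises to the adaptor itself is the reverse complement of (part of) the adaptor, whereas a primer that hybridises to the complement of the adaptor has the adaptor's own sequence. The function names and the example adaptor fragment below are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def primer_for_adaptor(adaptor, length=20):
    """Primer hybridising to the 3' portion of the adaptor strand itself."""
    return reverse_complement(adaptor[-length:])


def primer_for_adaptor_complement(adaptor, length=20):
    """Primer hybridising to the complement of the adaptor (same sequence as adaptor)."""
    return adaptor[:length].upper()


example = reverse_complement("GTGACT")
```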
In some embodiments, the first and second amplification primers may incorporate additional sequences.
Additional sequences may include index sequences to allow identification of the amplification products during multiplex sequencing, or further adaptor sequences to allow sequencing of the strands using a specific sequencing platform.
Description of Figures
Figure 1 is an IGV screenshot showing read alignment and read coverage from a 24 kb region of E. coli (coords 1,907,892-1,932,808) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions. The bottom track shown in blue indicates the expected loci primer annealing positions.
Figures 2 and 3 are IGV screenshots showing read alignment and read coverage from a 24 kb region of E. coli (coords 1,907,892-1,932,808) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions. The bottom track shown in blue indicates the expected loci primer annealing positions.
Figure 4 shows a schematic of the method. An optional amplification can be carried out using a second adaptor prior to the selection using the locus specific primers. Methods with and without the optional amplification are shown.
Figures 5 and 6 are IGV screenshots showing read alignment and read coverage from two regions of the human genome generated using the method of example 3 (GAPDH region: Chr12:6,53,000-6,541,000 and NANOG region: Chr12:7,786,763-7,800,358) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions. The bottom track shown in blue indicates the expected loci primer annealing positions.
Figures 7 and 8 are IGV screenshots showing read alignment and read coverage from two regions of the human genome generated using the method of example 4 (EGFR region: Chr7:55,130,000-55,133,987 and CLEC12A region: Chr12:9,969,868-9,972,145) clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions. The bottom track shown in blue indicates the expected loci primer annealing positions.
Experiments
EXAMPLE 1 : Preparation of loci targeting-extension libraries from native E.coli genomic DNA using post-hybridisation capture and cleanup prior to amplification
Step 1 : Probe design
A pool of 2000 ssDNA probes was designed to be complementary to the top and bottom strands of the native E.coli genome (MG1655). Probes were 64-84 nucleotides long, with 30-50 nucleotides being complementary to target loci distributed randomly in the E.coli genome. All probes had a Tm of 55 °C ± 5 °C and were labelled with biotin at the 5' end. The final probes include a 5' adaptor sequence to enable compatibility with Illumina sequencing technology (5'-GTGACTGGAGTTCAGACGTGTGCTCTTCCTATCT).
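The probe dimensions in Step 1 are internally consistent: the 34-nucleotide 5' adaptor plus a 30-50 nucleotide locus-complementary region gives the stated 64-84 nucleotides. The Python sketch below assembles a probe in this way and enforces the length limits. The function name `build_probe` and the placeholder locus regions are hypothetical; Tm selection and biotin labelling are not modelled.

```python
# 5' adaptor sequence quoted in Step 1.
ADAPTOR_5P = "GTGACTGGAGTTCAGACGTGTGCTCTTCCTATCT"

MIN_LOCUS, MAX_LOCUS = 30, 50  # length limits for the locus-complementary region


def build_probe(locus_region):
    """Return adaptor + locus region, checking the locus length limits."""
    region = locus_region.upper()
    if not MIN_LOCUS <= len(region) <= MAX_LOCUS:
        raise ValueError(
            f"locus region must be {MIN_LOCUS}-{MAX_LOCUS} nt, got {len(region)}"
        )
    return ADAPTOR_5P + region


shortest_probe = build_probe("A" * 30)  # placeholder locus sequence
longest_probe = build_probe("A" * 50)   # placeholder locus sequence
```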
Step 2: DNA Fragmentation
Native E.coli genomic DNA (strain K-12 sub-strain MG1655) was sheared to the length of approximately 350 bp using Covaris E-220 sonicator following manufacturer's instructions.
Step 3 : End repair
Fragmented DNA (500-1000 ng) was heat denatured at 95 °C for 3 minutes, and snap cooled on ice. The DNA ends were repaired with 20 units of T4 Polynucleotide Kinase (PNK, Enzymatics Y9040L) in 20 µL of 1x Addition Buffer (100 mM Tris-acetate, 1.25 mM CoAc2, 0.125 mg/mL BSA, pH 6.6). The reaction was incubated at 37 °C for 20 minutes, and then heat denatured at 95 °C for 3 minutes.
Step 4: Incorporation of universal short hairpin adaptor
The triphosphate hairpin adaptor was prepared as follows: the universal short hairpin oligo (1250 pmol, Biomers GMBH) with a 5'-DBCO modification was reacted with an azido-3'-deoxyadenosine-5'-triphosphate (1000 pmol, JenaBiosciences) for 2 hours at 10 °C in 10 mM Tris-HCl (pH 7.0). The sequence of the universal short hairpin adaptor is listed in Table 1. An aliquot of this hairpin adaptor (40 pmol) and 20 units of Terminal Transferase (TdT, Enzymatics, P7070L) were then added to incorporate the hairpin adaptor at the 3'-end of the DNA (37 °C for 30 minutes). The reaction was stopped with 50 mM EDTA.
Table 1 : Universal Short Hairpin Adaptor sequence
Figure imgf000038_0001
Step 5 : Magnetic bead purification
The products obtained were SPRI bead purified with a 3:1 bead solution:DNA ratio (18 % PEG-8000, 1 M NaCl, 1 mM EDTA, 10 mM Tris-HCl (pH 8.0), and 0.1 % w/v Carboxy magnetic beads). Binding time was 10 minutes and washes were performed with 80 % ethanol. DNA was eluted from the beads with ultra pure water.
Step 6: Complementary strand extension
The tagged DNA eluted from step 5 was mixed with the PNK/Klenow cocktail (10 units PNK, 5 units Klenow exo- (P7010L)) and 30 nmol dNTP mix in 1x Blue buffer (50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT pH 7.9) to dephosphorylate the 3' end of the attached hairpin adaptor, and to synthesise the full length, complementary bottom strand by Klenow extension from the annealed hairpin adaptor (37 °C for 30 minutes, then 55 °C for 20 minutes). The extended products were further SPRI bead purified with a 2:1 bead solution:DNA ratio. DNA was eluted from the beads with ultra pure water.
Step 7: UDG digestion
The extended DNA was incubated with Thermolabile UDG (2.5 units, Enzymatics, G5020L) at 37 °C for 30 min in 1x Reaction Buffer (70 mM Tris-HCl, 10 mM NaCl, 1 mM EDTA, 0.1 mg/mL BSA pH 8.0 @25 °C) to remove the original strand and linearize the hairpin, and then heat denatured at 95 °C for 5 minutes. DNA was purified using 3x SPRI beads as described in step 5, eluted from the beads with 10 μL ultra pure water and quantified by Qubit ssDNA assay kit. The linearized DNA was used for hybridization capture.
Step 8: Probe hybridization
For hybridization, hybridization buffer (10x SSPE, 10x Denhardt's, 10 mM EDTA, and 0.2 % SDS) was pre-warmed at 48 °C in a water bath. Library pools containing ~500 ng DNA and blocking oligo (Table 2) were heated at 95 °C for 5 minutes and then held at 48 °C for 5 minutes in a thermal cycler before they were added to pre-warmed hybridization buffer, followed by addition of probe solution preheated to 48 °C for 2 minutes. All hybridization reactions were incubated at 48 °C for ~64 hours in a PCR block.
Table 2: Blocking oligo sequence
Oligo name Sequence
Blocking 5' AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC-PO4 3'
Figure imgf000040_0001
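To illustrate which sequence the blocking oligo of Table 2 would anneal to, its reverse complement can be computed and compared with the 3' portion of a NEBNext index primer as listed in Table 6. The following is a minimal sketch; the helper name is hypothetical, and the phosphorothioate mark (*) and 3' phosphate are stripped purely for the string comparison.

```python
# Sequences copied from Table 2 and Table 6; the helper name is an illustrative assumption.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


BLOCKING_OLIGO = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
# NEBNext Index 1 primer with the phosphorothioate mark removed.
INDEX_1_PRIMER = (
    "CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T"
).replace("*", "")

blocked_site = reverse_complement(BLOCKING_OLIGO)

if __name__ == "__main__":
    print(blocked_site)
    print(INDEX_1_PRIMER.endswith(blocked_site))
```

In this sketch the reverse complement of the blocking oligo coincides with the final 34 nucleotides of the index primer, i.e. the adaptor region shared by all of the index primers in Table 6.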
Step 9: Streptavidin beads purification
30 μL of Dynabeads M-280 Streptavidin (Life Technologies, 11205D) was washed three times according to the manufacturer's instructions with 1x BW buffer (5 mM Tris-HCl (pH 7.5), 0.5 mM EDTA, 1 M NaCl). Beads were resuspended in 200 μL 1x BW buffer, and hybridization reactions were added to the bead suspension and incubated for 30 min at RT with mixing. After binding, beads were washed twice at RT in 0.5 mL hybridization wash buffer I (1x SSC, 0.1 % SDS), followed by 3 wash steps at 48 °C in preheated 0.5 mL hybridization wash buffer II (0.1x SSC, 0.1 % SDS) and a final wash with H2O, before proceeding to the on-beads extension step.
Step 10: Post-capture on-beads extension
Beads from step 9 were resuspended in ultra pure H2O, and extension from the hybridised probe by Klenow (5 units Klenow exo-, Enzymatics) was carried out (37 °C for 30 minutes) in 1x Blue Buffer (50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT pH 7.9) in the presence of 30 nmol dNTP to synthesise the full length, complementary strand.
Step 11: Post-extension enrichment (IDX PCR)
The beads from step 10 were collected and washed 3 times with 1x BW and once with H2O, followed by enrichment PCR according to the conditions outlined in Table 3. PCR primers used were NEBNext Multiplex Oligos for Illumina (New England BioLabs, E7335L and E7500L); see Table 6 for sequences.
Table 3 : Post-extension PCR enrichment
PCR component Volume (μL)
Bead/DNA suspension 15.4
5x VeraSeq buffer 5
10 mM dNTP mix 1
10 μM Universal PCR primer 1.5
10 μM Indexed PCR primer 1.5
VeraSeq High Fidelity Polymerase (Enzymatics, P7511L) 0.6
TOTAL 25
Thermal cycling programme
Figure imgf000041_0001
Step 12: Next generation sequencing and data processing
Samples were pooled, denatured and loaded onto an Illumina MiSeq flowcell at 10 pM concentration. The samples were paired-end sequenced (2x 75 cycles) using MiSeq v3 SBS chemistry according to the manufacturer's standard protocol. Raw read fastq files were automatically demultiplexed into sample-specific bins by the MSC software. Each sample-specific fastq file was analysed in the following manner:
i) adaptor trimming of both Read 1 and Read 2 using cutadapt; ii) H6 trimming of the start of Read 1 to remove the degenerate priming site arising from the adaption of the fragment with the universal hairpin adaptor; iii) paired alignment of the trimmed reads to the E. coli K-12 MG1655 reference genome (accession number NC_000913.3) using the Bowtie2 aligner (options: bowtie2 -p 4 --score-min L,0,-0.2 --ignore-quals --no-mixed --no-discordant --maxins 500) to generate indexed binary alignment files (.bam and .bai files); and iv) post-processing analysis of targeted enrichment using the picard toolkit (Broad Institute) utilizing an interval file window size of 600 bp. Key alignment metrics from two successful exemplifications of the method, namely CEG28_25_6 and CEG28_25_8, are shown in Table 4. Visualisation of the binary alignment files was performed using either the SeqMonk analysis tool (www.bioinformatics.babraham.ac.uk/projects/seqmonk) or the IGV genome browser (www.broadinstitute.org/igv).
Table 4: Summary of targeting metrics
Figure imgf000042_0001
Observations:
The summary metrics given in Table 4 show that the method described generates high quality targeted enrichment data with very high specificity (>98 % on target reads, <2 % off target reads) and high levels of targetome enrichment (>119-fold enrichment over an un-enriched genomic sample). Figure 1 is an IGV screenshot showing read alignment and read coverage from a 24 kb region of E. coli (coords 1,907,892-1,932,808), clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions. The bottom track shown in blue indicates the expected loci primer annealing positions.
EXAMPLE 2: Preparation of loci targeting libraries from native E.coli genomic DNA without post-hybridisation capture and cleanup prior to PCR amplification
Step 1 : Probe design
A pool of 2000 ssDNA probes was designed to be complementary to the top and bottom strands of the native E. coli genome (MG1655). Probes were 64-84 nucleotides long, with 30-50 nucleotides complementary to target loci distributed randomly across the E. coli genome. All probes had a Tm of 55 °C ± 5 °C and were labelled with biotin at the 5' end. The final probes include a 5' adaptor sequence to enable compatibility with Illumina sequencing technology (5'-GTGACTGGAGTTCAGACGTGTGCTCTTCCTATCT).
Step 2: DNA fragmentation
Native E. coli genomic DNA (strain K-12 sub-strain MG1655) was sheared to a length of approximately 350 bp using a Covaris E-220 sonicator following the manufacturer's instructions.
Step 3 : End repair
Fragmented DNA (500 - 1000 ng) was heat denatured at 95 °C for 3 minutes and snap cooled on ice. The DNA ends were repaired with 20 units of T4 Polynucleotide Kinase (PNK, Enzymatics Y9040L) in 20 μL of 1x Addition Buffer (100 mM Tris-acetate, 1.25 mM CoAc2, 0.125 mg/mL BSA, pH 6.6). The reaction was incubated at 37 °C for 20 minutes, and then heat denatured at 95 °C for 3 minutes.
Step 4: Incorporation of universal short hairpin adaptor
The triphosphate hairpin adaptor was prepared as follows: the universal short hairpin oligo (1250 pmol, Biomers GMBH) with a 5'-DBCO modification was reacted with an azido-3'-deoxyadenosine-5'-triphosphate (1000 pmol, JenaBiosciences) for 2 hours at 10 °C in 10 mM Tris-HCl (pH 7.0). The sequence of the universal short hairpin adaptor is listed in Table 1. An aliquot of this hairpin adaptor (40 pmol) and 20 units of Terminal Transferase (TdT, Enzymatics, P7070L) were then added to incorporate the hairpin adaptor at the 3'-end of the DNA (37 °C for 30 minutes). The reaction was stopped with 50 mM EDTA.
Table 4: Universal Short Hairpin Adaptor sequence
Figure imgf000043_0001
Step 5 : Magnetic bead purification
The products obtained were SPRI bead purified with a 3:1 bead solution:DNA ratio (18 % PEG-8000, 1 M NaCl, 1 mM EDTA, 10 mM Tris-HCl (pH 8.0), and 0.1 % w/v Carboxy magnetic beads). Binding time was 10 minutes and washes were performed with 80 % ethanol. DNA was eluted from the beads with ultra pure water.
Step 6: Complementary strand extension
The tagged DNA eluted from step 5 was mixed with the PNK/Klenow cocktail (10 units PNK, 5 units Klenow exo- (P7010L)) and 30 nmol dNTP mix in 1x Blue buffer (50 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT pH 7.9) to dephosphorylate the 3' end of the attached hairpin adaptor, and to synthesise the full length, complementary bottom strand by Klenow extension from the annealed hairpin adaptor (37 °C for 30 minutes, then 55 °C for 20 minutes). The extended products were further SPRI bead purified with a 2:1 bead solution:DNA ratio. DNA was eluted from the beads with ultra pure water.
Step 7: UDG digestion
The extended DNA was incubated with Thermolabile UDG (2.5 units, Enzymatics, G5020L) at 37 °C for 30 min in 1x Reaction Buffer (70 mM Tris-HCl, 10 mM NaCl, 1 mM EDTA, 0.1 mg/mL BSA pH 8.0 @25 °C) to remove the original strand and linearize the hairpin, and then heat denatured at 95 °C for 5 minutes. DNA was purified using 3x SPRI beads as described in step 5, eluted from the beads with 10 μL ultra pure water and quantified by Qubit ssDNA assay kit. The linearized DNA was used for probe hybridization capture and linear amplification.
Step 8: Probe hybridization and linear extension
For each reaction (see Table 5 for details), 190 ng (0.127 fmol) of E. coli DNA and 2.16 pmol of probe pool were thermocycled on a BioRad C-100 Touch thermal cycler with KAPA2G Robust HotStart DNA polymerase in 1x KAPA2G Buffer A (KapaBiosystems, KK5518), supplemented with 250 μM of each dNTP. Linear extension conditions: initial denaturation at 95 °C for 1 min; 10 or 20 cycles of 95 °C for 15 sec, 48/55 °C for 5 min, and 72 °C for 30 sec; final extension at 72 °C for 1 min.
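The programmed hold time of the linear extension can be estimated from the step durations given above. The following is a minimal sketch that ignores ramping between temperatures (an assumption, since ramp rates are not stated); the function name and defaults are illustrative only.

```python
def linear_extension_seconds(cycles, initial=60, denature=15, anneal=300,
                             extend=30, final=60):
    """Sum of programmed hold times in seconds; ramp times are not included."""
    return initial + cycles * (denature + anneal + extend) + final


if __name__ == "__main__":
    for n in (10, 20):
        total = linear_extension_seconds(n)
        print(f"{n} cycles: {total} s ({total / 60:.1f} min)")
```

Under these assumptions the 10-cycle programme holds for about 1 hour and the 20-cycle programme for about 2 hours, with the 5 min annealing step accounting for most of each cycle.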
Step 9: PCR on extended probes
Fully extended probes were amplified using KAPA2G Robust HotStart DNA polymerase, in 1x KAPA2G Buffer A with 200 μM each dNTP. PCR primers used were NEBNext Multiplex Oligos for Illumina (500 nM of Indexed primers and 1 μM of Universal primer, New England Biolabs, E7335S, listed in Table 6).
Table 5 : Experimental conditions for processed samples
Figure imgf000045_0001
Table 6: Sequences of the NEBNext Multiplex PCR primers for Illumina
Oligo Sequence (5' to 3')
NEBNext Index 1 Primer for Illumina CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T
NEBNext Index 2 Primer for Illumina CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T
NEBNext Index 3 Primer for Illumina CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T
NEBNext Index 4 Primer for Illumina CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T
NEBNext Index 5 Primer for Illumina CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T
NEBNext Index 6 Primer for Illumina CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T
NEBNext Index 7 Primer for Illumina CAAGCAGAAGACGGCATACGAGATGATCTGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T
NEBNext Index 8 Primer for Illumina CAAGCAGAAGACGGCATACGAGATTCAAGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC*T
NEBNext Universal PCR Primer for Illumina AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC*T
* = backbone phosphorothioate modification
Step 10: Magnetic bead purification
The PCR products obtained were purified with 1x 18 % PEG Ampure XP beads according to the manufacturer's instructions. Samples were eluted from the beads in ultra pure water.
Step 11: Next generation sequencing and data processing
Samples were pooled, denatured and loaded onto an Illumina MiSeq flowcell at 10 pM concentration. The samples were paired-end sequenced (2x 75 cycles) using MiSeq v3 SBS chemistry according to the manufacturer's standard protocol. Raw read fastq files were automatically demultiplexed into sample-specific bins by the MSC software. Each sample-specific fastq file was analysed in the following manner:
i) adaptor trimming of both Read 1 and Read 2 using cutadapt; ii) H6 trimming of the start of Read 1 to remove the degenerate priming site arising from the adaption of the fragment with the universal hairpin adaptor; iii) paired alignment of the trimmed reads to the E. coli K-12 MG1655 reference genome (accession number NC_000913.3) using the Bowtie2 aligner (options: bowtie2 -p 4 --score-min L,0,-0.2 --ignore-quals --no-mixed --no-discordant --maxins 500) to generate indexed binary alignment files (.bam and .bai files); and iv) post-processing analysis of targeted enrichment using the picard toolkit (Broad Institute) utilizing an interval file window size of 600 bp. Key alignment metrics from successful exemplifications of the method, namely CEG27_75_1 to CEG27_75_8, are shown in Table 6. Visualisation of the binary alignment files was performed using either the SeqMonk analysis tool (www.bioinformatics.babraham.ac.uk/projects/seqmonk) or the IGV genome browser (www.broadinstitute.org/igv).
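The alignment in item iii) can be scripted by assembling the Bowtie2 argument list. The following is a minimal sketch: the option values are those quoted above, whereas the index prefix, fastq file names and output name are hypothetical placeholders, and the subsequent conversion to sorted, indexed .bam files is not shown.

```python
import shlex


def bowtie2_command(index_prefix, read1, read2, sam_out, threads=4):
    """Build the paired-end Bowtie2 call using the options quoted in the text."""
    return [
        "bowtie2",
        "-p", str(threads),
        "--score-min", "L,0,-0.2",
        "--ignore-quals",
        "--no-mixed",
        "--no-discordant",
        "--maxins", "500",
        "-x", index_prefix,   # Bowtie2 index built from NC_000913.3 (assumed prefix)
        "-1", read1,          # trimmed Read 1 fastq
        "-2", read2,          # trimmed Read 2 fastq
        "-S", sam_out,        # SAM output
    ]


if __name__ == "__main__":
    # All file names below are illustrative placeholders.
    cmd = bowtie2_command("ecoli_mg1655", "sample_R1.trimmed.fastq",
                          "sample_R2.trimmed.fastq", "sample.sam")
    print(shlex.join(cmd))
```

Building the command as a list keeps each option and its value as separate arguments, so the same list could be passed to a process runner without shell quoting.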
Figure imgf000047_0001
Table 6: Summary of targeting metrics
Observations:
The summary metrics given in Table 6 show that the method described generates high quality targeted enrichment data with very high specificity (>92 % on target reads, <8 % off target reads) and high levels of targetome enrichment (>18-fold enrichment over an un-enriched genomic sample). Figures 2 and 3 are IGV screenshots showing read alignment and read coverage from a 24 kb region of E. coli (coords 1,907,892-1,932,808), clearly demonstrating the enriched regions (stacks of overlaying reads within a defined region) and the low level of non-specific noise in the intervening regions. The bottom track shown in blue indicates the expected loci primer annealing positions.
Example 3: Use of pre-amplification to increase the amount of available material.
Probe design and plex:
The panel of probes used is a high complexity pool, typically between 12,000 and 440,000 plex. Each probe in the pool is a single stranded 86mer, composed of a 36nt "5' tail" of universal sequence compatible with the Illumina NGS sequencing technology platform and a 50nt "3' head" designed to complement target regions in the human genome (designed to hg38 build). The head region of each probe was designed to be the identity of the bisulfite converted original sequence.
DNA Sample used for example 3:
Promega human male gDNA (catalogue number: G1471).
CEGX TrueMethyl OnTarget PROTOCOL
This protocol describes the process of performing targeted enrichment using the CEGX TrueMethyl OnTarget technology using commercially available reagents as labeled according to the CEGX commercial kits.
The starting point for this protocol assumes that a human genomic DNA sample has been prepared in the following way:
• Genomic DNA pre-sheared to 800bp
• Sheared gDNA processed through the CEGX TrueMethyl conversion kit
• Purified, converted ssDNA is the starting point for the following steps of the protocol.
LIBRARY MODULE PROTOCOL
NOTE: CEGX recommends that steps 7, 8 and 9 (incubations and purifications) of the Library Module protocol are to be done in 1.5 mL tubes to minimise sample loss. We recommend all incubations for Steps 7, 8 and 9 are to be done in heat blocks capable of holding 1.5 mL tubes.
STEP 7: END ACTIVATION
IMPORTANT: Prepare a FRESH stock of 70% Ethanol for the experiment as follows:
For each sample, 1.6 mL of 70% Ethanol will be required. Make the appropriate volume of 70% Ethanol (e.g. to make 2 mL of 70% Ethanol, combine 1.4 mL of 100% Ethanol and 0.6 mL of Ultra Pure Water from the TrueMethyl® WG kit). Mix by vortexing or inversion.
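The ethanol volumes for any number of samples follow from a simple dilution calculation. The following is a minimal sketch assuming additive volumes (the slight volume contraction on mixing ethanol and water is ignored); the function names are illustrative only.

```python
def ethanol_dilution(final_ml, final_pct=70.0, stock_pct=100.0):
    """Return (stock ethanol mL, water mL) needed for final_ml at final_pct."""
    stock_ml = final_ml * final_pct / stock_pct
    return stock_ml, final_ml - stock_ml


def ethanol_for_samples(n_samples, per_sample_ml=1.6):
    """Volumes needed to supply 1.6 mL of 70% ethanol per sample."""
    return ethanol_dilution(n_samples * per_sample_ml)


if __name__ == "__main__":
    print(ethanol_dilution(2.0))      # the 2 mL worked example from the text
    print(ethanol_for_samples(8))     # hypothetical batch of 8 samples
```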
Step 7a: Reaction Volumes
Volumes
Converted DNA sample 11 μL
Buffer 1 2 μL
Enzyme A 2 μL
Total 15 μL
IMPORTANT: Buffer 1 and Enzyme A should not be master mixed prior to use.
Procedure
Step 7.1. Thaw Buffer 1 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
Step 7.2. Add 2 μL of Buffer 1 to the 11 μL of DNA (from Step 6.16) recovered from the Conversion Module, mix by pipetting.
Step 7.3. Add 2 μL of Enzyme A to the DNA (from Step 7.2.) and mix by pipetting.
Step 7.4. Incubate DNA at 37 °C for 20 min.
Step 7.5. Heat denature the reaction mix at 95 °C for 3 min.
Step 7.6. Cool DNA immediately on ice for 5 min.
Step 7b: Reaction Volumes
Volumes
DNA (Step 7.6.) 15 μL
Adaptor 3 2 μL
Adaptor 1 Additive 1 μL
Enzyme B 2 μL
Total 20 μL
IMPORTANT: The Adaptor 3 aliquot is intended as a single use only. Repeated freeze/thaw cycles should be avoided. Adaptor 3, Adaptor 1 Additive and Enzyme B should not be master mixed prior to use.
Procedure
Step 7.7. Thaw Adaptor 3 and Adaptor 1 Additive on ice.
Step 7.8. To the 15 μL of DNA (from Step 7.6.), add 2 μL of Adaptor 3 and 1 μL of Adaptor 1 Additive, mix by pipetting.
Step 7.9. Add 2 μL of Enzyme B and mix by pipetting.
Step 7.10. Incubate at 37 °C for 30 min.
Step 7c: Bead Purification
Volumes
DNA (Step 7.10.) 20 μL
Stop Solution 2 μL
Magnetic Bead Binding Solution 3 66 μL
Total 88 μL
Procedure
Step 7.11. To the 20 μL of DNA (from Step 7.10.), add 2 μL of the Stop Solution.
Step 7.12. Add 66 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 7.13. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 7.14. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
Step 7d: Elution
Volumes
Resuspend beads in Ultra Pure Water 23.5 μL
Collect DNA 22 μL
Total 22 μL
Procedure
Step 7.15. Resuspend beads in 23.5 μL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
Step 7.16. Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 22 μL of DNA in a new 1.5 mL tube.
SAFE STOPPING POINT: The purified DNA can be stored at -20°C overnight.
STEP 8: STRAND SYNTHESIS
Step 8a: Reaction Volumes
Volumes
DNA (Step 7.16.) 22 μL
Buffer 2 6 μL
Enzyme A 1 μL
Enzyme C 1 μL
Total 30 μL
IMPORTANT: Enzyme A and Enzyme C may be master mixed before addition.
Procedure
Step 8.1. Thaw Buffer 2 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
Step 8.2. Add 6 μL of the Buffer 2 to the 22 μL of DNA (from Step 7.16.), mix by pipetting.
Step 8.3. Add 1 μL of Enzyme A and 1 μL of Enzyme C to the buffered DNA (from Step 8.2.).
Step 8.4. Incubate DNA at 37 °C for 30 min.
Step 8b: Bead Purification
Volumes
DNA (Step 8.4.) 30 μL
Stop Solution 3 μL
Magnetic Bead Binding Solution 3 66 μL
Total 99 μL
Procedure
Step 8.5. To the 30 μL of DNA (from Step 8.4.), add 3 μL of the Stop Solution.
Step 8.6. Add 66 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 8.7. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 8.8. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
Step 8c: Elution
Volumes
Resuspend beads in Ultra Pure Water 19.5 μL
Collect DNA 18 μL
Total 18 μL
Procedure
Step 8.9. Resuspend beads in 19.5 μL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
Step 8.10. Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 18 μL of DNA in a new 1.5 mL tube.
SAFE STOPPING POINT: The purified DNA can be stored at -20°C overnight.
STEP 9: LIBRARY FINISHING
Step 9a: Reaction Volumes
Volumes
DNA (Step 8.10.) 18 μL
Buffer 3 22.5 μL
Adaptor 4 3.5 μL
Enzyme D 1 μL
Total 45 μL
IMPORTANT: Buffer 3 and Adaptor 4 may be master mixed before addition.
Procedure
Step 9.1. Thaw Buffer 3 and Adaptor 4 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
Step 9.2. Add the master mix of 22.5 μL of Buffer 3 and 3.5 μL of Adaptor 4 to the 18 μL of DNA (from Step 8.10.), mix by pipetting.
Step 9.3. Add 1 μL of Enzyme D to the buffered DNA (from Step 9.2.) and mix by pipetting.
Step 9.4. Incubate DNA at 25 °C for 15 min.
Step 9b: Bead Purification
Volumes
DNA (Step 9.4.) 45 μL
Stop Solution 5 μL
Magnetic Bead Binding Solution 3 50 μL
Total 100 μL
Procedure
Step 9.5. To the 45 μL of DNA (from Step 9.4.), add 5 μL of the Stop Solution.
Step 9.6. Add 50 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 9.7. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 9.8. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
Step 9c: Elution
Volumes
Resuspend beads in Ultra Pure Water 20 μL
Collect DNA 18.75 μL
Total 18.75 μL
Procedure
Step 9.9. Resuspend beads in 20.0 μL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
Step 9.10. Precipitate beads using a magnetic rack for 2 min at room temperature, and transfer 18.75 μL of DNA to a new 0.2 mL PCR tube.
SAFE STOPPING POINT: The purified DNA can be stored at -20°C overnight.
STEP 10: PRE-TARGETING LIBRARY AMPLIFICATION
Step 10a: Reaction Volumes
Volumes
DNA (Step 9.10.) 18.75 μL
Buffer 4 5 μL
Enzyme E 1.25 μL
Total 25 μL
IMPORTANT: Buffer 4 and Enzyme E should not be master mixed prior to use.
Procedure
Step 10.1. Thaw Buffer 4 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
Step 10.2. Add 5 μL of the Buffer 4 to the 18.75 μL of DNA (from Step 9.10.) and mix by pipetting.
Step 10.3. Add 1.25 μL of Enzyme E to the buffered DNA (from Step 10.2.) and mix by pipetting.
Step 10.4. Incubate DNA at 37 °C for 30 min.
Step 10.5. Heat denature Enzyme E at 95 °C for 5 min.
Step 10.6. Cool DNA to 4 °C for 5 min.
Step 10b: Sample Amplification
Volumes
DNA (Step 10.6.) 12 μL
OT PCR Mix 37 μL
Enzyme F 1 μL
Total 50 μL
Procedure
Step 10.7. Thaw the OT PCR Mix on ice.
Step 10.8. Add 37 μL of the OT PCR Mix to the 12 μL of DNA (from Step 10.6.).
Step 10.9. Add 1 μL of Enzyme F to the buffered DNA (from Step 10.8.).
Step 10.10. Thermocycle indexing reaction as described below for 13 cycles.
Thermocycle (13 Cycles)
Step Time Temperature
Initial Denaturation 180 s 95 °C
Denaturation 20 s 98 °C
Annealing 20 s 60 °C
Elongation 60 s 72 °C
(Denaturation, Annealing and Elongation repeated for 13 cycles)
Final Extension 300 s 72 °C
Step 10c: Bead Purification
Volumes
DNA (Step 10.10.) 50 μL
Stop Solution 5 μL
Magnetic Bead Binding Solution 3 49.5 μL
Total 104.5 μL
Procedure
Step 10.11. To the 50 μL of DNA (from Step 10.10.), add 5 μL of Stop Solution and 49.5 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 10.13. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 10.14. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
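The bead purifications in Steps 7c, 8b, 9b and 10c use different ratios of Magnetic Bead Binding Solution 3 to stopped reaction volume. The following is a minimal sketch that derives each ratio from the volumes tabulated in those steps; the dictionary layout and function name are illustrative assumptions, and the ratio is defined here as bead solution volume divided by reaction plus Stop Solution volume.

```python
# (reaction μL + Stop Solution μL, bead binding solution μL), taken from the Volumes tables.
CLEANUPS = {
    "7c": (20 + 2, 66),
    "8b": (30 + 3, 66),
    "9b": (45 + 5, 50),
    "10c": (50 + 5, 49.5),
}


def bead_ratio(sample_ul, bead_ul):
    """Bead solution volume per unit volume of stopped reaction."""
    return bead_ul / sample_ul


if __name__ == "__main__":
    for step, (sample_ul, bead_ul) in CLEANUPS.items():
        print(f"Step {step}: {bead_ratio(sample_ul, bead_ul):.1f}x beads")
```

On this definition the ratio falls from 3x in Step 7c to 0.9x in Step 10c.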
Step 10d: Elution
Volumes
Resuspend beads in Ultra Pure Water 10 μL
Collect DNA 9 μL
Total 9 μL
Procedure
Step 10.15. Resuspend the beads in 10 μL of Ultra Pure Water and incubate for 2 min at room temperature.
Step 10.16. Centrifuge briefly and precipitate beads using a magnetic rack for 2 min at room temperature, and collect 9 μL of DNA into a new tube.
Step 10e: Quantification
Quantify eluted DNA using HS Qubit kit for dsDNA.
SAFE STOPPING POINT: The purified DNA can be stored at -20°C overnight.
TARGETING MODULE PROTOCOL
STEP 11: TARGETING
Step 11a: Hybridisation
Volumes
400 ng DNA (Step 10.16) up to 7 μL
Hybridisation Buffer 12.5 μL
Hybridisation Additive 3 μL
Probes 2.5 μL
Ultra Pure Water (up to 25 μL)
Total 25 μL
Procedure
Step 11.1. Pre-warm Hybridisation Buffer (HB) to 50 °C in a thermal block.
Step 11.2. Mix 400 ng DNA (from Step 10.16.) with Ultra Pure Water to make up 7 μL. Add 3 μL of Hybridisation Additive (HA) to the DNA, denature at 95 °C for 3 minutes and then hold at 50 °C in a thermal cycler.
Step 11.3. Add pre-warmed HB to the DNA/HA mix, mix by pipetting, and hold at 50 °C for 5 min.
Step 11.4. Aliquot 2.5 μL of probe into an individual tube and pre-warm to 50 °C (set heated lid temperature at 70 °C if possible).
Step 11.5. Transfer the mix from Step 11.3. to the pre-warmed probe tube (Step 11.4.), mix by pipetting.
Step 11.5. Incubate the hybridization reactions at 50 °C for approximately 64 h in a PCR block.
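The DNA and water volumes for Step 11.2 depend on the concentration measured in Step 10e. The following is a minimal sketch of that calculation; the function name and the example concentration are hypothetical, and the 400 ng input and 7 μL DNA-plus-water volume are taken from the Volumes table above.

```python
def hyb_input_volumes(conc_ng_per_ul, mass_ng=400.0, dna_plus_water_ul=7.0):
    """Return (DNA μL, water μL) that deliver mass_ng in dna_plus_water_ul."""
    dna_ul = mass_ng / conc_ng_per_ul
    if dna_ul > dna_plus_water_ul:
        # Library too dilute to fit the required mass into the available volume.
        raise ValueError("library concentration too low for the 7 μL input volume")
    return dna_ul, dna_plus_water_ul - dna_ul


if __name__ == "__main__":
    # Hypothetical Qubit reading of 80 ng/μL.
    print(hyb_input_volumes(80.0))
```

With these volumes, a library below roughly 57 ng/μL cannot supply 400 ng within 7 μL, which is why the sketch raises an error in that case.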
Step 11b: Hybridisation clean-up bead pre-wash
Procedure
Step 11.6. Use 15 μL M-280 Streptavidin Beads (ThermoFisher, 11205D; user supplied) per reaction, in fresh 1.5 mL tubes.
Step 11.7. Wash M-280 Streptavidin Beads three times with 500 μL of 1x BW buffer (first wash for 5 min with mixing).
Step 11.8. Resuspend the beads in 35 μL of 1 2x BW buffer.
Step 11c: Hybridisation clean-up
Step 11.9. Transfer 25 μL of the hybridization mixture to the beads (total volume 60 μL) and incubate at room temperature for 30 minutes in a thermal mixer (at 1000 rpm).
Step 11.10. Perform 2 washes with 200 μL of Wash Buffer 1 (WB1) at room temperature, 5 min each time in a thermal mixer (at 1000 rpm). Centrifuge briefly and precipitate beads using a magnetic rack for 1 min after each wash, carefully remove the supernatant.
Step 11.11. Perform 3 washes with 200 μL of Wash Buffer 2 (WB2) at 50 °C, 5 min each time in a thermal mixer (at 1000 rpm). Centrifuge briefly and precipitate beads using a magnetic rack for 1 min, carefully remove the supernatant.
Step 11.12. Perform final wash with 50 μL of Wash Buffer 3 (WB3) at room temperature.
Step 11.13. Remove the supernatant and resuspend the beads in 10.5 μL Ultra Pure Water.
Step 11d: Probe extension (Finishing Targeting)
Volumes
Purified DNA on beads (from Step 11.13) 10.5 μL
Buffer 5 18 μL
Enzyme C 1.5 μL
Total 30 μL
Procedure
Step 11.14. To the purified DNA on beads add 18 μL Buffer 5 and 1.5 μL Enzyme C.
Step 11.15. Mix reactions gently and incubate for 30 minutes at 37 °C.
Step 11e: 2M Purification
Procedure
Step 11.16. Collect the beads from Step 11.15. on a magnetic rack and remove the supernatant.
Step 11.17. Wash the beads twice with 1x BWT buffer.
Step 11.18. Resuspend the beads in 30 μL of BW4 and incubate for 10 minutes at room temperature in a thermal mixer.
Step 11.19. Centrifuge briefly and precipitate beads using a magnetic rack for 1 min, carefully remove the supernatant.
Step 11.20. Wash the beads twice with 200 μL of 1x BWT buffer, then once with 50 μL of Ultra Pure Water.
Step 11.21. Resuspend the beads in 25 μL of Ultra Pure Water, and use 12.5 μL for the next step (store the other 12.5 μL at 4 °C).
STEP 12: SAMPLE INDEXING
Step 12a: Sample Indexing
Volumes
Bead suspension with DNA (Step 11.21.) 12.5 μL
OT Index PCR Mix 12 μL
Enzyme F 0.5 μL
Total 25 μL
Procedure
Step 12.1. Thaw the OT Index PCR Mix on ice (use index of choice, Appendix E).
Step 12.2. Transfer 12.5 μL of DNA (from Step 11.21.) to a fresh 0.2 mL PCR tube and add 12 μL of the OT Index PCR Mix.
Step 12.3. Add 0.5 μL of Enzyme F to the buffered DNA (from Step 12.2.).
Step 12.4. Thermocycle indexing reaction as described below for 16 cycles.
Thermocycle (16 Cycles)
Step Time Temperature
Initial Denaturation 180 s 95 °C
Denaturation 20 s 98 °C
Annealing 20 s 60 °C
Elongation 60 s 72 °C
(Denaturation, Annealing and Elongation repeated for 16 cycles)
Final Extension 300 s 72 °C
Hold Indefinite 4 °C
Step 12b: Bead Purification
Procedure
Step 12.5. Centrifuge briefly and precipitate beads using a magnetic rack for 1 min at room temperature.
Step 12.6. Carefully remove the supernatant from Step 12.5. (this now contains the DNA).
Volumes
DNA from the supernatant (Step 12.6.) 25 μL
Magnetic Bead Binding Solution 3 22.5 μL
Total 47.5 μL
Procedure
Step 12.7. To the 25 μL of DNA (from Step 12.6.), add 22.5 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 12.8. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 12.9. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
Step 12c: Elution
Volumes
Resuspend beads in Elution Buffer 10 μL
Collect DNA 9 μL
Total 9 μL
Procedure
Step 12.10. Resuspend the beads in 10 μL of Ultra Pure Water and incubate for 2 min at room temperature.
Step 12.11. Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 9 μL of DNA into a new tube. The DNA is now ready for Illumina NGS sequencing.
Note: CEGX recommends that the indexed libraries are kept at -20 °C for long term storage.
The library so prepared can be sequenced. A summary of the library preparation conditions is shown in Table 7.
Table 7: Example 3 : Hybridisation conditions
Figure imgf000060_0001
A summary of the sequencing metrics is shown in Table 8:
Table 8: Example 3 Targeted enrichment sequencing metrics
Sample ID CEG36 139 7
Sample DNA Promega
Panel ID CEGXG003
No. regions in target 428101
No. positions in target 81730677
Readlength 150
Total reads 57291789
Total PF aligned reads 42980453
Total aligned PF bases 5906176736
Duplicate rate (%) 27.42
Percent mapping efficiency 75.03
Mean target coverage (X-fold) 19.39
Percent bases covered >2X 75.64
% regions with 1 CpG >=1X 90.06
% regions with 1 CpG >=10X 60.92
% bases on target 32.64
% aligned bases on target 65.78
CpG in panel design 1799376
Number CpG >=1X 1381345
Number CpG >=10X 780889
Number CpG >=30X 413543
Representative sequencing data is shown in Figures 5 and 6
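The CpG counts in Table 8 can be expressed as fractions of the CpGs in the panel design. The following is a minimal sketch using only values copied from Table 8; the variable and function names are illustrative, and these fractions are derived here rather than reported in the table.

```python
# Counts copied from Table 8 (sample CEG36 139 7).
CPG_IN_PANEL = 1_799_376
CPG_COVERED = {1: 1_381_345, 10: 780_889, 30: 413_543}  # minimum depth -> CpG count


def cpg_fraction(min_depth):
    """Fraction of panel CpGs covered at or above min_depth."""
    return CPG_COVERED[min_depth] / CPG_IN_PANEL


if __name__ == "__main__":
    for depth in sorted(CPG_COVERED):
        print(f">={depth}X: {100 * cpg_fraction(depth):.1f} % of panel CpGs")
```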
Example 4: Alternative targeting extension conditions
Probe design and plex:
The panel of probes used is a high complexity pool, typically between 12,000 and 440,000 plex. Each probe in the pool is a single stranded 86mer, composed of a 36nt "5' tail" of universal sequence compatible with the Illumina NGS sequencing technology platform and a 50nt "3' head" designed to complement target regions in the human genome (designed to hg38 build). The head region of each probe was designed to be the identity of the bisulfite converted original sequence.
DNA Sample used for example 4:
Promega human male gDNA (catalogue number: G1471).
CEGX TrueMethyl OnTarget PROTOCOL
This protocol describes the process of performing targeted enrichment using the CEGX TrueMethyl OnTarget technology using commercially available reagents as labeled according to the CEGX commercial kits.
The starting point for this protocol assumes that a human genomic DNA sample has been prepared in the following way:
• Genomic DNA pre-sheared to 800bp
• Sheared gDNA processed through the CEGX TrueMethyl conversion kit
• Purified, converted ssDNA is the starting point for the following steps of the protocol.
LIBRARY MODULE PROTOCOL
NOTE: CEGX recommends that steps 7, 8 and 9 (incubations and purifications) of the Library Module protocol are to be done in 1.5 mL tubes to minimise sample loss. We recommend all incubations for Steps 7, 8 and 9 are to be done in heat blocks capable of holding 1.5 mL tubes.
STEP 7: END ACTIVATION
IMPORTANT: Prepare a FRESH stock of 70% Ethanol for the experiment as follows:
For each sample, 1.6 mL of 70% Ethanol will be required. Make the appropriate volume of 70% Ethanol (e.g. to make 2 mL of 70% Ethanol, combine 1.4 mL of 100% Ethanol and 0.6 mL of Ultra Pure Water from the TrueMethyl® WG kit). Mix by vortexing or inversion.
Step 7a: Reaction Volumes
Volumes
Converted DNA sample 11 μL
Buffer 1 2 μL
Enzyme A 2 μL
Total 15 μL
IMPORTANT: Buffer 1 and Enzyme A should not be master mixed prior to use
Procedure
Step 7.1. Thaw Buffer 1 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
Step 7.2. Add 2 μL of Buffer 1 to the 11 μL of DNA (from Step 6.16) recovered from the Conversion Module, mix by pipetting.
Step 7.3. Add 2 μL of Enzyme A to the DNA (from Step 7.2.) and mix by pipetting.
Step 7.4. Incubate DNA at 37 °C for 20 min.
Step 7.5. Heat denature the reaction mix at 95 °C for 3 min.
Step 7.6. Cool DNA immediately on ice for 5 min.
Step 7b: Reaction Volumes
Volumes
DNA (Step 7.6.) 15 μL
TM OnTarget Adaptor 1 2 μL
Adaptor 1 Additive 1 μL
Enzyme B 2 μL
Total 20 μL
IMPORTANT: The Adaptor 3 aliquot is intended as a single use only. Repeated freeze/thaw cycles should be avoided. Adaptor 3, Adaptor 1 Additive and Enzyme B should not be master mixed prior to use.
Procedure
Step 7.7. Thaw Adaptor 3 and Adaptor 1 Additive on ice.
Step 7.8. To the 15 μL of DNA (from Step 7.6.), add 2 μL of Adaptor 3 and 1 μL of Adaptor 1 Additive, mix by pipetting.
Step 7.9. Add 2 μL of Enzyme B and mix by pipetting.
Step 7.10. Incubate at 37 °C for 30 min.
Step 7c: Bead Purification
Volumes
DNA (Step 7.10.) 20 μL
Stop Solution 2 μL
Magnetic Bead Binding Solution 3 66 μL
Total 88 μL
Procedure
Step 7.11. To the 20 μL of DNA (from Step 7.10.), add 2 μL of the Stop Solution.
Step 7.12. Add 66 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 7.13. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 7.14. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
Step 7d: Elution
Volumes
Resuspend beads in Ultra Pure Water 23.5 μL
Collect DNA 22 μL
Total 22 μL
Procedure
Step 7.15. Resuspend beads in 23.5 μL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
Step 7.16. Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 22 μL of DNA in a new 1.5 mL tube.
SAFE STOPPING POINT: The purified DNA can be stored at -20°C overnight.
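Because each reaction table in this module lists components and a stated total, a short script can confirm that the component volumes add up before pipetting. The sketch below checks the Step 7a, 7b and 7c tables. The dictionary layout and function name are illustrative choices, not part of the kit documentation.

```python
# Volumes in microlitres, transcribed from the Step 7a, 7b and 7c tables.
STEP7_REACTIONS = {
    "7a": ({"Converted DNA sample": 11.0, "Buffer 1": 2.0, "Enzyme A": 2.0}, 15.0),
    "7b": ({"DNA (Step 7.6)": 15.0, "Adaptor": 2.0,
            "Adaptor 1 Additive": 1.0, "Enzyme B": 2.0}, 20.0),
    "7c": ({"DNA (Step 7.10)": 20.0, "Stop Solution": 2.0,
            "Magnetic Bead Binding Solution 3": 66.0}, 88.0),
}


def check_totals(reactions, tol=1e-6):
    """Map each step to True when its components sum to the stated total."""
    return {
        step: abs(sum(components.values()) - stated_total) <= tol
        for step, (components, stated_total) in reactions.items()
    }


results = check_totals(STEP7_REACTIONS)
for step, ok in results.items():
    print(step, "OK" if ok else "MISMATCH")
```

The same structure can be extended with the Step 8, 9 and 10 tables.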
STEP 8: STRAND SYNTHESIS
Step 8a: Reaction Volumes
Volumes
DNA (Step 7.16.) 22 μL
Buffer 2 6 μL
Enzyme A 1 μL
Enzyme C 1 μL
Total 30 μL
IMPORTANT: Enzyme A and Enzyme C may be master mixed before addition.
Procedure
Step 8.1. Thaw Buffer 2 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
Step 8.2. Add 6 μL of the Buffer 2 to the 22 μL of DNA (from Step 7.16.), mix by pipetting.
Step 8.3. Add 1 μL of Enzyme A and 1 μL of Enzyme C to the buffered DNA (from Step 8.2.).
Step 8.4. Incubate DNA at 37 °C for 30 min.
Step 8b: Bead Purification
Volumes
DNA (Step 8.4.) 30 μL
Stop Solution 3 μL
Magnetic Bead Binding Solution 3 66 μL
Total 99 μL
Procedure
Step 8.5. To the 30 μL of DNA (from Step 8.4.), add 3 μL of the Stop Solution.
Step 8.6. Add 66 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 8.7. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 8.8. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
Step 8c: Elution
Volumes
Resuspend beads in Ultra Pure Water 19.5 μL
Collect DNA 18 μL
Total 18 μL
Procedure
Step 8.9. Resuspend beads in 19.5 μL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
Step 8.10. Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 18 μL of DNA in a new 1.5 mL tube.
SAFE STOPPING POINT: The purified DNA can be stored at -20°C overnight.
STEP 9: LIBRARY FINISHING
Step 9a: Reaction Volumes
Volumes
DNA (Step 8.10.) 18 μL
Buffer 3 22.5 μL
TM OnTarget Adaptor 2 3.5 μL
Enzyme D 1 μL
Total 45 μL
IMPORTANT: Buffer 3 and Adaptor 4 may be master mixed before addition.
Procedure
Step 9.1. Thaw Buffer 3 and Adaptor 4 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
Step 9.2. Add 22.5 μL of the Buffer 3 and 3.5 μL of Adaptor 4 master mix to the 18 μL of DNA (from Step 8.10.), mix by pipetting.
Step 9.3. Add 1 μL of Enzyme D to the buffered DNA (from Step 9.2.) and mix by pipetting.
Step 9.4. Incubate DNA at 25 °C for 15 min.
Step 9b: Bead Purification
Volumes
DNA (Step 9.4.) 45 μL
Stop Solution 5 μL
Magnetic Bead Binding Solution 3 50 μL
Total 100 μL
Procedure
Step 9.5. To the 45 μL of DNA (from Step 9.4.), add 5 μL of the Stop Solution.
Step 9.6. Add 50 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 9.7. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 9.8. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
Step 9c: Elution
Volumes
Resuspend beads in Ultra Pure Water 20 μL
Collect DNA 18.75 μL
Total 18.75 μL
Procedure
Step 9.9. Resuspend beads in 20.0 μL of Ultra Pure Water by vortexing, centrifuge briefly and incubate for 2 min at room temperature.
Step 9.10. Precipitate beads using a magnetic rack for 2 min at room temperature, and transfer 18.75 μL of DNA to a new 0.2 mL PCR tube.
SAFE STOPPING POINT: The purified DNA can be stored at -20°C overnight.
STEP 10: PRE-TARGETING LIBRARY AMPLIFICATION
Step 10a: Reaction Volumes
Volumes
DNA (Step 9.10.) 18.75 μL
Buffer 4 5 μL
Enzyme E 1.25 μL
Total 25 μL
IMPORTANT: Buffer 4 and Enzyme E should not be master mixed prior to use.
Procedure
Step 10.1. Thaw Buffer 4 on ice and equilibrate the Magnetic Bead Binding Solution 3 to room temperature.
Step 10.2. Add 5 μL of the Buffer 4 to the 18.75 μL of DNA (from Step 9.10.) and mix by pipetting.
Step 10.3. Add 1.25 μL of Enzyme E to the buffered DNA (from Step 10.2.) and mix by pipetting.
Step 10.4. Incubate DNA at 37 °C for 30 min.
Step 10.5. Heat denature Enzyme E at 95 °C for 5 min.
Step 10.6. Cool DNA to 4 °C for 5 min.
Step 10b: Sample Amplification
Volumes
DNA (Step 10.6.) 25 μL
Enrichment Premix 24 μL
Enzyme F 1 μL
Total 50 μL
Procedure
Step 10.7. Thaw the OT PCR Mix on ice.
Step 10.8. Add 37 μL of the OT PCR Mix to the 12 μL of DNA (from Step 10.6.).
Step 10.9. Add 1 μL of Enzyme F to the buffered DNA (from Step 10.8.).
Step 10.10. Thermocycle indexing reaction as described below for 13 cycles.
Thermocycle (13 Cycles)
Step Time Temperature
Initial Denaturation 180 s 95 °C
Denaturation 20 s 98 °C (12 Cycles)
Annealing 20 s 60 °C (12 Cycles)
Elongation 60 s 72 °C (12 Cycles)
Final Extension 300 s 72 °C
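For planning purposes, the hold time of a program like the one above can be estimated from the step durations. The sketch below assumes the unlabelled cycling times are in seconds and ignores ramping between temperatures, so a real run will take longer. The cycle count is left as a parameter because it is stated differently in different places in this example.

```python
def program_seconds(cycles, initial_s=180, denature_s=20, anneal_s=20,
                    extend_s=60, final_s=300):
    """Sum of programmed hold times for a three-step PCR (ramp times excluded)."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_s + cycles * (denature_s + anneal_s + extend_s) + final_s


for n in (12, 13, 14):
    total = program_seconds(n)
    print(f"{n} cycles: {total} s ({total / 60:.1f} min)")
```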
Step 10c: Bead Purification
Volumes
DNA (Step 10.10.) 50 μL
Stop Solution 5 μL
Magnetic Bead Binding Solution 3 44 μL
Total 99 μL
Procedure
Step 10.11. To the 50 μL of DNA (from Step 10.10.), add 5 μL of Stop Solution and 44 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 10.13. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 10.14. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
Step 10d: Elution
Volumes
Resuspend beads in Ultra Pure Water 10 μL
Collect DNA 9 μL
Total 9 μL
Procedure
Step 10.15. Resuspend the beads in 10 μL of Ultra Pure Water and incubate for 2 min at room temperature.
Step 10.16. Centrifuge briefly and precipitate beads using a magnetic rack for 2 min at room temperature, and collect 9 μL of DNA into a new tube.
Step 10e: Quantification
Quantify eluted DNA using HS Qubit kit for dsDNA.
SAFE STOPPING POINT: The purified DNA can be stored at -20°C overnight.
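The four bead purifications in the Library Module use different amounts of Magnetic Bead Binding Solution 3 relative to the sample. The sketch below computes that ratio from the volume tables (binding solution divided by DNA plus Stop Solution). Treating the ratio this way is an illustrative convention borrowed from bead-based cleanups in general; the protocol itself does not state a ratio or explain the choice.

```python
# (DNA uL, Stop Solution uL, Magnetic Bead Binding Solution 3 uL) per step,
# transcribed from the Step 7c, 8b, 9b and 10c volume tables.
PURIFICATIONS = {
    "7c": (20.0, 2.0, 66.0),
    "8b": (30.0, 3.0, 66.0),
    "9b": (45.0, 5.0, 50.0),
    "10c": (50.0, 5.0, 44.0),
}


def bead_ratio(dna_ul, stop_ul, beads_ul):
    """Binding solution volume relative to the stopped reaction volume."""
    return beads_ul / (dna_ul + stop_ul)


ratios = {step: bead_ratio(*vols) for step, vols in PURIFICATIONS.items()}
for step, ratio in ratios.items():
    print(f"Step {step}: {ratio:.1f}x")
```

With these inputs the ratio decreases from Step 7c to Step 10c.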
TARGETING MODULE PROTOCOL
STEP 11: TARGETING
Step 11a: Hybridisation
Volumes
400 ng DNA (Step 10.16.) up to 8 μL
Hybridisation Buffer 12.5 μL
Hybridisation Additive 2 μL
Probes 2.5 μL
Total 25 μL
Procedure
Step 11.1. Pre-warm Hybridisation Buffer (HB) to 50 °C in a thermal block.
Step 11.2. Mix 400 ng DNA (from Step 10.16.) with Ultra Pure Water to make up 8 μL.
Step 11.3. Add 2 μL of Hybridisation Additive (HA) and 12.5 μL of Hybridisation Buffer to the 8 μL of DNA (from Step 11.2.), denature at 95 °C for 3 minutes and then hold at 50 °C in a thermal cycler.
Step 11.4. Aliquot 2.5 μL of probe into an individual tube and pre-warm to 50 °C (set heated lid temperature at 70 °C if at all possible).
Step 11.5. Transfer the mix from Step 11.3. to the pre-warmed probe tube (Step 11.4.), mix by pipetting.
Step 11.5. Incubate the hybridisation reactions at 50 °C for approximately 64 h in a PCR block.
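Making up 400 ng of library in 8 μL depends on the concentration measured at the quantification step. The sketch below works out the DNA and water volumes from a measured concentration and rejects libraries that are too dilute to fit. The example concentration of 80 ng/μL is invented for illustration.

```python
def hyb_input(conc_ng_per_ul, mass_ng=400.0, final_ul=8.0):
    """Return (dna_ul, water_ul) giving mass_ng of DNA in final_ul."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    dna_ul = mass_ng / conc_ng_per_ul
    if dna_ul > final_ul:
        raise ValueError(
            f"library too dilute: {mass_ng} ng needs {dna_ul:.1f} uL, "
            f"more than the {final_ul} uL available"
        )
    return dna_ul, final_ul - dna_ul


# Hypothetical measured concentration of 80 ng/uL.
dna_ul, water_ul = hyb_input(80.0)
print(f"{dna_ul:.2f} uL DNA + {water_ul:.2f} uL Ultra Pure Water")
```

Under these numbers the library must be at least 50 ng/μL (400 ng in 8 μL) for the input to fit.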
Step 11b: Hybridisation clean-up bead pre-wash
Procedure
Step 11.6. Use 15 μL M-280 Streptavidin Beads (ThermoFisher, 11205D; user supplied) per reaction, in fresh 1.5 mL tubes.
Step 11.7. Wash M-280 Streptavidin Beads three times with 500 μL of 1x BW buffer (first wash for 5 min with mixing).
Step 11.8. Resuspend the beads in 35 μL of 1 2x BW buffer.
Step 11c: Hybridisation clean-up
Step 11.9. Transfer 25 μL of the hybridisation mixture to the beads (total volume 60 μL) and incubate at room temperature for 30 minutes in a thermal mixer (at 1000 rpm).
Step 11.10. Perform 2 washes with 200 μL of Wash Buffer 1 (WB1) at room temperature, 5 min each time in a thermal mixer (at 1000 rpm). Centrifuge briefly and precipitate beads using a magnetic rack for 1 min after each wash, carefully remove the supernatant.
Step 11.11. Perform 3 washes with 200 μL of Wash Buffer 2 (WB2) at 50 °C, 5 min each time in a thermal mixer (at 1000 rpm). Centrifuge briefly and precipitate beads using a magnetic rack for 1 min, carefully remove the supernatant.
Step 11.12. Perform final wash with 50 μL of Wash Buffer 3 (WB3) at room temperature, remove the supernatant.
Step 11d: Probe extension (Finishing Targeting)
Volumes
Beads containing targeted DNA (from Step 11.13) - μL
Extension Buffer 22.5 μL
5x Extension Additive 6 μL
Enzyme C 1.5 μL
Total 30 μL
5xExtension Additive:
2.5 M Betaine
6.5 mM DTT
Procedure
Step 11.14. Add 22.5 μL Extension Buffer, 6 μL 5x Extension Additive and 1.5 μL Enzyme C to the beads containing targeted DNA from Step 11.12.
Step 11.15. Mix reactions gently and incubate for 30 minutes at 37 °C in a thermomixer at 1300 rpm.
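Adding 6 μL of the 5x Extension Additive to a 30 μL reaction dilutes it five-fold, so the working concentrations of the listed components follow directly. The sketch below computes them for the two components given in the text (2.5 M Betaine and 6.5 mM DTT). It assumes the bead volume is negligible, consistent with the "-" entry in the volume table.

```python
def final_concentration(stock_conc, added_ul, total_ul):
    """Concentration after diluting added_ul of stock into total_ul (same units as stock)."""
    if total_ul <= 0 or added_ul < 0 or added_ul > total_ul:
        raise ValueError("volumes must satisfy 0 <= added_ul <= total_ul and total_ul > 0")
    return stock_conc * added_ul / total_ul


ADDITIVE_UL = 6.0   # 5x Extension Additive per reaction
REACTION_UL = 30.0  # total reaction volume

betaine_molar = final_concentration(2.5, ADDITIVE_UL, REACTION_UL)
dtt_millimolar = final_concentration(6.5, ADDITIVE_UL, REACTION_UL)
print(f"Betaine: {betaine_molar:.2f} M, DTT: {dtt_millimolar:.2f} mM")
```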
Step 11e: 2M Purification
Procedure
Step 11.16. Collect the beads from Step 11.15. on a magnetic rack and remove the supernatant.
Step 11.17. Wash the beads twice with 1x BWT buffer.
Step 11.18. Resuspend the beads in 30 μL of BW4 and incubate for 10 minutes at room temperature in a thermomixer at 1300 rpm.
Step 11.19. Centrifuge briefly and precipitate beads using a magnetic rack for 1 min, carefully remove the supernatant.
Step 11.20. Wash the beads twice with 200 μL of 1x BWT buffer, then once with 50 μL of Ultra Pure Water.
Step 11.21. Resuspend the beads in 25 μL of Resuspension Buffer and proceed to sample indexing.
STEP 12: SAMPLE INDEXING
Step 12a: Sample Indexing
Volumes
Beads suspension with DNA (Step 11.21.) 25 μL
Index PCR Mix 24 μL
Enzyme F 1 μL
Total 50 μL
Procedure
Step 12.1. Thaw the Index PCR Mix on ice (use index of choice, Appendix E).
Step 12.2. Transfer 25 μL of DNA (from Step 11.21.) to a fresh 0.2 mL PCR tube and add 24 μL of the Index PCR Mix.
Step 12.3. Add 1 μL of Enzyme F to the buffered DNA (from Step 12.2.).
Step 12.4. Thermocycle indexing reaction as described below for 15 cycles.
Thermocycle (15 Cycles)
Step Time Temperature
Initial Denaturation 180 s 95 °C
Denaturation 30 s 98 °C (15 Cycles)
Annealing 30 s 55 °C (15 Cycles)
Elongation 60 s 72 °C (15 Cycles)
Final Extension 300 s 72 °C
Hold Indefinite 4 °C
Step 12b: Bead Purification
Procedure
Step 12.5. Centrifuge briefly and precipitate beads using a magnetic rack for 1 min at room temperature.
Step 12.6. Carefully transfer the supernatant from Step 12.5. to a fresh 1.5 mL tube (this now contains the DNA).
Volumes
DNA from the supernatant (Step 12.6.) 50 μL
Stop Solution 5 μL
Magnetic Bead Binding Solution 3 44 μL
Total 99 μL
Procedure
Step 12.7. To the 50 μL of DNA (from Step 12.6.), add 5 μL of the Stop Solution and 44 μL of the Magnetic Bead Binding Solution 3, vortex to mix and incubate for 15 min at room temperature.
Step 12.8. Centrifuge briefly and precipitate beads using a magnetic rack for 5 min at room temperature.
Step 12.9. Carefully remove supernatant and wash the beads twice with 200 μL of 70% Ethanol. Leave the beads on the magnetic rack, with the lids open, until dry (5-15 min).
Step 12c: Elution
Volumes
Resuspend beads in Elution Buffer 10 μL
Collect DNA 9 μL
Total 9 μL
Procedure
Step 12.10. Resuspend the beads in 10 μL of Ultra Pure Water and incubate for 2 min at room temperature.
Step 12.11. Precipitate beads using a magnetic rack for 2 min at room temperature, and collect 9 μL of DNA into a new tube. The DNA is now ready for Illumina NGS sequencing.
Note: CEGX recommends that the indexed libraries are kept at -20 °C for long-term storage.
The library so prepared can be sequenced. A summary of the library preparation conditions is shown in Table 9.
Table 9: Example 4: Hybridisation conditions
CEG45 10 6 CEG45 10 8 CEG45 10 9 CEG45 10 11
Sample DNA Promega Promega Promega Promega
Extension Klenow Klenow CES BioTaq BioTaq CES
Conversion mass (ng) 200 200 200 200
Hyb mass (ng) 400 400 400 400
Hyb time (hrs) 64 64 64 64
Panel ID CEGXP021 G010 CEGXP021 G010 CEGXP021 G010 CEGXP021 G010
Pin H6 H6 H6 H6
Probe Standard Standard Standard Standard
Probe mass (ng) 25 25 25 25
Hyb temperature (°C) 50 50 50 50
Wash temperature (°C) 50 50 50 50
Enrichment PCR cycles 14 14 14 14
Index PCR cycles 15 15 15 15
A summary of the sequencing metrics is shown in Table 10:
Table 10: Example 4 targeted enrichment sequencing metrics
CEG45 10 6 CEG45 10 8 CEG45 10 9 CEG45 10 11
Sample DNA Promega Promega Promega Promega
Panel ID CEGXP021 G010 CEGXP021 G010 CEGXP021 G010 CEGXP021 G010
No. regions in target 48000 48000 48000 48000
No. positions in target 9097298 9097298 9097298 9097298
Readlength 150 150 150 150
Total reads 8207384 8760982 6388562 6345246
Total PF aligned reads 5707923 6047689 4512534 4537425
Duplicate rate (%) 15.52 15.26 13.83 13.90
Percent mapping efficiency 69.55 69.03 70.65 71.53
Mean target coverage (X-fold) 23.91 24.56 20.69 21.10
Percent bases covered >2X 96.57 96.84 95.89 96.15
% regions with 1 CpG >=1X 99.64 99.66 99.55 99.61
% regions with 1 CpG >=10X 96.49 96.96 95.45 95.73
% bases on target 37.31 35.85 39.20 40.44
% aligned bases on target 71.47 68.79 72.80 74.16
CpG in panel design 172219 172219 172219 172219
Number CpG >=01X 169009 169110 168418 168522
Number CpG >=10X 144935 146036 138315 139733
Number CpG >=30X 92923 94012 77912 79779
Representative sequencing data is shown in Figures 7 and 8.
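The CpG counts in Table 10 are easier to compare when expressed as a share of the 172219 CpGs in the panel design. The sketch below derives those percentages for each library at each depth threshold. The values are listed in the table's column order, and the percentages are derived here rather than reported in the source.

```python
CPG_IN_PANEL = 172219  # "CpG in panel design" row of Table 10

# CpG counts per depth threshold, in the table's column order (four libraries).
CPG_COVERED = {
    ">=1X": [169009, 169110, 168418, 168522],
    ">=10X": [144935, 146036, 138315, 139733],
    ">=30X": [92923, 94012, 77912, 79779],
}


def coverage_percent(counts, panel_total=CPG_IN_PANEL):
    """Percentage of panel CpGs reaching a depth threshold, one value per library."""
    return [100.0 * count / panel_total for count in counts]


percentages = {depth: coverage_percent(counts) for depth, counts in CPG_COVERED.items()}
for depth, values in percentages.items():
    print(depth, " ".join(f"{value:.1f}%" for value in values))
```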

Claims

Claims:
1. A method of multiplex nucleic acid amplification comprising amplifying a selected subset of a population of nucleic acid fragments; the method comprising;
a. treating a nucleic acid sample with bisulfite to obtain a first population of single stranded nucleic acid molecules;
b. attaching a first oligonucleotide adaptor to the 3' end of each of the strands in the sample;
c. making the first population of single stranded molecules double stranded;
d. either,
i) attaching a second adaptor to the double stranded molecules, amplifying the double stranded molecules and denaturing the double stranded molecules to produce a second population of single stranded molecules; or
ii) cleaving the first adaptor and removing the first population of single stranded nucleic acid molecules to produce a second population of single stranded molecules;
e. hybridising a population of locus specific primers to a subset of the second population of single stranded molecules;
f. extending the hybridised locus specific primers to produce extension products which are a subset of the fragments where each extension product includes a copy of the first adaptor or a fragment thereof;
g. hybridising a single primer to the copy of the adaptor and extending each of the extension products using the single primer, where steps e and g occur simultaneously or sequentially; and
h. repeating steps e, f and g one or more times.
2. The method according to claim 1 comprising;
a. treating a nucleic acid sample with bisulfite to obtain a population of single stranded nucleic acid molecules,
b. attaching a first oligonucleotide adaptor to the 3' end of each of the strands in the sample,
c. making the population of single stranded molecules double stranded,
d. attaching a second adaptor to the double stranded molecules,
e. amplifying the double stranded molecules,
f. denaturing the double stranded molecules,
g. hybridising a population of locus specific primers to a subset of the denatured molecules,
h. extending the hybridised locus specific primers to produce extension products where each extension product includes a copy of the first adaptor; and
i. amplifying the extension products.
3. The method according to claim 1 comprising;
a. treating a nucleic acid sample with bisulfite to obtain a population of single stranded nucleic acid molecules;
b. attaching an oligonucleotide adaptor to the 3' end of each of the strands in the sample;
c. hybridising a population of locus specific primers to a subset of the fragments or a copy thereof;
d. extending the hybridised locus specific primers to produce extension products which are a subset of the fragments where each extension product includes a copy of the adaptor;
e. hybridising a single primer to the copy of the adaptor and extending each of the extension products using the single primer, where steps d and e occur simultaneously or sequentially; and
f. repeating steps d and e one or more times.
4. A method according to claim 3; the method comprising;
a. fragmenting the sample to produce a population of single stranded sample fragments;
b. attaching an oligonucleotide adaptor to the 3' end of each of the strands in the sample wherein the adaptor is a hairpin containing a cleavage site;
c. extending the 3' end of the hairpin to make the fragments double stranded;
d. cleaving the hairpin to produce single stranded copies of the fragments having an adaptor fragment at one end, where the adaptor fragment is a portion of the hairpin;
e. hybridising a population of locus specific primers to a subset of the fragment copies;
f. extending the hybridised locus specific primers to produce extension products which are a subset of the fragments where each extension product includes a copy of the adaptor fragment;
g. hybridising a single primer to the copy of the adaptor fragments and extending each of the extension products using the single primer, where steps f and g occur simultaneously or sequentially; and
h. repeating steps f and g one or more times.
5. The method according to any one preceding claim wherein the population of locus specific primers is at least 50 different sequences.
6. The method according to any one preceding claim wherein each locus specific primer has a universal region common to all locus specific primers.
7. The method according to claim 6 wherein the universal region common to all locus specific primers and the second adaptor are of different sequence and are non-complementary.
8. The method according to any one preceding claim wherein the adaptor contains one or more uracil bases.
9. The method according to any one of claims 1 to 8 wherein the adaptor is attached using a ligase.
10. The method according to any one of claims 1 to 8 wherein the adaptor is attached using a polymerase.
11. The method according to claim 10 wherein the polymerase is a template independent polymerase.
12. The method according to claim 11 wherein the template independent polymerase is terminal transferase (TdT).
13. The method according to any one of claims 10 to 12 wherein the adaptor comprises a triphosphate moiety.
14. The method according to any one preceding claim wherein the extension products are denatured after the extension and copies thereof produced.
15. The method according to any one preceding claim wherein sequencing is carried out on the amplified mixture.
16. The method according to any one preceding claim wherein the sample is genomic DNA.
17. The method according to any one preceding claim wherein the sample is RNA.
18. The method according to any one preceding claim wherein the sample is cell free nucleic acids.
19. The method according to any one preceding claim wherein the sample is both DNA and RNA.
20. A kit for use in selecting fragments from a nucleic acid sample, the kit comprising a plurality of locus specific primers, a hairpin adaptor and a primer complementary to a portion of the hairpin adaptor or a copy thereof.
21. The kit according to claim 20 wherein the hairpin polynucleotide has a triphosphate moiety at the 5' -end.
22. The kit according to claims 20 or 21 wherein each member of the population of locus specific primers contains a common universal sequence and one of a plurality of locus-specific regions wherein each locus specific region contains only the nucleic acid bases A, G and T such that the locus is complementary to the copies of a sample treated with bisulfite.
23. The kit according to claim 21 or claim 22 wherein the kit further comprises two template independent polymerases.
PCT/GB2016/051475 2015-05-22 2016-05-23 Nucleic acid sample enrichment WO2016189288A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
GB1508859.4 2015-05-22
GBGB1508859.4A GB201508859D0 (en) 2015-05-22 2015-05-22 Nucleic acid sample enrichment
GB1514532.9 2015-08-14
GBGB1514532.9A GB201514532D0 (en) 2015-08-14 2015-08-14 Nucleic acid sample enrichment
GBGB1522270.6A GB201522270D0 (en) 2015-12-17 2015-12-17 Nucleic acid sample enrichment
GB1522270.6 2015-12-17

Publications (1)

Publication Number Publication Date
WO2016189288A1 (en) 2016-12-01

Family

ID=56084174

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2016/051475 WO2016189288A1 (en) 2015-05-22 2016-05-23 Nucleic acid sample enrichment

Country Status (1)

Country Link
WO (1) WO2016189288A1 (en)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16725565

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16725565

Country of ref document: EP

Kind code of ref document: A1