WO2012116661A1 - DNA tags and uses thereof (Dna标签及其应用) - Google Patents

Info

Publication number
WO2012116661A1
Authority
WO
WIPO (PCT)
Prior art keywords
dna
tag
sequencing
present
library
Prior art date
Application number
PCT/CN2012/071893
Other languages
English (en)
French (fr)
Inventor
刘琳
何毅敏
杨焕明
Original Assignee
深圳华大基因科技有限公司
深圳华大基因研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因科技有限公司, 深圳华大基因研究院
Publication of WO2012116661A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1065: Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags

Definitions

  • the invention relates to the field of nucleic acid sequencing technology, in particular to the field of multiplex nucleic acid sequencing technology.
  • the present invention relates to DNA tags and uses thereof, and more particularly to DNA tags, oligonucleotides, DNA tag libraries and methods for their preparation, a method for determining sequence information of a DNA sample, a method for determining sequence information of a plurality of DNA samples, and a kit for constructing a DNA tag library.
  • Background art
  • Multiplex sequencing technology adds a DNA tag (also called an index), which identifies the source of the sample, between the linker sequence and the insert of each library. Multiple library samples are then mixed and sequenced using a tag-specific primer to determine the sequence of each library and its corresponding DNA tag, and the libraries are distinguished by their tag sequences to determine the DNA sequence of each library. Multiplex sequencing thus allows multiple library samples to be sequenced simultaneously, avoiding waste of sequencing resources.
  • the current multiplex nucleic acid sequencing technology generally uses DNA tags of the same length when different libraries are mixed and sequenced simultaneously. Base bias within such tags often causes the light intensity parameters detected for the insert fragments to fluctuate, which degrades the quality of the output data: the results become unreliable, do not truly reflect the sample, and show low repeatability.
  • the present invention solves at least one of the technical problems existing in the prior art.
  • the present invention provides a set of DNA tags (herein also simply referred to as "tags") that can be used for multiplex nucleic acid sequencing, and their applications.
  • the invention provides a set of isolated DNA tags.
  • the above DNA tags can be effectively applied to current multiplex nucleic acid sequencing technology; that is, DNA tag libraries of a plurality of samples (sometimes referred to herein as "tag libraries") can be constructed simultaneously using the above DNA tags.
  • DNA tag libraries derived from different samples can be mixed and sequenced together, and the DNA sequences obtained can then be classified based on the DNA tags to yield DNA sequence information for each of the samples. This makes full use of high-throughput sequencing technology, such as Solexa sequencing, by sequencing multiple DNA tag libraries simultaneously, thereby improving the efficiency and throughput of DNA tag library sequencing.
  • DNA tag libraries of various samples can be constructed using DNA tags according to embodiments of the present invention and subjected to multiplex nucleic acid sequencing, in which the DNA tag libraries are accurately distinguished and data output bias is effectively reduced; the resulting sequencing data are reliable, stable and repeatable.
  • the present invention provides a set of isolated oligonucleotides for introducing the above DNA tag into sample DNA or an equivalent thereof, thereby being applicable to multiplex nucleic acid sequencing technology.
  • a set of isolated oligonucleotides according to an embodiment of the invention has a first strand and a second strand, the strands consisting respectively of the nucleotides represented by SEQ ID NO: (3N-1)
  • these oligonucleotides each carry a DNA tag according to an embodiment of the present invention as described above, and thus the corresponding DNA tag can be introduced into the DNA or its equivalent by a ligation reaction.
  • the first strand (also referred to herein as the "sense sequence") and the second strand (also referred to herein as the "antisense sequence") are named IndexN adapter F and IndexN adapter R, respectively,
  • where N is any integer from 1 to 6; their sequences are shown in Table 1 below (all sequences in the table are written in the 5' to 3' direction).
  • the corresponding DNA tag linker can be formed by subjecting the sense sequence IndexN adapter F and its corresponding antisense sequence IndexN adapter R to an equimolar annealing treatment.
  • the DNA tag is introduced into the DNA of the sample or its equivalent, whereby a DNA tag library having the DNA tag of the present invention can be constructed, thereby enabling accurate and efficient multiplex nucleic acid sequencing.
  • the inventors have surprisingly found that when DNA tag libraries containing the various DNA tags are constructed for the same sample using oligonucleotides carrying different tags, the resulting sequencing data are very stable and reproducible.
  • the invention also provides a set of isolated PCR tag primers for introducing the above DNA tag into a sample DNA or the like.
  • a set of isolated PCR tag primers according to embodiments of the invention has a first strand and a second strand, wherein the first strand consists of the nucleotides set forth in SEQ ID NO: 39 and the nucleotide sequence of the second strand is CAAGCAGAAGACGGCATACGAGATXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT, wherein XXXXX is one of a set of isolated DNA tags in accordance with an embodiment of the present invention.
  • the PCR tag primers of the set each carry a DNA tag according to an embodiment of the present invention as described above. By a PCR reaction using a PCR tag primer, the primer, and with it the corresponding DNA tag, is introduced into the DNA of the sample or its equivalent, yielding a DNA tag library containing that DNA tag. The DNA tag libraries of a plurality of samples can then be mixed and sequenced; that is, the PCR tag primers of the present invention can be effectively applied to multiplex nucleic acid sequencing technology.
  • the inventors have surprisingly found that when DNA tag libraries containing the various DNA tags are constructed separately for the same sample using PCR tag primers carrying different tags, the stability and reproducibility of the resulting sequencing data are very good.
  • the present invention provides a method of constructing a DNA tag library.
  • the method comprises the steps of: fragmenting a DNA sample to obtain DNA fragments; end-repairing the resulting DNA fragments to obtain end-repaired DNA fragments; adding a base A to the 3' end of the end-repaired DNA fragments to obtain DNA fragments having a base A at the 3' end; ligating the DNA fragments having a base A at the 3' end to a DNA tag linker to obtain a ligation product, wherein the DNA tag linker comprises one of the aforementioned set of isolated DNA tags according to an embodiment of the present invention; amplifying the resulting ligation product to obtain an amplification product; and isolating and recovering the amplification product, the amplification product constituting a DNA tag library.
  • the method of constructing a DNA tag library according to an embodiment of the present invention can be effectively applied to multiplex nucleic acid sequencing technology. Specifically, the method effectively introduces a DNA tag according to an embodiment of the present invention into the DNA tag library constructed for the sample DNA; sequencing the DNA tag library then yields both the sequence information of the sample DNA and the sequence information of the DNA tag, so that the source of the sample DNA can be distinguished.
  • the inventors have surprisingly found that when DNA tag libraries constructed by the above method with different DNA tags of the embodiments of the present invention are applied to multiplex nucleic acid sequencing technology, data output bias is effectively reduced compared with conventional multiplex nucleic acid sequencing technology, and the resulting sequencing data are reliable, stable and repeatable.
  • the present invention provides a DNA tag library.
  • the DNA tag library is obtained by a method of constructing a DNA tag library according to an embodiment of the present invention.
  • the inventors have found that the DNA tag library of the present invention can be effectively applied to a high-throughput sequencing platform, thereby efficiently obtaining accurate and reliable sequencing data, and the resulting sequencing data is stable and reproducible.
  • the invention provides a method of determining DNA sample sequence information.
  • the method comprises the steps of: constructing a DNA tag library of the DNA sample according to the method of constructing a DNA tag library according to an embodiment of the present invention; and sequencing the DNA tag library to determine the sequence information of the DNA sample. Based on this method, the sequence information of the DNA sample in the DNA tag library and the sequence information of the DNA tag can be efficiently obtained, so that the source of the DNA sample can be distinguished. Further, the inventors have surprisingly found that determining DNA sample sequence information by the method according to an embodiment of the present invention effectively reduces data output bias and allows a plurality of DNA tag libraries to be accurately distinguished.
  • the present invention also provides a method of determining a plurality of DNA sample sequence information.
  • the method comprises the steps of: for each of the plurality of samples, independently constructing a DNA tag library of the DNA sample according to the method of constructing a DNA tag library according to an embodiment of the present invention, wherein different DNA samples use DNA tags of different, known sequences; combining the DNA tag libraries of the plurality of samples to obtain a DNA tag library mixture; sequencing the DNA tag library mixture to obtain the sequence information of the DNA samples and the sequence information of the tags; and classifying the sequence information of the DNA samples based on the sequence information of the tags to determine the DNA sequence information of the plurality of samples.
  • the method according to an embodiment of the present invention can make full use of high-throughput sequencing technology, for example Solexa sequencing technology, by sequencing the DNA tag libraries of a plurality of samples simultaneously. This improves the efficiency and throughput of DNA tag library sequencing and the efficiency of determining the sequence information of a plurality of DNA samples; compared with conventional multiplex nucleic acid sequencing technology, it also effectively reduces data output bias, so the sequencing data are more accurate and reliable.
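  • the classification of pooled reads by tag can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method of the invention: the three tags are hypothetical placeholders of 6, 7 and 8 bases (they are not the Table 1 sequences), each read is assumed to begin with its tag followed directly by the insert, and the names `TAGS`, `assign_sample` and `demultiplex` are invented for this example.

```python
from collections import defaultdict

# Hypothetical tags of gradient length (6, 7 and 8 bases); NOT the Table 1 sequences.
TAGS = {
    "sample_A": "ACGTCA",
    "sample_B": "TGCAGTC",
    "sample_C": "GATCTGAC",
}


def assign_sample(read, tags=TAGS):
    """Return (sample, insert) if the read starts with a known tag, else (None, read)."""
    # Try longer tags first so a short tag can never shadow a longer one.
    for sample, tag in sorted(tags.items(), key=lambda kv: -len(kv[1])):
        if read.startswith(tag):
            return sample, read[len(tag):]
    return None, read


def demultiplex(reads, tags=TAGS):
    """Group the insert portion of each pooled read by the sample its tag identifies."""
    bins = defaultdict(list)
    for read in reads:
        sample, insert = assign_sample(read, tags)
        bins[sample].append(insert)
    return dict(bins)


# Assumed layout of every pooled read: tag first, insert immediately after.
pooled = ["ACGTCATTTGGA", "TGCAGTCCCATG", "GATCTGACAAAC", "NNNNNNNNGGGG"]
result = demultiplex(pooled)
```

  • in this sketch, reads whose start matches no tag are collected under the key `None` rather than discarded, so the share of unassigned reads stays visible.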
  • the present invention provides a kit for constructing a DNA tag library.
  • with the kit, a DNA tag according to an embodiment of the present invention can be conveniently introduced into the constructed DNA tag library.
  • the kit according to an embodiment of the present invention can be effectively applied to a multiplex nucleic acid sequencing technique.
  • the invention proposes the use of a set of tags of different lengths, preferably gradient lengths, for the construction and/or sequencing of a sequencing tag library, wherein each tag is an oligonucleotide sequence, preferably a nucleotide sequence of 2-100 bp.
  • the tag is included in a PCR primer for amplifying a sequence of interest to form the corresponding PCR tag primer, and is introduced into the sequence to be sequenced by PCR; the PCR tag primer is used as the 5' primer of the PCR, as the 3' primer, or as both the 5' primer and the 3' primer.
  • the tag is embedded in a PCR primer for amplifying a sequence of interest, or is joined, with or without an intervening linker sequence, to the 5' end or the 3' end of the PCR primer for amplifying a sequence of interest, thereby forming the corresponding PCR tag primer.
  • the tag is included in a linker of a tag library to form the corresponding tag linker, the tag linker being used as the 5' linker of the tag library, as the 3' linker, or as both the 5' linker and the 3' linker.
  • the tag is inserted into the linker, or is joined to the end of the linker with or without an intervening linker sequence, preferably without one, thereby forming the corresponding tag linker.
  • the tag constitutes a tag PCR primer and a tag linker, and is used for the construction and/or sequencing of a sequence tag library.
  • the invention proposes a set of tags of different lengths, preferably gradient lengths, for use in the construction and/or sequencing of sequencing tag libraries.
  • the present invention proposes a set of gradient tags comprising or consisting of: at least 2, or at least 3, or at least 4, or at least 5, or all 6 of the 6 gradient tags shown in Table 1 or of gradient tags differing from them by 1 base.
  • the set of gradient tags preferably includes at least Index1 and Index2, or Index3 and Index4, or Index5 and Index6 of the six gradient tags shown in Table 1, or a combination of any two or more of them.
  • the one base difference comprises one base substitution, addition or deletion in the sequence of the six gradient tags shown in Table 1.
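  • the set of sequences that differ from a given tag by one base (one substitution, addition or deletion) can be enumerated with a short Python sketch. The function name `one_base_variants` and the example tag are hypothetical and chosen only for illustration; the tag is not taken from Table 1.

```python
def one_base_variants(tag, alphabet="ACGT"):
    """Return every sequence that differs from `tag` by exactly one
    base substitution, addition (insertion) or deletion."""
    variants = set()
    for i in range(len(tag)):
        # Deletion of the base at position i.
        variants.add(tag[:i] + tag[i + 1:])
        # Substitution of the base at position i by each other base.
        for base in alphabet:
            if base != tag[i]:
                variants.add(tag[:i] + base + tag[i + 1:])
    for i in range(len(tag) + 1):
        # Addition of one base before position i (or at the end).
        for base in alphabet:
            variants.add(tag[:i] + base + tag[i:])
    variants.discard(tag)  # the unchanged tag is not a variant
    return variants


example_tag = "ACGTCA"  # hypothetical 6-base tag
variants = one_base_variants(example_tag)
```

  • a set is used because different edits can produce the same sequence (for example, inserting an A before or after an existing A), and each resulting sequence should be counted once.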
  • the invention proposes the use of the gradient tags described above for the construction and/or sequencing of a sequencing tag library, in particular an Illumina/Solexa sequencing tag library, wherein the gradient tag is contained in the 5' end of linker 2 of the sequencing tag library, in particular of the Illumina/Solexa sequencing tag library, to form the corresponding gradient tag linker 2, which serves as the 3' linker of the sequencing tag library, in particular of the Illumina/Solexa sequencing tag library.
  • containing the gradient tag in the 5' end of linker 2 includes joining the gradient tag to the 5' end of linker 2 with or without an intervening linker sequence, or inserting it into the 5' end of linker 2; preferably, the tag is joined to the 5' end of linker 2 without an intervening linker sequence.
  • the present invention proposes a sequencing tag library constructed using the aforementioned method, in particular
  • the invention proposes a set of gradient tag linkers 2 which comprise the gradient tags described above at the 5' end and are preferably used as the 3' linker of a sequencing tag library, in particular of the Illumina/Solexa sequencing tag library. The set of gradient tag linkers 2 comprises or consists of: at least 2, or at least 3, or at least 4, or at least 5, or all 6 of the 6 gradient tag linkers 2 shown in Table 1 or of linkers differing from the gradient tag sequences contained therein by 1 base.
  • the set of gradient tag linkers 2 preferably includes at least Index1 adapter2 F/R and Index2 adapter2 F/R, or Index3 adapter2 F/R and Index4 adapter2 F/R, or Index5 adapter2 F/R and Index6 adapter2 F/R of the six gradient tag linkers 2 shown in Table 1, or a combination of any two or more of them.
  • the invention proposes a gradient tag adaptor 2, wherein said one base difference comprises a substitution, addition or deletion of one base in the gradient tag sequence.
  • the invention proposes the use of the gradient tag linker 2 for the construction and/or sequencing of a sequencing tag library, in particular an Illumina/Solexa sequencing tag library, the gradient tag linker 2 being used as the 3' linker of the sequencing tag library, in particular of the Illumina/Solexa sequencing tag library.
  • the invention proposes a sequencing tag library constructed using the gradient tag linker 2, in particular an Illumina/Solexa sequencing tag library, wherein the gradient tag linker 2 is used as the 3' linker of the sequencing tag library, in particular of the Illumina/Solexa sequencing tag library.
  • the present invention provides a method of constructing a sequencing tag library, in particular an Illumina/Solexa sequencing tag library, characterized in that a set of tag linkers carrying tags of different lengths, preferably gradient lengths, is used as the 3' linker of the sequencing tag library, in particular of the Illumina/Solexa sequencing tag library.
  • a method of constructing a library of sequencing tags comprises:
  • Fragmenting DNA: breaking the DNA by mechanical means, including but not limited to Bioruptor, Hydroshear and Covaris, to produce DNA fragments with sticky ends;
  • End repair: filling in the sticky ends of the DNA fragments by a ligation reaction
  • the method is characterized in that different gradient tag linkers 2 selected from Table 1, or linkers differing from the gradient tag sequence contained therein by 1 base, are used as the 3' linker of the sequencing tag library, in particular of the Illumina/Solexa sequencing tag library.
  • a method of constructing a library of sequencing tags comprises:
  • n being an integer from 1 to 6, preferably from 2 to 6, the genomic DNA sample being from any eukaryotic sample, including but not limited to human genomic DNA samples;
  • Fragmenting DNA: breaking the DNA by mechanical means, including but not limited to Bioruptor, Hydroshear and Covaris, to produce DNA fragments with sticky ends;
  • End repair: filling in the sticky ends of the DNA fragments by a ligation reaction
  • the linker 1 comprises: 5'-TACACTCTTTCCCTACACGACGCTCTTCCGATCTATCACT (SEQ ID NO: 37) and 5'-/GTGATAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC (SEQ ID NO: 38).
  • the gradient tag linker 2 comprises at least 2, or at least 3, or at least 4, or at least 5, or all 6 of the 6 gradient tag linkers 2 shown in Table 1 or of linkers differing from the gradient tag sequence contained therein by 1 base.
  • the set of gradient tag linkers 2 preferably includes at least Index1 adapter2 F/R and Index2 adapter2 F/R, or Index3 adapter2 F/R and Index4 adapter2 F/R, or Index5 adapter2 F/R and Index6 adapter2 F/R of the six gradient tag linkers 2 shown in Table 1, or a combination of any two or more of them.
  • the one base difference comprises one base substitution, addition or deletion in the gradient tag sequence.
  • PCR in step 6 uses the following PCR primers:
  • the recovery of the library of the desired fragment in step 6) is carried out by agarose gel electrophoresis and gel recovery.
  • the sequencing method used in sequencing using the sequencing technique includes when the DNA is constructed
  • ACACTCTTTCCCTACACGACGCTCTTCCGATCT.
  • the present invention also proposes a sequencing tag library constructed according to the foregoing method, in particular
  • Figure 1 is a schematic flow chart showing a method of constructing a DNA tag library according to an embodiment of the present invention
  • Figure 2 is a schematic diagram showing the reading order of each sequencing reaction and the composition of the read sequence when a DNA tag library is sequenced on the Illumina/Solexa sequencing platform according to an embodiment of the present invention
  • Figure 3 shows a comparison of the quality values of the first 10 Illumina/Solexa sequencing cycles of the DNA tag library of the present invention and of the control DNA tag library according to an embodiment of the present invention, wherein
  • Figure 3A shows the quality values of the first 10 Illumina/Solexa sequencing cycles of the DNA tag library of the invention of Example 1.
  • Figure 3B shows the quality values of the first 10 Illumina/Solexa sequencing cycles of the control DNA tag library
  • Figure 4 is a graph comparing the light intensities of each Illumina/Solexa sequencing cycle of the DNA tag library of the present invention and of the control DNA tag library according to an embodiment of the present invention, wherein
  • Figure 4A shows the average light intensity signal of each Illumina/Solexa sequencing cycle of the DNA tag library of the present invention of Example 1.
  • Figure 4B shows the average light intensity signal of each Illumina/Solexa sequencing cycle of the control DNA tag library
  • Figure 5 is a graph comparing the base distributions of the Illumina/Solexa sequencing cycles of the DNA tag library of the present invention and of the control DNA tag library according to an embodiment of the present invention, wherein
  • Figure 5A is a graph showing the percentage distribution of bases of the Illumina/Solexa sequencing cycles of the DNA tag library of the present invention of Example 1.
  • Figure 5B is a graph showing the percentage distribution of bases of the Illumina/Solexa sequencing cycles of the control DNA tag library
  • Figure 6 is a graph comparing the error rates of the Illumina/Solexa sequencing cycles of the DNA tag library of the present invention and of the control DNA tag library according to an embodiment of the present invention, wherein
  • Figure 6A is a graph showing the error rate of the Illumina/Solexa sequencing cycles of the DNA tag library of the present invention of Example 1.
  • Figure 6B is a graph showing the error rate of the Illumina/Solexa sequencing cycles of the control DNA tag library.
  • the invention provides a set of isolated DNA tags.
  • DNA as used in the present invention may be any polymer comprising deoxyribonucleotides, including but not limited to modified or unmodified DNA.
  • with a DNA tag according to an embodiment of the present invention, a tagged DNA tag library is obtained by linking the DNA tag to the DNA of the sample or its equivalent, and the sequence of the sample DNA and the sequence of the tag can be obtained by sequencing the DNA tag library. Further, based on the sequence of the tag, the sample source of the DNA can be accurately identified.
  • a DNA tag library of a plurality of samples can be simultaneously constructed, and the DNA sequence of the sample can be classified based on the DNA tag by mixing and simultaneously sequencing the DNA tag library derived from different samples.
  • the expression "the DNA tag is linked to the DNA of the sample or its equivalent" should be understood broadly: the DNA tag may be linked directly to the DNA of the sample to construct a DNA tag library, or it may be linked to a nucleic acid having the same sequence as the DNA of the sample (for example, the corresponding RNA sequence or cDNA sequence, which has the same sequence as the DNA).
  • the inventors of the present application found the following. To design effective DNA tags, one must first consider the degree of sequence difference between the tag sequences and the base recognition rate. Second, when tags for fewer than 6 samples are mixed, the GT content at each base position across the mixed tags must be considered: in the Solexa sequencing process, bases G and T share the same excitation light, as do bases A and C, so the content of "GT" bases must be balanced against the content of "AC" bases. A "GT" content of 50% is optimal and ensures the highest tag recognition rate and the lowest error rate. Finally, the repeatability and accuracy of the data output must be considered.
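  • the GT balance at each base position of a tag mix can be checked with a few lines of Python. The sketch below is illustrative only: `gt_fraction_per_position` is a hypothetical helper name, and the four 6-base tags are invented for the example (they are not the Table 1 sequences). Positions beyond the end of a shorter tag are simply skipped for that tag, which is one possible convention when tags of different lengths are mixed.

```python
def gt_fraction_per_position(tags):
    """Fraction of tags carrying G or T at each position (sequencing cycle).

    Tags shorter than the current position are left out of that position's count.
    """
    longest = max(len(tag) for tag in tags)
    fractions = []
    for i in range(longest):
        bases = [tag[i] for tag in tags if i < len(tag)]
        fractions.append(sum(base in "GT" for base in bases) / len(bases))
    return fractions


# Hypothetical tag mix, invented for illustration.
hypothetical_tags = ["ACGTCA", "GTACTG", "CATGAC", "TGCAGT"]
balance = gt_fraction_per_position(hypothetical_tags)

# Positions whose GT content deviates from the 50% target named in the text.
unbalanced = [i for i, f in enumerate(balance) if f != 0.5]
```

  • for this invented mix every position carries two G/T bases and two A/C bases, so `unbalanced` is empty; swapping one tag for another would show which cycles lose the balance.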
  • in order to construct and sequence DNA tag libraries efficiently, a set of DNA tags must give reliable results and high reproducibility: for the same DNA sample, DNA tag libraries constructed using different tags of the set must yield consistent sequencing results. In addition, runs of 3 or more consecutive identical bases in a tag sequence must be avoided, because they increase the error rate of the sequence during synthesis or sequencing, and hairpin structures formed by the DNA tag linker or the PCR tag primer itself should be avoided as far as possible.
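  • two of the design rules above, no run of 3 or more identical bases and no hairpin-forming sequence, lend themselves to simple automated screening. The Python sketch below is a rough illustration, not the screening procedure used by the inventors: the helper names are hypothetical, and the hairpin test (a 4-base stem separated by a loop of at least 3 bases) uses thresholds assumed for this example; real secondary-structure prediction would rely on thermodynamic models.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_homopolymer_run(seq, min_run=3):
    """True if any base occurs min_run or more times in a row."""
    return re.search(r"(.)\1{%d,}" % (min_run - 1), seq) is not None


def has_hairpin_stem(seq, stem=4, min_loop=3):
    """True if some stem-length segment can pair with a segment further
    downstream, leaving at least min_loop unpaired bases in between.

    The stem and loop sizes are assumptions made for this sketch.
    """
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        partner = reverse_complement(seq[i:i + stem])
        if partner in seq[i + stem + min_loop:]:
            return True
    return False


def passes_screen(seq):
    """Combine both checks into a single pass/fail result."""
    return not has_homopolymer_run(seq) and not has_hairpin_stem(seq)
```

  • a candidate such as the hypothetical tag `ACGTCA` passes both checks, whereas `ACGGGT` fails the homopolymer check and `GGGCAAAAGCCC` fails both.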
  • the inventors of the present application carried out extensive screening and selected a set of isolated DNA tags according to an embodiment of the present invention, which consist respectively of the nucleotides represented by SEQ ID NO: (3N-2)
  • their sequences are as shown in Table 1 above and are not described again here.
  • the set of isolated DNA tags is a set of gradient tags. Specifically, Index1 is 6 bp, Index2 is 7 bp and Index3 is 8 bp, so Index1-3 form a gradient with 1 bp increments; similarly, Index4 is 6 bp, Index5 is 7 bp and Index6 is 8 bp, so Index4-6 form a gradient with 1 bp increments. This gradient effectively reduces base bias in the tags and, when the tags are applied to multiplex nucleic acid sequencing technology, effectively reduces the fluctuation of the light intensity parameters during sequencing. It thereby significantly improves the quality of the output data and, further, the reliability, stability and repeatability of the sequencing data.
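  • one way to picture why staggered tag lengths can even out the base composition per sequencing cycle is the toy Python model below. It rests on assumptions that are not stated in the source: every read is modeled as a tag followed by the same constant downstream sequence (`DOWNSTREAM` is invented), the tags are hypothetical rather than taken from Table 1, and base composition per cycle is used as a stand-in for the light intensity behaviour described in the text.

```python
from collections import Counter


def max_base_fraction(reads, cycles):
    """For each of the first `cycles` cycles, return the share of reads that
    show the most common base; 1.0 means every read shows the same base."""
    fractions = []
    for i in range(cycles):
        counts = Counter(read[i] for read in reads if i < len(read))
        total = sum(counts.values())
        fractions.append(max(counts.values()) / total if total else 0.0)
    return fractions


# Hypothetical constant sequence assumed to follow the tag in every read.
DOWNSTREAM = "TGATC"

# Hypothetical tags (not the Table 1 sequences).
equal_length_tags = ["ACGTCA", "GTACTG", "CATGAC"]   # all 6 bases
gradient_tags = ["ACGTCA", "GTACTGA", "CATGACTG"]    # 6, 7 and 8 bases

equal_reads = [tag + DOWNSTREAM for tag in equal_length_tags]
gradient_reads = [tag + DOWNSTREAM for tag in gradient_tags]

equal_profile = max_base_fraction(equal_reads, 8)
gradient_profile = max_base_fraction(gradient_reads, 8)
```

  • with equal-length tags the constant sequence starts at the same cycle in every read, so from cycle 7 onward a single base dominates completely; with the 6/7/8-base gradient the constant sequence is offset between libraries and no cycle in this toy example is monopolized by one base.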
  • the DNA tags differ from one another by 5 or more bases, so a sequencing error or synthesis error in any single base of a DNA tag does not affect the final recognition of the tag.
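  • the tolerance to a single-base error can be illustrated with an edit-distance check in Python. This sketch makes two assumptions that go beyond the source: the "difference" between tags of unequal length is measured as Levenshtein distance (substitutions, additions and deletions), and an observed tag is assigned to the unique known tag within one edit. All function names and example tags are hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance: the fewest single-base substitutions,
    additions or deletions that turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, 1):
        curr = [i]
        for j, base_b in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                         # deletion
                            curr[j - 1] + 1,                     # addition
                            prev[j - 1] + (base_a != base_b)))   # substitution or match
        prev = curr
    return prev[-1]


def min_pairwise_distance(tags):
    """Smallest edit distance between any two tags of the set."""
    return min(edit_distance(a, b) for a, b in combinations(tags, 2))


def closest_tag(observed, tags, max_errors=1):
    """Return the unique tag within max_errors edits of `observed`, else None."""
    hits = [tag for tag in tags if edit_distance(observed, tag) <= max_errors]
    return hits[0] if len(hits) == 1 else None


# Hypothetical tags (not the Table 1 sequences).
known_tags = ["ACGTCA", "TGCAGTC"]
recovered = closest_tag("ACGTCT", known_tags)  # one substitution away from ACGTCA
```

  • if every pair of tags is at least 5 edits apart, a tag carrying one error is still at least 4 edits away from every other tag, so the assignment within one edit is unambiguous.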
  • These tags can be applied to the construction of any DNA tag library. There are currently no reports of these tags being used for DNA sample library construction and sequencing by Solexa technology.
  • the set of DNA tags used consists of: at least 2, or at least 3, or at least 4, or at least 5, or all 6 of the DNA tags shown in Table 1 or of DNA tags differing from them by 1 base.
  • the set of DNA tags preferably includes at least Index1 and Index2, or Index3 and Index4, or Index5 and Index6 of the six DNA tags shown in Table 1, or a combination of any two or more of them.
  • the one base difference comprises one base substitution, addition or deletion in the sequence of the six DNA tags shown in Table 1.
  • the present invention also provides the use of a DNA tag according to an embodiment of the present invention in a multiplex nucleic acid sequencing technique, in particular, the use of a DNA tag in the construction and sequencing of a DNA tag library.
  • the DNA tag of the present invention can be contained in a linker of a DNA tag library to constitute the corresponding DNA tag linker, which is used as the 5' linker of the DNA tag library, as the 3' linker, or as both the 5' linker and the 3' linker, whereby the DNA tag can be introduced into the DNA tag library by a ligation reaction.
  • the DNA tag is inserted into the end of the DNA tag linker, or is joined to the end of the DNA linker with or without an intervening linker sequence, preferably without one, thereby constituting the corresponding DNA tag linker.
  • the DNA tag of the present invention can be contained in a PCR primer for amplifying a sequence of interest to constitute the corresponding PCR tag primer, which is used as the 5' primer of a PCR reaction, as the 3' primer, or as both the 5' primer and the 3' primer, whereby the DNA tag can be introduced into a DNA tag library by PCR.
  • the DNA tag is embedded in the PCR tag primer, or is joined to the 5' end or the 3' end of the PCR primer for amplifying the sequence of interest with or without an intervening linker sequence, thereby constituting the corresponding PCR tag primer.
  • the present invention provides the use of the aforementioned DNA tag for the construction and/or sequencing of a DNA tag library, particularly an Illumina/Solexa sequencing DNA tag library.
  • the DNA tag is contained in the 5' end of the 3' linker of a DNA tag library, particularly of an Illumina/Solexa sequencing DNA tag library, thereby constituting the corresponding DNA tag linker, which is used as the 3' linker of the DNA tag library, particularly of the Illumina/Solexa sequencing DNA tag library.
  • the DNA tag is contained in the 5' end of the 3' linker of a DNA tag library, particularly of an Illumina/Solexa sequencing DNA tag library, wherein the DNA tag is joined to the 5' end of the 3' linker with or without an intervening linker sequence, or is inserted into the 5' end of the 3' linker.
  • the linker is a sequence of 1 - 10 bases, preferably a sequence of 1 - 5 bases, more preferably a sequence of 1 - 3 bases.
  • the invention provides a set of isolated oligonucleotides that can be used to introduce a DNA tag as described above into the DNA of a sample, thereby constructing a library of DNA tags.
  • the corresponding oligonucleotides can be formed by annealing the first strand and the second strand constituting the corresponding oligonucleotide, respectively.
  • the above oligonucleotides each have a DNA tag according to an embodiment of the present invention as described above, and thus the corresponding DNA tag can be introduced into the DNA of the sample or its equivalent by a ligation reaction.
  • the sequences of these oligonucleotides are as shown in Table 1 above, and are not described herein again.
  • the DNA tag linkers of the set comprise a DNA tag of the invention at the 5' end and preferably serve as the 3' linker of a DNA tag library, in particular of an Illumina/Solexa sequencing DNA tag library.
  • the set of DNA tag linkers comprises or consists of: at least 2, or at least 3, or at least 4, or at least 5, or all 6 of the 6 DNA tag linkers shown in Table 1 or of DNA tag linkers differing from the DNA tag sequence contained therein by 1 base.
  • these DNA tag linkers preferably include at least Index1 adapter F/R and Index2 adapter F/R, or Index3 adapter F/R and Index4 adapter F/R, or Index5 adapter F/R and Index6 adapter F/R of the six DNA tag linkers shown in Table 1, or a combination of any two or more of them.
  • the one base difference in the DNA tag linker described above includes one base substitution, addition or deletion in the sequence of the six DNA tags shown in Table 1.
  • the invention provides the use of a DNA tag linker for the construction and/or sequencing of a DNA tag library, in particular an Illumina/Solexa sequencing DNA tag library, wherein the DNA tag linker is used as the 3' linker of the DNA tag library, in particular of the Illumina/Solexa sequencing DNA tag library.
  • the present invention provides a set of isolated PCR tag primers which can be used to introduce a DNA tag as described above into the DNA of a sample, thereby constructing a DNA tag library.
  • the set of isolated PCR tag primers has a first strand and a second strand, wherein the first strand is composed of the nucleotides represented by SEQ ID NO: 39, and the nucleotide sequence of the second strand is as follows: CAAGCAGAAGACGGCATACGAGATXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT, wherein XXXXX is one of a group of isolated DNA tags according to an embodiment of the present invention.
  • the PCR tag primers of the present invention each have a DNA tag according to an embodiment of the present invention as described above, so a PCR tag primer can be introduced into the sample DNA or an equivalent thereof by a PCR reaction using that primer.
  • the corresponding DNA tag is thereby introduced into the DNA or its equivalent and hence into the DNA tag library, which enables multiple samples to be sequenced by pooling the DNA tag libraries of the samples and sequencing the mixture. This can effectively reduce the problem of data output bias, and the obtained sequencing data are reliable, stable, and repeatable.
  • the PCR tag primer of the present invention can be used as a 5' primer or a 3' primer for the PCR reaction, or as both the 5' primer and the 3' primer, thereby enabling introduction of a DNA tag into a DNA tag library by PCR.
  • a DNA tag library constructed using the above DNA tag linker and PCR tag primer is also provided.
  • the present invention also provides a method of constructing a DNA tag library using the above DNA tag linker.
  • the method may include:
  • a DNA sample is fragmented to obtain a DNA fragment.
  • the method of fragmenting a DNA sample is not particularly limited.
  • the fragmentation is carried out by at least one selected from the group consisting of an atomization method, an enzyme digestion method, and an ultrasonication method.
  • the DNA sample is fragmented using ultrasonic disruption.
  • the DNA sample is fragmented using a Covaris interrupter.
  • the obtained DNA fragment can be about 500 bp in length, whereby the efficiency of constructing a DNA tag library and subsequent sequencing can be further improved.
  • the source of the DNA sample is not particularly limited.
  • the DNA sample may be derived from any common biological sample, such as a plant, animal or microorganism.
  • the DNA sample may be derived from at least one selected from the group consisting of Arabidopsis thaliana, rice, human, mouse, and Escherichia coli.
  • a DNA tag library of a plurality of common model organisms can be efficiently constructed.
  • the DNA sample may be a human DNA sample, and more specifically, may be a human genomic DNA sample.
  • the DNA fragment is end-repaired to obtain a DNA fragment that has been repaired at the end.
  • the method of performing end repair of the DNA fragment is not particularly limited.
  • the end repair is carried out using a Klenow fragment, T4 DNA polymerase and T4 polynucleotide kinase, wherein the Klenow fragment has 5'→3' polymerase activity and 3'→5' exonuclease activity, but lacks 5'→3' exonuclease activity.
  • base A is added to the 3' end of the end-repaired DNA fragment to obtain a DNA fragment with base A added at the 3' end.
  • the method of adding the base A at the 3' end of the DNA fragment which has been subjected to end repair is not particularly limited.
  • Klenow (3'-5' exo-) can be used to add base A to the 3' end of the end-repaired DNA fragment.
  • base A is added to the 3' ends of both oligonucleotide strands of the end-repaired DNA fragment.
  • the DNA fragment with base A added at the 3' end is ligated to a DNA tag linker to obtain a ligation product, wherein the DNA tag linker comprises one of a group of isolated DNA tags according to an embodiment of the present invention.
  • the method of ligating the DNA fragment with base A added at the 3' end to the DNA tag linker is not particularly limited.
  • the ligation of the DNA fragment with base A added at the 3' end to the DNA tag linker is carried out using T4 DNA ligase.
  • the DNA tag linker may be one selected from the isolated oligonucleotides according to the embodiments of the present invention as described above. Then, the resulting ligation product is amplified to obtain an amplification product. According to an embodiment of the invention, prior to the amplification of the ligation product, the step of fragment selection of the ligation product is further included.
  • fragment selection can be performed using 2% agarose gel electrophoresis.
  • the length of the ligation product selected by the fragment may be about 620 bp.
  • the ligation product is amplified by a PCR reaction using primers having the nucleotide sequences shown in SEQ ID NO: 39 and SEQ ID NO: 40, respectively.
  • the obtained amplification product is separated and recovered, and the amplification product constitutes a DNA tag library.
  • the method for isolating and recovering the amplified product is also not particularly limited, and those skilled in the art can select an appropriate method and apparatus for separation according to the characteristics of the amplified product, for example, electrophoresis followed by recovery of PCR amplification products of a specific length.
  • the amplification product may be isolated by at least one selected from the group consisting of agarose gel electrophoresis, magnetic bead purification, and purification column purification.
  • the isolated amplified product may be about 620 bp in length.
  • a DNA tag according to an embodiment of the present invention can be efficiently introduced into a DNA tag library constructed for a DNA sample.
  • the DNA tag library can be sequenced to obtain sequence information of the DNA sample and sequence information of the DNA tag, thereby distinguishing the source of the DNA sample.
  • compared with conventional multiplex nucleic acid sequencing technology, the method for constructing a DNA tag library provided by the invention can make full use of a high-throughput sequencing platform, meets the requirements of high-throughput sequencing, and saves sequencing resources, thereby reducing the cost of sequencing; it can also effectively reduce the bias of data output, and the obtained sequencing data are reliable, stable, and reproducible.
  • the method of constructing a DNA tag library of the present invention may include:
  • genomic DNA sample can be from any eukaryotic sample, including but not limited to human genomic DNA samples;
  • Linkers are ligated to obtain a ligation product, wherein the DNA tag linker comprises a set of isolated DNA tags according to embodiments of the invention
  • the method for constructing a DNA tag library of the present invention may further comprise: 1) providing n genomic DNA samples, n being an integer with 1 ≤ n ≤ 6, preferably 2 ≤ n ≤ 6, wherein the genomic DNA sample can be from any eukaryotic sample, including but not limited to human genomic DNA samples;
  • the linker of the Illumina/Solexa sequencing library used in the method for constructing a DNA tag library of the present invention described above is a linker 1 whose sequence is as follows:
  • the DNA tag linker used in the above method for constructing a DNA tag library of the present invention comprises or consists of at least 2, or at least 3, or at least 4, or at least 5, or all 6 of the 6 DNA tag linkers shown in Table 1 or of DNA tag linkers whose DNA tag sequence differs from those by one base.
  • these DNA tag linkers preferably include at least Index1 adapter F/R and Index2 adapter F/R, or Index3 adapter F/R and Index4 adapter F/R, or Index5 adapter F/R and Index6 adapter F/R, of the 6 DNA tag linkers shown in Table 1, or a combination of any two or more of them.
  • the one base difference in the DNA tag linker described above includes a substitution, addition or deletion of one of the sequences of the six DNA tags shown in Table 1.
  • the PCR reaction in the above method for constructing a DNA tag library of the present invention employs the following PCR primers:
  • the method of constructing a DNA tag library of the present invention may further comprise:
  • a set of DNA tags of different lengths of the present invention may each be contained in a PCR primer for amplifying a sequence of interest, thereby constituting the corresponding PCR tag primers. In this case, the method for constructing the DNA tag library of the present invention may include: first, the total DNA sample is broken into DNA fragments of a certain length by a mechanical method or by enzymatic cleavage; after the DNA fragments are ligated to the linker, the ligation product containing the DNA fragment (that is, the above-mentioned sequence of interest) is amplified with the PCR tag primer pair; finally, the amplified product is recovered by agarose electrophoresis and gel cutting, thereby obtaining a DNA tag library composed of the amplified product.
  • a set of DNA tags of different lengths of the present invention can each be embedded in a linker (e.g., at its end) of an existing library to constitute the corresponding DNA tag linkers. In this case, the method for constructing the DNA tag library of the present invention may include: first, the total DNA sample is broken into DNA fragments of a certain length by a mechanical method or by enzymatic cleavage; after the DNA fragments are end-repaired and a sticky end is formed, they are ligated to the DNA tag linker to obtain a ligation product; the ligation product containing the DNA fragment (i.e., the sequence of interest described above) is then amplified with specific PCR primers; finally, the amplified product is recovered by agarose electrophoresis and gel cutting, thereby obtaining a DNA tag library consisting of the amplification products.
  • a set of DNA tags of different lengths of the present invention may each be contained in a PCR primer for amplifying a sequence of interest, thereby constituting the corresponding PCR tag primers, and at the same time a set of DNA tags of different lengths of the present invention may each be inserted into an adaptor (for example, at its end) of an existing library, thereby constituting the corresponding DNA tag linkers. In this case, the method for constructing the DNA tag library of the present invention may include: first, the total DNA sample is broken into DNA fragments of a certain length mechanically or enzymatically; the DNA fragments are then ligated to the DNA tag linker to obtain a ligation product; the ligation product containing the DNA fragment (i.e., the sequence of interest described above) is then amplified with the PCR tag primer pair; finally, the amplified product is recovered by agarose electrophoresis and gel cutting to obtain a DNA tag library composed of the amplified product.
  • in a method for constructing a DNA tag library, particularly an Illumina/Solexa sequencing DNA tag library, the DNA tag linker of the present invention shown in Table 1 is used as a linker of the DNA tag library, in particular of the Illumina/Solexa sequencing DNA tag library.
  • the present invention also provides a kit for constructing a DNA tag library.
  • a DNA tag according to an embodiment of the present invention can be conveniently introduced into a constructed DNA tag library.
  • a DNA tag library can also be included in the kit, and details are not described herein.
  • the present invention also provides a DNA tag library which is constructed in accordance with the method of the present invention.
  • the DNA tag library having the DNA tag according to an embodiment of the present invention can be effectively applied to high-throughput sequencing technologies such as Solexa technology; the obtained nucleic acid sequence information, such as DNA sequence information, can be accurately classified by sample source using the tag sequence, and the obtained sequencing data are accurate and reliable, with good stability and repeatability.
  • the DNA tag library of the invention employs a DNA tag linker as the 3' linker of the DNA tag library, in particular of the Illumina/Solexa sequencing tag library.
  • the present invention also provides a method of determining DNA sample sequence information.
  • the method comprises: constructing a DNA tag library according to a method for constructing a DNA tag library according to an embodiment of the present invention; and then, sequencing the constructed DNA tag library to determine sequence information of the DNA sample. Based on this method, the sequence information of the DNA sample in the DNA tag library and the sequence information of the DNA tag can be efficiently obtained, thereby distinguishing the source of the DNA sample.
  • the inventors have surprisingly found that the use of a method according to an embodiment of the present invention to determine DNA sample sequence information can effectively reduce the problem of data output bias, and can accurately distinguish a plurality of DNA tag libraries.
  • the constructed DNA tag library can be sequenced by any known method, and the type thereof is not particularly limited.
  • DNA tag libraries can be sequenced using Solexa sequencing technology.
  • suitable sequencing primers can be selected for sequencing according to specific conditions.
  • the present invention provides a method of determining sequence information for a plurality of DNA samples.
  • the method comprises the steps of: for each of the plurality of samples, separately constructing a DNA tag library of the DNA sample according to the method of constructing a DNA tag library of the embodiment of the present invention, wherein different DNA samples use DNA tags of different and known sequences; combining the DNA tag libraries of the samples to obtain a DNA tag library mixture; sequencing the DNA tag library mixture to obtain sequence information of the DNA samples and sequence information of the tags; and sorting the sequence information of the DNA samples based on the sequence information of the tags to determine DNA sequence information of the plurality of samples.
  • the term "plurality" means at least two.
  • the method of sequencing the DNA tag library mixture is not particularly limited.
  • the obtained DNA tag library mixture can be sequenced using Solexa sequencing technology to obtain sequence information of the DNA sample and sequence information of the tag.
  • the method according to an embodiment of the present invention can make full use of high-throughput sequencing technology, for example, using Solexa sequencing technology, and simultaneously sequencing DNA libraries of various samples, thereby improving the efficiency and throughput of DNA library sequencing. And can effectively reduce the problem of data output bias, the obtained sequencing data is accurate, reliable, stable and repeatable.
  • 1 to 2 ⁇ ⁇ human peripheral blood genomic DNA samples were taken for sample detection. Specifically, the concentration of the DNA sample, the ratio of OD260/280, and the ratio of OD260/230 were measured using NanoDrop 1000, and the integrity of the sample was detected by agarose gel electrophoresis, and after the test was passed, it was used.
  • Sample concentration: the concentration of the sample should not be lower than 100 ng/μl; sample purity: the OD260/280 ratio should be between 1.8 and 2.0, with no protein, polysaccharide or RNA contamination; sample integrity: electrophoresis results should show that the DNA sample is not degraded;
  • the total sample amount is required to be not less than 45 ⁇ ⁇ :
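The acceptance criteria above can be written down as a small check. This is only a sketch: the class and function names are illustrative, the degradation flag stands in for the operator's reading of the agarose gel, and the total-amount criterion is left out because its unit is not legible here.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Measurements for one genomic DNA sample (field names are illustrative)."""
    concentration_ng_per_ul: float  # e.g. from a NanoDrop reading
    od260_280: float                # purity ratio
    degraded: bool                  # operator's judgement from the agarose gel


def passes_qc(sample: SampleQC) -> bool:
    """Apply the stated thresholds: >= 100 ng/ul, OD260/280 in [1.8, 2.0], not degraded."""
    return (
        sample.concentration_ng_per_ul >= 100
        and 1.8 <= sample.od260_280 <= 2.0
        and not sample.degraded
    )
```

A sample at 150 ng/μl with an OD260/280 of 1.85 and an intact band passes; the same sample at 80 ng/μl does not.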
  • the above-tested DNA sample was fragmented to obtain a DNA fragment.
  • there are two methods for fragmenting DNA samples, namely nebulization and Covaris, either of which can break the sample DNA into a range of 100-800 bp with a main band at about 500 bp (if the sample is already fragmented DNA, this step can be skipped).
  • the DNA sample was fragmented by nebulization, and the obtained DNA fragments were 100-800 bp in length with the main band at 500 bp.
  • the conventional tag linker mixture was obtained by combining the conventional tag linkers formed by annealing the corresponding sense sequence PE IndexN adapter F and antisense sequence PE IndexN adapter R shown in Table 2 above.
  • the PCR reaction uses the following PCR primers:
  • the amplified product was electrophoresed on a 2% agarose gel at 100 V for 120 min, and the band at 620 bp was then cut out (the length of the fragment to cut can be calculated as n + 120 bp, where n is the insert size).
  • the QIAquick Gel Extraction Kit (Qiagen) was then used to recover the cut gel slice to obtain a DNA tag library, which was dissolved in 40 μl of elution buffer for later use.
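The gel-cut size rule quoted above (n + 120 bp) is simple arithmetic, sketched here with an illustrative function name. The 120 bp offset is taken as given from the text.

```python
def expected_band_bp(insert_bp: int, added_bp: int = 120) -> int:
    """Length of the band to cut: insert size n plus the fixed 120 bp stated in the text."""
    if insert_bp <= 0:
        raise ValueError("insert size must be positive")
    return insert_bp + added_bp
```

With the 500 bp main band obtained after fragmentation, this gives the 620 bp band that is cut from the gel.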
  • a DNA tag library was constructed using the Multiplexing Sample Preparation Oligonucleotide Kit (PE-400-1001), using the standard protocol of the kit, using the six DNA tags according to the examples of the present invention in Table 1 and their corresponding DNA tag adapters.
  • the specific steps are the same as the control examples, where the difference is:
  • Step 5: a DNA tag linker mixture (Index Adapter Oligo Mix) is used instead of the conventional tag linker mixture (PE Index Adapter Oligo Mix), wherein the DNA tag linker mixture is obtained by combining the DNA tag linkers formed by annealing the sense sequence IndexN adapter F and the antisense sequence IndexN adapter R shown in Table 1.
  • Step 7 The procedure for the PCR reaction is:
  • the DNA tag libraries constructed in the Comparative Example and Example 1 were sequenced using a HiSeq2000 sequencer in strict accordance with the recommended procedure of the instrument to obtain sequencing data.
  • the sequencing primers used are:
  • Figure 2 shows the order of the sequencing reads and the sequence composition of each read when sequencing a DNA tag library on the Solexa sequencing platform. In Figure 2, Read1 represents the sequence detected by Sequencing Reaction 1, and Read 1 Seq Primer represents the sequencing primer used.
  • Data processing software includes, but is not limited to, HiSeq Control Software (HCS), Pipeline, CASAVA, SOAP, and ELAND.
  • FIG. 3 compares the quality values (Q20) of the first 10 sequencing cycles of the DNA tag library of the present invention of Example 1 and the DNA tag library of the comparative example, wherein A shows the quality values of the first 10 sequencing cycles of the DNA tag library of the invention, and B shows the quality values of the first 10 sequencing cycles of the DNA tag library of the control.
  • the abscissa indicates the number of cycles
  • the ordinate indicates the quality.
  • the quality value (Q-Value) reflects the quality of sequencing and lies between 0 and 40; within this range, the higher the value, the better the quality.
  • Q20 refers to the proportion of bases with quality values greater than 20 among all bases, which reflects the quality of the sequenced reads.
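The Q20 figure described above can be computed from a list of per-base quality values. This is a minimal sketch with illustrative function names. It follows the wording of the text ("greater than 20"), whereas many tools count values of 20 and above, and it assumes the values are Phred-scaled, for which Q = -10·log10(p) relates a value to an error probability p.

```python
def q20_fraction(qualities, threshold=20):
    """Fraction of bases whose quality value is greater than the threshold."""
    qualities = list(qualities)
    if not qualities:
        raise ValueError("no quality values given")
    return sum(q > threshold for q in qualities) / len(qualities)


def error_probability(q):
    """Error probability implied by a Phred-scaled quality value (assumed scale)."""
    return 10 ** (-q / 10)
```

On this scale a quality value of 20 corresponds to an error probability of about 1%, and a Q20 of 0.9 means that 90% of the bases in the cycles considered exceed that value.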
  • when the gradient-length DNA tags of the present invention were embedded in the linker to construct the DNA tag library of the present invention, and that library was sequenced using the Illumina/Solexa technique, the quality value Q20 of the first 10 sequencing cycles of the library was maintained at 0.9; for the control, a 6 bp fixed-length conventional tag was inserted into the adaptor to construct the DNA library, which was likewise sequenced using the Illumina/Solexa technique.
  • FIGS. 4-6 compare, for the DNA tag library of the present invention in Example 1 and the tag library of the control, how light intensity, base distribution, and error rate change with the number of cycles.
  • FIG. 4 compares the light intensities of the sequencing cycles of the DNA tag library of Example 1 and the tag library of the control example, wherein A shows the average light intensity signal of each sequencing cycle of the DNA tag library of the present invention of Example 1, and B shows the average light intensity signal of each sequencing cycle of the DNA tag library of the control.
  • the abscissa indicates the number of cycles, and the ordinate indicates the signal intensity average (Signal Mean).
  • FIG. 5 shows the results of comparison of base distributions along the Reads position of Solexa sequencing data of the DNA tag library of the present invention and the DNA tag library of the control according to an embodiment of the present invention, wherein A shows the inventive example of Example 1.
  • the abscissa indicates the position along Reads
  • the ordinate indicates the percentage of each base at the position of the Reads
  • the graph shows the various base ratios measured in each sequencing.
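A base distribution plot of the kind described for FIG. 5 can be derived from the reads themselves. The sketch below, with an illustrative function name, tabulates the percentage of A, C, G and T at each read position. It illustrates the idea only and is not the software used to produce the figure.

```python
from collections import Counter


def base_percentages(reads):
    """Percentage of A, C, G and T at each position along the reads."""
    reads = list(reads)
    if not reads:
        raise ValueError("no reads given")
    table = []
    for pos in range(max(len(r) for r in reads)):
        # Only reads long enough to cover this position contribute to it.
        column = [r[pos] for r in reads if len(r) > pos]
        counts = Counter(column)
        table.append({b: 100.0 * counts.get(b, 0) / len(column) for b in "ACGT"})
    return table
```

Each entry of the returned list corresponds to one position along the reads (the abscissa), and the four percentages are the values plotted on the ordinate.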
  • FIG. 6 compares the error rates along the read position of the Solexa sequencing data of the DNA tag library of the present invention and the DNA tag library of the control according to an embodiment of the present invention, wherein A shows the error rate map along the read position for the DNA tag library of the present invention of Example 1, and B shows the error rate map along the read position for the DNA tag library of the control example.
  • as shown in Fig. 6, the abscissa indicates the position along the reads, the ordinate indicates the error rate (i.e., the proportion of sequencing errors occurring at that read position), the solid line indicates the error rate, and the other line indicates the proportion of bases that cannot be analyzed.
  • the figure shows the difference in error rate between different libraries.
  • judged by the three parameters of light intensity, base distribution, and error rate as a function of the number of cycles, the DNA tag library of the present invention of Example 1 shows no significant difference from the tag library of the control. This shows that the DNA tag library constructed using the gradient-length DNA tags of the present invention and the DNA tag library constructed using conventional fixed-length tags do not differ significantly overall: the use of the DNA tag of the present invention does not affect the overall sequencing results of the DNA tag library, while it can significantly improve base quality at the transition from the gradient tag to the insert.
  • the DNA tag of the present invention is illustrated to be superior to conventional tags in conventional multiplex nucleic acid sequencing techniques.
  • the tag library can increase 83.5 M of data when running a HiSeq sequencer and can increase the availability of data.
  • the DNA tag library kit can be effectively applied to the construction and sequencing of DNA sequencing libraries of sample DNA, and can effectively reduce the problem of data output bias, and the sequencing results are accurate, reliable, stable and reproducible.

Description

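DNA Tags and Uses Thereof
Priority Information
This application claims priority to and the benefit of Chinese Patent Application No. 201110050238.1, filed with the State Intellectual Property Office of China on March 3, 2011, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of nucleic acid sequencing technology, in particular to multiplex nucleic acid sequencing. Specifically, the present invention relates to DNA tags and uses thereof; more specifically, it provides DNA tags for DNA sequencing, oligonucleotides, DNA tag libraries and methods for preparing them, a method for determining sequence information of a DNA sample, a method for determining sequence information of a plurality of DNA samples, and a kit for constructing a DNA tag library.
Background Art
Multiplex sequencing places, between the linker sequence and the insert fragment of each library, a DNA tag (also called an index) that identifies the source of the sample, and uses tag-specific primers to sequence a mixture of several library samples, so that the sequence of each library and of its corresponding DNA tag is determined; the libraries are then distinguished on the basis of their tag sequences, which yields the DNA sequence of each library. Multiplex sequencing therefore allows several library samples to be sequenced as a mixture and avoids wasting sequencing resources.
However, current multiplex nucleic acid sequencing technology still needs improvement.
Summary of the Invention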
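The present invention was completed on the basis of the following findings of the inventors:
Current multiplex nucleic acid sequencing techniques generally use DNA tags of the same length and sequence a mixture of different libraries at the same time. When a mixture of several libraries is sequenced in this way, base bias in the tags often causes the light intensity parameters to fluctuate while the insert fragment is being read, which degrades the quality of the output data, makes the results unreliable and unable to reflect the true information of the sample, and also leads to poor reproducibility of experimental results.
The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the present invention provides a set of DNA tags (sometimes simply referred to herein as "tags") that can be used for multiplex nucleic acid sequencing, and uses thereof.
According to one aspect of the present invention, a set of isolated DNA tags is provided. According to an embodiment of the present invention, the set of isolated DNA tags consists of the nucleotides shown in SEQ ID NO: (3N-2), where N is any integer from 1 to 6. In this specification, these DNA tags are named IndexN, where N is any integer from 1 to 6, and their sequences are shown in Table 1 below. With the DNA tags according to embodiments of the present invention, the sample source of DNA can be accurately characterized by linking a DNA tag to the sample DNA or its equivalent. The DNA tags can thus be effectively applied to current multiplex nucleic acid sequencing: using them, DNA tag libraries (sometimes also called "tag libraries" herein) of a plurality of samples can be constructed at the same time, the DNA tag libraries from different samples can be mixed and then sequenced, and the DNA sequences of the DNA tag libraries can be classified on the basis of the DNA tags, so that DNA sequence information of the plurality of samples is obtained. High-throughput sequencing technology, for example Solexa sequencing, can therefore be fully exploited to sequence several DNA tag libraries at once, improving the efficiency and throughput of DNA tag library sequencing. The inventors surprisingly found that, compared with current multiplex nucleic acid sequencing techniques, constructing DNA tag libraries of a plurality of samples with the DNA tags according to embodiments of the present invention and performing multiplex nucleic acid sequencing makes it possible to distinguish the DNA tag libraries accurately and to reduce data output bias effectively; the sequencing data obtained are reliable, and their stability and reproducibility are very good.
According to another aspect of the present invention, a set of isolated oligonucleotides is provided for introducing the above DNA tags into sample DNA or its equivalent so that they can be applied to multiplex nucleic acid sequencing. The set of isolated oligonucleotides according to an embodiment of the present invention has first strands and second strands; the first strands consist of the nucleotides shown in SEQ ID NO: (3N-1) and the second strands of the nucleotides shown in SEQ ID NO: (3N), where, for a given oligonucleotide, N takes the same value for the first and second strand, and N is any integer from 1 to 6. According to an embodiment of the present invention, these oligonucleotides (sometimes also called "DNA tag linkers" or "tag linkers" in this specification) each carry a DNA tag according to an embodiment of the present invention as described above, so the corresponding DNA tag can be introduced into DNA or its equivalent by a ligation reaction. Analogously to the naming of the DNA tags, the oligonucleotide (DNA tag linker) corresponding to DNA tag IndexN is named IndexN adapter in this specification, where N is any integer from 1 to 6; further, the first strand (sometimes called the "sense sequence" herein) and the second strand (sometimes called the "antisense sequence" herein) of a DNA tag linker are named IndexN adapter F and IndexN adapter R respectively, where N is any integer from 1 to 6, and their sequences are shown in Table 1 below (all sequences in the table are written in the 5' to 3' direction). According to an embodiment of the present invention, the corresponding DNA tag linker can be formed by equimolar annealing of the sense sequence IndexN adapter F with its corresponding antisense sequence IndexN adapter R.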
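[Table 1, rows Index1 to Index4: Figure imgf000003_0001]
Name                Sequence                                          SEQ ID NO
Index5              AGAGATA                                           13
Index5 adapter F    TACACTCTTTCCCTACACGACGCTCTTCCGATCTAGAGATAT        14
Index5 adapter R    5'-TATCTCTAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC      15
Index6              GCTCTCGA                                          16
Index6 adapter F    TACACTCTTTCCCTACACGACGCTCTTCCGATCTGCTCTCGAT       17
Index6 adapter R    5'-TCGAGAGCAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC     18
With the oligonucleotides according to embodiments of the present invention (which may also be called DNA tag linkers), DNA tags can be effectively introduced into sample DNA or its equivalent, so that DNA tag libraries carrying the DNA tags of the present invention can be constructed and multiplex nucleic acid sequencing can be carried out accurately and effectively. In addition, the inventors surprisingly found that, when DNA tag libraries containing the various DNA tags are constructed for the same sample using oligonucleotides with different tags, the stability and reproducibility of the resulting sequencing data are very good.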
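The SEQ ID numbering (tag 3N-2, first strand 3N-1, second strand 3N) and the way the two strands of a tag linker pair over the tag can be checked against the Index5 and Index6 rows of Table 1. The following is a minimal sketch; the function names are illustrative and are not part of the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def seq_id_numbers(n: int) -> dict:
    """SEQ ID NOs for IndexN: tag = 3N-2, first strand = 3N-1, second strand = 3N."""
    if not 1 <= n <= 6:
        raise ValueError("N must be an integer from 1 to 6")
    return {"tag": 3 * n - 2, "adapter_F": 3 * n - 1, "adapter_R": 3 * n}


# Index5 and Index6 rows of Table 1: (tag, adapter F, adapter R), all 5' to 3'.
TABLE_1 = {
    5: ("AGAGATA",
        "TACACTCTTTCCCTACACGACGCTCTTCCGATCTAGAGATAT",
        "TATCTCTAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"),
    6: ("GCTCTCGA",
        "TACACTCTTTCCCTACACGACGCTCTTCCGATCTGCTCTCGAT",
        "TCGAGAGCAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"),
}


def tag_is_paired(tag: str, strand_f: str, strand_r: str) -> bool:
    """In these rows, F ends with the tag plus one T, and R begins with the
    reverse complement of the tag, so the tag is double-stranded after annealing."""
    return strand_f.endswith(tag + "T") and strand_r.startswith(reverse_complement(tag))
```

For both rows the check holds: the reverse complement of AGAGATA is TATCTCT, which is how Index5 adapter R begins, and the reverse complement of GCTCTCGA is TCGAGAGC, which is how Index6 adapter R begins.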
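According to another aspect of the present invention, a set of isolated PCR tag primers is also provided for introducing the above DNA tags into sample DNA or its equivalent. The set of isolated PCR tag primers according to an embodiment of the present invention has first strands and second strands, wherein the first strands consist of the nucleotides shown in SEQ ID NO: 39, and the nucleotide sequence of the second strand is ...ACGTGTGCTCTTCCGATCT, where XXXXXX is one of the set of isolated DNA tags according to an embodiment of the present invention. These PCR tag primers each carry a DNA tag according to an embodiment of the present invention as described above. Through a PCR reaction using a PCR tag primer, the primer is incorporated into the sample DNA or its equivalent, which introduces the corresponding DNA tag into the DNA or its equivalent and yields a DNA tag library containing that tag. The DNA tag libraries of a plurality of samples can then be mixed and sequenced; that is, the PCR tag primers of the present invention can be effectively applied to multiplex nucleic acid sequencing. In addition, the inventors surprisingly found that, when DNA tag libraries containing the various DNA tags are constructed for the same sample using PCR tag primers with different tags, the stability and reproducibility of the resulting sequencing data are very good.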
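Building the second strand of a PCR tag primer is a matter of substituting a tag for the placeholder in the template. The sketch below uses the full template as it appears earlier in this document (CAAGCAGAAGACGGCATACGAGAT, then the tag, then GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT). The function name is illustrative, and inserting the tag in the orientation given in Table 1 is an assumption taken from the literal wording of the text.

```python
PREFIX = "CAAGCAGAAGACGGCATACGAGAT"
SUFFIX = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"


def build_tag_primer(tag: str) -> str:
    """Replace the X placeholder of the second-strand template with a tag.

    Assumption: the tag is inserted as written in Table 1 (no reverse complement).
    """
    tag = tag.upper()
    if not tag or set(tag) - set("ACGT"):
        raise ValueError("tag must be a non-empty string of A, C, G, T")
    return PREFIX + tag + SUFFIX
```

Because the tags in the set differ in length, the resulting primers differ in length by the same amount.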
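According to yet another aspect of the present invention, a method of constructing a DNA tag library is provided. According to an embodiment of the present invention, the method comprises the following steps: fragmenting a DNA sample to obtain DNA fragments; end-repairing the resulting DNA fragments to obtain end-repaired DNA fragments; adding base A to the 3' ends of the end-repaired DNA fragments to obtain DNA fragments with base A added at the 3' end; ligating the DNA fragments with base A added at the 3' end to a DNA tag linker to obtain a ligation product, wherein the DNA tag linker comprises one of the set of isolated DNA tags according to the embodiments of the present invention described above; amplifying the ligation product to obtain an amplification product; and isolating and recovering the amplification product, which constitutes the DNA tag library.
The method of constructing a DNA tag library according to an embodiment of the present invention can be effectively applied to multiplex nucleic acid sequencing. Specifically, with this method the DNA tags according to embodiments of the present invention can be effectively introduced into the DNA tag library constructed for the sample DNA; by sequencing the DNA tag library, the sequence information of the sample DNA and of the DNA tag combination is obtained, and the source of the sample DNA can thus be distinguished. In addition, the inventors surprisingly found that, when DNA tag libraries containing the DNA tags according to embodiments of the present invention, constructed with different tags by the above method, are used in multiplex nucleic acid sequencing, data output bias is effectively reduced compared with conventional multiplex nucleic acid sequencing, and the sequencing data obtained are reliable, stable, and highly reproducible.
Further, the present invention provides a DNA tag library. According to an embodiment of the present invention, the DNA tag library is obtained by the method of constructing a DNA tag library according to an embodiment of the present invention. The inventors found that the DNA tag library of the present invention can be effectively applied to high-throughput sequencing platforms, so that accurate and reliable sequencing data can be obtained, with good stability and reproducibility.
According to another aspect of the present invention, a method of determining sequence information of a DNA sample is provided. According to an embodiment of the present invention, it comprises the following steps: constructing a DNA tag library of the DNA sample according to the method of preparing a DNA tag library according to an embodiment of the present invention; and sequencing the DNA tag library to determine the sequence information of the DNA sample. With this method, the sequence information of the DNA sample in the DNA tag library and the sequence information of the DNA tag can be obtained effectively, so that the source of the DNA sample can be distinguished. In addition, the inventors surprisingly found that determining DNA sample sequence information with the method according to an embodiment of the present invention effectively reduces data output bias and allows a plurality of DNA tag libraries to be distinguished accurately.
According to a further aspect of the present invention, a method of determining sequence information of a plurality of DNA samples is also provided. According to an embodiment of the present invention, it comprises the following steps: for each of the plurality of samples, independently constructing a DNA tag library of the DNA sample according to the method of constructing a DNA tag library according to an embodiment of the present invention, wherein different DNA samples use DNA tags that differ from one another and have known sequences; combining the DNA tag libraries of the plurality of samples to obtain a DNA tag library mixture; sequencing the DNA tag library mixture to obtain the sequence information of the DNA samples and the sequence information of the tags; and classifying the sequence information of the DNA samples on the basis of the sequence information of the tags to determine the DNA sequence information of the plurality of samples. This method according to an embodiment of the present invention can thus make full use of high-throughput sequencing technology, for example Solexa sequencing, to sequence the DNA tag libraries of a plurality of samples at the same time, which improves the efficiency and throughput of DNA tag library sequencing and the efficiency of determining sequence information of a plurality of DNA samples; compared with conventional multiplex nucleic acid sequencing, it also effectively reduces data output bias, and the sequencing data are more accurate and reliable.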
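The final classification step, sorting pooled reads by tag, can be sketched as follows. The read layout is an assumption made for illustration: each read is taken to start with the tag, followed by the single T that ends the sense strand of the tag linker, followed by the insert. The function name and the sample names are likewise illustrative; only the Index5 and Index6 sequences come from Table 1.

```python
def demultiplex(reads, tags):
    """Sort reads into per-sample bins by their leading tag.

    Assumed layout of a read: <tag> + "T" + <insert>. Tags are tried from
    longest to shortest so that a longer tag is never masked by a shorter one.
    """
    by_length = sorted(tags.items(), key=lambda kv: len(kv[1]), reverse=True)
    bins = {name: [] for name in tags}
    bins["unassigned"] = []
    for read in reads:
        for name, tag in by_length:
            if read.startswith(tag + "T"):
                bins[name].append(read[len(tag) + 1:])  # keep only the insert
                break
        else:
            bins["unassigned"].append(read)
    return bins


tags = {"sample_Index5": "AGAGATA", "sample_Index6": "GCTCTCGA"}
result = demultiplex(["AGAGATATACGTACGT", "GCTCTCGATGGGCCC", "NNNN"], tags)
```

Because the tags have different lengths, the insert starts at a different cycle for each sample, which is why the tag has to be matched before the insert can be located.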
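According to yet another aspect of the present invention, a kit for constructing a DNA tag library is provided. According to an embodiment of the present invention, the kit comprises 6 isolated oligonucleotides, each having a first strand and a second strand, the first strands consisting of the nucleotides shown in SEQ ID NO: (3N-1) and the second strands of the nucleotides shown in SEQ ID NO: (3N), where, for a given oligonucleotide, N takes the same value for the first and second strand, and N is any integer from 1 to 6, and wherein the 6 isolated oligonucleotides are provided in separate containers. With this kit, the DNA tags according to embodiments of the present invention can be conveniently introduced into the DNA tag library being constructed. The kit according to an embodiment of the present invention can therefore be effectively applied to multiplex nucleic acid sequencing.
According to yet another aspect of the present invention, the present invention provides the use of a set of tags of different lengths, preferably of gradient lengths, for the construction and/or sequencing of a sequencing tag library, wherein each tag is an oligonucleotide sequence, preferably a nucleotide sequence of 2-100 bp.
According to one embodiment of the present invention, the tags are contained in PCR primers for amplifying a sequence of interest, thereby constituting the corresponding tag PCR primers, and are introduced into the sequence to be sequenced by PCR; the PCR tag primer is used as the 5' primer of the PCR, or the 3' primer, or as both the 5' primer and the 3' primer.
According to one embodiment of the present invention, in the tag PCR primers the tag is embedded in the PCR primer for amplifying the sequence of interest, or is linked, with or without a connector, to the 5' end or 3' end of the PCR primer for amplifying the sequence of interest, thereby constituting the corresponding tag PCR primers.
According to one embodiment of the present invention, the tags are contained in the linkers of the tag library, thereby constituting the corresponding tag linkers, which are used as the 5' linker, the 3' linker, or both the 5' linker and the 3' linker of the tag library.
According to one embodiment of the present invention, the tag is inserted into the linker, or is attached to the end of the linker with or without a connector, preferably without a connector, thereby constituting the corresponding tag linkers.
According to one embodiment of the present invention, the tags constitute both tag PCR primers and tag linkers, which are used together for the construction and/or sequencing of the sequencing tag library.
According to yet another aspect of the present invention, the present invention provides a set of tags of different lengths, preferably gradient lengths, for the construction and/or sequencing of a sequencing tag library.
According to yet another aspect of the present invention, the present invention provides a set of gradient tags, which comprises or consists of at least 2, or at least 3, or at least 4, or at least 5, or all 6 of the 6 gradient tags shown in Table 1 or of gradient tags differing from them by 1 base; the set of gradient tags preferably includes at least Index1 and Index2, or Index3 and Index4, or Index5 and Index6, of the 6 gradient tags shown in Table 1, or any combination of two or more of these.
According to an embodiment of the present invention, the difference of 1 base includes the substitution, addition or deletion of 1 base in the sequences of the 6 gradient tags shown in Table 1.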
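The "difference of 1 base" covers exactly three cases: one substitution, one added base, or one deleted base. A minimal sketch of a test for this relation follows; the function name is illustrative.

```python
def differs_by_one_base(a: str, b: str) -> bool:
    """True if b can be obtained from a by one substitution, addition or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position must differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    # Different length: removing one base from the longer must give the shorter.
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))
```

For the Index5 tag AGAGATA, the sequences AGAGTTA (substitution), AGAGATAC (addition) and GAGATA (deletion) all qualify, while TTAGATA, which differs at two positions, does not.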
根据本发明的又一方面, 本发明提出了前面所述的梯度标签用于测序标签文库, 特别 是 Illumina/Solexa测序标签文库的构建和 /或测序的用途, 其中所述梯度标签包含在用于测 序标签文库, 特别是 IUumina/Solexa测序标签文库的接头 2的 5'末端中, 从而构成各自相 对应的梯度标签接头 2, 其用作测序标签文库, 特别是 Illumina/Solexa测序标签文库的 3' 接头。
根据本发明的实施例, 所述梯度标签包含在接头 2的 5'末端中, 包括所述梯度标签通 过或不通过连接子与接头 2的 5,末端相连, 或者***接头 2的 5'末端中, 优选的是不通过 连接子与接头 2的 5 '末端相连。
根据本发明的又一方面, 本发明提出了使用前述方法构建的测序标签文库, 特别是
Illumina/Solexa测序标签文库。
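According to yet another aspect, the invention provides a set of gradient-tagged adapters 2 containing the gradient tags described above; each contains the gradient tag of claim 1 at its 5' end and is preferably used as the 3' adapter of a sequencing tag library, in particular an Illumina/Solexa sequencing tag library. The set of gradient-tagged adapters 2 comprises or consists of at least 2, at least 3, at least 4, at least 5, or all 6 of the six gradient-tagged adapters 2 shown in Table 1 or of adapters whose gradient tag sequence differs from theirs by one base. The set preferably comprises at least Index1 adapter2 F/R and Index2 adapter2 F/R, or Index3 adapter2 F/R and Index4 adapter2 F/R, or Index5 adapter2 F/R and Index6 adapter2 F/R of the six gradient-tagged adapters 2 shown in Table 1, or any combination of two or more of these.
According to embodiments of the invention, the invention provides gradient-tagged adapters 2 in which differing by one base includes the substitution, addition, or deletion of one base in the gradient tag sequence.
According to embodiments of the invention, the invention provides the use of gradient-tagged adapters 2 for constructing and/or sequencing a sequencing tag library, in particular an Illumina/Solexa sequencing tag library, the gradient-tagged adapter 2 being used as the 3' adapter of the library.
According to embodiments of the invention, the invention provides a sequencing tag library, in particular an Illumina/Solexa sequencing tag library, constructed using the gradient-tagged adapters 2, wherein the gradient-tagged adapter 2 is used as the 3' adapter of the library.
According to yet another aspect, the invention provides a method for constructing a sequencing tag library, in particular an Illumina/Solexa sequencing tag library, characterized in that a set of adapters carrying tags of different lengths, preferably of graded lengths, is used as the 3' adapter of the library.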
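According to embodiments of the invention, the method for constructing a sequencing tag library comprises:
1) providing n total genomic DNA samples, the genomic DNA samples being from any eukaryotic sample, including but not limited to human genomic DNA samples;
2) fragmenting the DNA: shearing the DNA mechanically to produce DNA fragments with sticky ends, the mechanical methods including but not limited to the use of Bioruptor, Hydroshear, and Covaris;
3) end repair: filling in the sticky ends of the DNA fragments by a ligation reaction;
4) A-tailing: adding one adenine base (A) to the blunt ends of the DNA fragments by a ligation reaction;
5) adding the 5' adapter and the 3' adapter;
6) amplifying the target fragments by PCR, and finally recovering the target fragment library;
7) pooling: when n > 1, mixing the PCR amplification products of the individual samples together.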
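According to embodiments of the invention, the method is characterized in that different gradient-tagged adapters 2 selected from Table 1, or adapters whose gradient tag sequence differs from theirs by one base, are used as the 3' adapter of the sequencing tag library, in particular the Illumina/Solexa sequencing tag library.
According to embodiments of the invention, the method for constructing a sequencing tag library comprises:
1) providing n total genomic DNA samples, n being an integer with 1 ≤ n ≤ 6, preferably 2 ≤ n ≤ 6, the genomic DNA samples being from any eukaryotic sample, including but not limited to human genomic DNA samples;
2) fragmenting the DNA: shearing the DNA mechanically to produce DNA fragments with sticky ends, the mechanical methods including but not limited to the use of Bioruptor, Hydroshear, and Covaris;
3) end repair: filling in the sticky ends of the DNA fragments by a ligation reaction;
4) A-tailing: adding one adenine base (A) to the blunt ends of the DNA fragments by a ligation reaction;
5) adding adapter 1 and gradient-tagged adapter 2: ligating adapter 1 and gradient-tagged adapter 2 to the A-tailed DNA fragments by a ligation reaction;
6) amplifying the target fragments by PCR, and finally recovering the target fragment library;
7) pooling: when n > 1, mixing the PCR amplification products of the individual samples together.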
According to embodiments of the invention, adapter 1 comprises the following strands: 5'-TACACTCTTTCCCTACACGACGCTCTTCCGATCTATCACT (SEQ ID NO: 37) and 5'-/GTGATAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC (SEQ ID NO: 38).
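As an illustrative check, not part of the disclosed method, the two strands of adapter 1 can be tested for the duplex they form on annealing. The sketch below assumes that the 3' end of SEQ ID NO: 37, minus a one-base overhang, pairs with the 5' end of SEQ ID NO: 38; the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Adapter 1 strands as given in the text.
ADAPTER1_TOP = "TACACTCTTTCCCTACACGACGCTCTTCCGATCTATCACT"     # SEQ ID NO: 37
ADAPTER1_BOTTOM = "GTGATAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"   # SEQ ID NO: 38

def stem_length(top: str, bottom: str, overhang: int = 1) -> int:
    """Number of contiguous base pairs between the 3' end of `top`
    (ignoring `overhang` unpaired 3' bases, an assumption here) and
    the 5' end of `bottom`."""
    paired = top[:len(top) - overhang]
    length = 0
    for k in range(1, min(len(paired), len(bottom)) + 1):
        if paired[-k:] == reverse_complement(bottom[:k]):
            length = k
        else:
            break
    return length

stem = stem_length(ADAPTER1_TOP, ADAPTER1_BOTTOM)
overhang_base = ADAPTER1_TOP[-1]
```

Under these assumptions the two strands share a contiguous paired stem, the remaining portions are non-complementary, and the top strand ends in a single unpaired base that would be available to pair with an A-tailed fragment.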
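According to embodiments of the invention, the gradient-tagged adapters 2 comprise at least 2, at least 3, at least 4, at least 5, or all 6 of the six gradient-tagged adapters 2 shown in Table 1 or of adapters whose gradient tag sequence differs from theirs by one base.
The set of gradient-tagged adapters 2 preferably comprises at least Index1 adapter2 F/R and Index2 adapter2 F/R, or Index3 adapter2 F/R and Index4 adapter2 F/R, or Index5 adapter2 F/R and Index6 adapter2 F/R of the six gradient-tagged adapters 2 shown in Table 1, or any combination of two or more of these.
According to embodiments of the invention, differing by one base includes the substitution, addition, or deletion of one base in the gradient tag sequence.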
According to embodiments of the invention, the PCR in step 6) uses the following PCR primers:
PCR Primer 1:
5'-AATGATACG…
…TCCGATCT (SEQ ID NO: 39), and
PCR Primer 2:
5'-CAAGCAGA…
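According to embodiments of the invention, the target fragment library in step 6) is recovered by agarose gel electrophoresis followed by gel excision.
According to embodiments of the invention, the method further comprises the following step:
8) sequencing: sequencing the PCR amplification products of the individual samples using a sequencing technology, in particular Illumina/Solexa sequencing.
According to embodiments of the invention, the sequencing primers used for sequencing include, when a DNA pair-end library is constructed, Sequencing Primer 1: 5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT.
According to embodiments of the invention, the invention further provides a sequencing tag library, in particular an Illumina/Solexa sequencing tag library, constructed according to the foregoing method.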
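Additional aspects and advantages of the invention will be set forth in part in the description below, and in part will become apparent from that description or be learned through practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of the embodiments taken together with the drawings, in which:
Figure 1 shows a schematic flow chart of the method for constructing a DNA tag library according to an embodiment of the invention;
Figure 2 shows a schematic diagram of the read order of each sequencing reaction, and of the composition of the sequences read, when a DNA tag library is sequenced on the Illumina/Solexa platform according to an embodiment of the invention;
Figure 3 shows a comparison of the quality values of the first 10 Illumina/Solexa sequencing cycles for the DNA tag library of the invention and for the DNA tag library of the Control Example, in which
Figure 3A shows the quality values of the first 10 Illumina/Solexa sequencing cycles for the DNA tag library of the invention from Example 1, and
Figure 3B shows the quality values of the first 10 Illumina/Solexa sequencing cycles for the DNA tag library of the Control Example;
Figure 4 shows a comparison of the signal intensity in each Illumina/Solexa sequencing cycle for the DNA tag library of the invention and for the DNA tag library of the Control Example, in which
Figure 4A shows the mean signal intensity in each Illumina/Solexa sequencing cycle for the DNA tag library of the invention from Example 1, and
Figure 4B shows the mean signal intensity in each Illumina/Solexa sequencing cycle for the DNA tag library of the Control Example;
Figure 5 shows a comparison of the base distribution across Illumina/Solexa sequencing cycles for the DNA tag library of the invention and for the DNA tag library of the Control Example, in which
Figure 5A shows the base percentage distribution across Illumina/Solexa sequencing cycles for the DNA tag library of the invention from Example 1, and
Figure 5B shows the base percentage distribution across Illumina/Solexa sequencing cycles for the DNA tag library of the Control Example; and
Figure 6 shows a comparison of the error rate across Illumina/Solexa sequencing cycles for the DNA tag library of the invention and for the DNA tag library of the Control Example, in which
Figure 6A shows the error rate across Illumina/Solexa sequencing cycles for the DNA tag library of the invention from Example 1, and
Figure 6B shows the error rate across Illumina/Solexa sequencing cycles for the DNA tag library of the Control Example.
Detailed Description of the Invention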
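Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings. The embodiments described with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
It should be noted that, in the description of the invention, "a plurality" means two or more unless stated otherwise.
DNA Tags
According to one aspect of the invention, the invention provides a set of isolated DNA tags. According to embodiments of the invention, these isolated DNA tags consist of the nucleotide sequences shown in SEQ ID NO: (3N-2), respectively, where N is any integer from 1 to 6. In this specification these DNA tags are named IndexN, where N is any integer from 1 to 6; their sequences are shown in Table 1 above and are not repeated here.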
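The term "DNA" as used in the present invention may be any polymer comprising deoxyribonucleotides, including but not limited to modified or unmodified DNA. With the DNA tags according to embodiments of the invention, a tagged DNA library is obtained by joining a DNA tag to the DNA of a sample or an equivalent thereof; sequencing the DNA tag library yields both the sequence of the sample DNA and the sequence of the tag, and the tag sequence then precisely identifies the sample from which the DNA came. DNA tag libraries for multiple samples can therefore be constructed in parallel, pooled, and sequenced together, and the resulting DNA sequences can be sorted by DNA tag to obtain the sequence information of each sample. This makes full use of high-throughput sequencing technologies such as Solexa sequencing to sequence the DNA of multiple samples simultaneously, which raises the efficiency and throughput of high-throughput sequencing and lowers the cost of determining the sequence information of DNA samples. The expression "a DNA tag is joined to the DNA of a sample or an equivalent thereof" is to be understood broadly: the DNA tag may be joined directly to the sample DNA to construct the DNA tag library, or it may be joined to a nucleic acid having the same sequence as the sample DNA (for example, the corresponding RNA sequence or cDNA sequence, which has the same sequence as the DNA).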
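The sorting of pooled reads by tag can be sketched as follows. This is a minimal illustration that assumes each read begins with its tag and that only exact matches are accepted; the tag sequences and sample names are hypothetical and are not the Table 1 sequences.

```python
def demultiplex(reads, tag_to_sample):
    """Assign each read to a sample by the tag found at its start.

    Tags may differ in length, so longer tags are tried first; this
    keeps a short tag from shadowing a longer tag that begins with it.
    The tag is trimmed from every assigned read.
    """
    ordered_tags = sorted(tag_to_sample, key=len, reverse=True)
    bins = {sample: [] for sample in tag_to_sample.values()}
    bins["unassigned"] = []
    for read in reads:
        for tag in ordered_tags:
            if read.startswith(tag):
                bins[tag_to_sample[tag]].append(read[len(tag):])
                break
        else:
            # No tag matched exactly; keep the read untrimmed.
            bins["unassigned"].append(read)
    return bins

# Hypothetical tags of 6, 7, and 8 bp.
tags = {"ACGTAC": "sample_A", "GTCAGTA": "sample_B", "TGACTGCA": "sample_C"}
reads = ["ACGTACTTTT", "GTCAGTAGGGG", "TGACTGCACCCC", "NNNNNNNNNN"]
sorted_reads = demultiplex(reads, tags)
```

A production demultiplexer would normally tolerate sequencing errors in the tag; exact matching is used here only to keep the sketch short.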
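The inventors of the present application found that, to design effective DNA tags, the degree of sequence difference between tag sequences and the base-calling rate must be considered first. Second, when fewer than 6 samples are pooled, the GT content at every base position of the pooled tags must be taken into account. In Solexa sequencing, bases G and T share the same excitation light, as do bases A and C, so the "GT" content must be "balanced" against the "AC" content; the optimal "GT" content is 50%, which gives the highest tag recognition rate and the lowest error rate. Finally, the reproducibility and accuracy of the data output must be considered: for DNA tag libraries to be constructed and sequenced effectively, the set of DNA tags must give reliable and highly reproducible results, that is, for the same DNA sample, libraries built with different tags from the set must yield consistent sequencing results. In addition, runs of 3 or more identical consecutive bases in a tag sequence must be avoided, because such runs increase the error rate during synthesis or sequencing, and hairpin formation by the DNA tag adapters and PCR tag primers themselves should be avoided as far as possible.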
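Two of the design criteria above, the per-position GT content of a pooled tag set and the absence of runs of 3 or more identical bases, can be checked with a short script. The sketch below is illustrative only; the function names and the example tags are hypothetical and are not the Table 1 sequences.

```python
def gt_fraction_by_position(tags):
    """Fraction of pooled tags carrying G or T at each position.

    Tags shorter than a given position are skipped for that position;
    in a real run those cycles would already be reading insert bases.
    """
    longest = max(len(tag) for tag in tags)
    fractions = []
    for pos in range(longest):
        column = [tag[pos] for tag in tags if len(tag) > pos]
        fractions.append(sum(base in "GT" for base in column) / len(column))
    return fractions

def has_homopolymer(tag, run=3):
    """True if `tag` contains `run` or more identical consecutive bases."""
    return any(tag[i:i + run] == tag[i] * run
               for i in range(len(tag) - run + 1))

# Hypothetical pair of tags whose GT content is 50% at every position.
balance = gt_fraction_by_position(["ACGTAC", "GTACGT"])
```

A pair of tags whose per-position fractions are all 0.5 meets the 50% target stated in the text for that pool.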
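To this end, the inventors carried out extensive screening and selected the set of isolated DNA tags according to embodiments of the invention, which consist of the nucleotide sequences shown in SEQ ID NO: (3N-2), respectively, where N is any integer from 1 to 6. Their sequences are shown in Table 1 above and are not repeated here. According to embodiments of the invention, this set of isolated DNA tags is a set of gradient tags: Index1 is 6 bp, Index2 is 7 bp, and Index3 is 8 bp, so Index1 to Index3 form a gradient increasing in 1-bp steps; likewise, Index4 is 6 bp, Index5 is 7 bp, and Index6 is 8 bp, so Index4 to Index6 form a gradient increasing in 1-bp steps. This effectively reduces base bias in the tags, and when these DNA tags are applied in multiplex nucleic acid sequencing they reduce fluctuation of the signal intensity parameters during sequencing, which markedly improves the quality of the output data and, in turn, the reliability, stability, and reproducibility of the sequencing data. In addition, the DNA tags differ from one another by 5 or more bases, so a sequencing error or synthesis error in any single base of a tag does not affect the final identification of the tag. These tags can be used for the construction of any DNA tag library. To date there has been no report of these tags being used to construct libraries for DNA sample sequencing followed by sequencing with Solexa technology.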
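The statement that the tags differ from one another by 5 or more bases can be checked for any candidate set. Because the tags have different lengths, the sketch below uses edit (Levenshtein) distance; the source does not say which distance measure was used, so this choice is an assumption, and the function names are hypothetical.

```python
from itertools import combinations

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-base
    substitutions, insertions, and deletions turning `a` into `b`."""
    previous = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        current = [i]
        for j, base_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                        # deletion
                current[j - 1] + 1,                     # insertion
                previous[j - 1] + (base_a != base_b),   # substitution
            ))
        previous = current
    return previous[-1]

def min_pairwise_distance(tags) -> int:
    """Smallest edit distance between any two tags in the set."""
    return min(edit_distance(a, b) for a, b in combinations(tags, 2))
```

With a minimum pairwise distance of at least 3, a tag carrying a single error remains closer to its own tag than to any other, which is consistent with the tolerance of single-base errors described in the text.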
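According to some embodiments of the invention, the set of DNA tags used consists of at least 2, at least 3, at least 4, at least 5, or all 6 of the DNA tags shown in Table 1 or of DNA tags differing from them by one base. Specifically, according to embodiments of the invention, the set of DNA tags preferably comprises at least Index1 and Index2, or Index3 and Index4, or Index5 and Index6 of the six DNA tags shown in Table 1, or any combination of two or more of these. In some specific examples of the invention, differing by one base includes the substitution, addition, or deletion of one base in the sequence of any of the six DNA tags shown in Table 1.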
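According to embodiments of the invention, the invention also provides the use of the DNA tags according to embodiments of the invention in multiplex nucleic acid sequencing, namely in the construction and sequencing of DNA tag libraries. The DNA tags of the invention can be included in the adapters of the DNA tag library, thereby forming the corresponding DNA tag adapters, which are used as the 5' adapter of the DNA tag library, as the 3' adapter, or as both, so that the DNA tags can be introduced into the DNA tag library by a ligation reaction. According to embodiments of this use, the DNA tag is inserted into the end of the DNA tag adapter, or is joined to the end of the DNA adapter with or without a linker, preferably without a linker, thereby forming the corresponding DNA tag adapter. Alternatively, the DNA tags of the invention can be included in the PCR primers used to amplify the target sequence, thereby forming the corresponding PCR tag primers, which are used as the 5' primer of the PCR, as the 3' primer, or as both, so that the DNA tags can be introduced into the DNA tag library by PCR. According to embodiments of this use, the DNA tag is embedded within the PCR tag primer, or is joined, with or without a linker, to the 5' end or the 3' end of the PCR primer used to amplify the target sequence, thereby forming the corresponding PCR tag primer.
According to a specific example of the invention, the invention provides the use of the DNA tags described above for constructing and/or sequencing a DNA tag library, in particular an Illumina/Solexa sequencing DNA tag library. In this use, the DNA tag is included at the 5' end of the 3' adapter of the DNA tag library, in particular the Illumina/Solexa sequencing DNA tag library, thereby forming the corresponding DNA tag adapter, which serves as the 3' adapter of the library.
According to a specific example of the invention, the invention provides the use of the DNA tags described above for constructing and/or sequencing a DNA tag library, in particular an Illumina/Solexa sequencing DNA tag library, wherein the DNA tag is included at the 5' end of the 3' adapter of the library, the DNA tag being joined to the 5' end of the 3' adapter with or without a linker, or inserted into the 5' end of the 3' adapter. Preferably it is joined to the 5' end of the 3' adapter without a linker. The linker is a sequence of 1-10 bases, preferably 1-5 bases, more preferably 1-3 bases.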
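Oligonucleotides, PCR Tag Primers, and Methods for Constructing DNA Tag Libraries
According to another aspect of the invention, the invention provides a set of isolated oligonucleotides that can be used to introduce the DNA tags described above into the DNA of a sample and thereby construct a DNA tag library. According to embodiments of the invention, each isolated oligonucleotide has a first strand and a second strand, wherein the first strands consist of the nucleotides shown in SEQ ID NO: (3N-1), respectively, and the second strands consist of the nucleotides shown in SEQ ID NO: (3N), respectively, wherein, for a given oligonucleotide, N takes the same value for the first strand and the second strand, N being any integer from 1 to 6. The expression "for a given oligonucleotide, N takes the same value for the first strand and the second strand" as used herein means that, when the corresponding nucleotides in the sequence listing are used as the first strand and the second strand, the nucleotide forming the first strand and the nucleotide forming the second strand can form a stable duplex; for example, when N = 3, SEQ ID NO: 8 is used as the first strand and SEQ ID NO: 9 as the second strand. Those skilled in the art will understand that each oligonucleotide can be formed by annealing its first strand with its second strand. In this specification, the oligonucleotides corresponding to the DNA tags (the DNA tag adapters) are named IndexN adapter, where N is any integer from 1 to 6, and their first and second strands are named IndexN adapter F and IndexN adapter R, respectively, where N is any integer from 1 to 6; their sequences are shown in Table 1 above and are not repeated here. According to embodiments of the invention, each of these oligonucleotides carries a DNA tag according to embodiments of the invention as described above, so the corresponding DNA tag can be introduced into the DNA of a sample or an equivalent thereof by a ligation reaction.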
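The numbering scheme that links each IndexN to its entries in the sequence listing (tag at SEQ ID NO: 3N-2, first strand at 3N-1, second strand at 3N) can be written out as a small helper. The function name and dictionary keys below are hypothetical.

```python
def seq_ids_for_index(n: int) -> dict:
    """Map IndexN (N = 1..6) to its SEQ ID NO values:
    tag = 3N-2, adapter first strand (F) = 3N-1, second strand (R) = 3N."""
    if not 1 <= n <= 6:
        raise ValueError("N must be an integer from 1 to 6")
    return {"tag": 3 * n - 2, "adapter_F": 3 * n - 1, "adapter_R": 3 * n}

# For N = 3 this reproduces the example in the text:
# SEQ ID NO: 8 as the first strand and SEQ ID NO: 9 as the second strand.
index3 = seq_ids_for_index(3)
```

Across N = 1 to 6 the scheme covers SEQ ID NO: 1 through 18 without gaps or overlaps.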
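Specifically, according to embodiments of the invention, the DNA tag adapters of this set contain a DNA tag of the invention at the 5' end and are preferably used as the 3' adapter of a DNA tag library, in particular an Illumina/Solexa sequencing DNA tag library. According to some embodiments of the invention, the set of DNA tag adapters comprises or consists of at least 2, at least 3, at least 4, at least 5, or all 6 of the six DNA tag adapters shown in Table 1 or of DNA tag adapters whose DNA tag sequence differs from theirs by one base. According to specific examples of the invention, these DNA tag adapters preferably comprise at least Index1 adapter F/R and Index2 adapter F/R, or Index3 adapter F/R and Index4 adapter F/R, or Index5 adapter F/R and Index6 adapter F/R of the six DNA tag adapters shown in Table 1, or any combination of two or more of these. In some specific examples of the invention, differing by one base in the DNA tag adapters described above includes the substitution, addition, or deletion of one base in the sequence of any of the six DNA tags shown in Table 1.
According to a specific example of the invention, the invention provides the use of the DNA tag adapters for constructing and/or sequencing a DNA tag library, in particular an Illumina/Solexa sequencing DNA tag library, the DNA tag adapter being used as the 3' adapter of the library.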
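According to another aspect of the invention, the invention provides a set of isolated PCR tag primers that can be used to introduce the DNA tags described above into the DNA of a sample and thereby construct a DNA tag library. According to embodiments of the invention, the isolated PCR tag primers have a first strand and a second strand, wherein the first strand consists of the nucleotides shown in SEQ ID NO: 39 and the second strand has the following nucleotide sequence: CAAGCAGAAGACGGCATACGAGATXXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT, where XXXXXX is one of the set of isolated DNA tags according to embodiments of the invention. According to embodiments of the invention, each PCR tag primer of the invention carries a DNA tag according to embodiments of the invention as described above; a PCR using the PCR tag primers introduces the PCR tag primer, and with it the corresponding DNA tag, into the DNA of the sample or an equivalent thereof and hence into the DNA tag library. Pooled sequencing of the DNA tag libraries of the individual samples then allows multiple samples to be sequenced while effectively reducing bias in data output, and the sequencing data obtained are reliable, stable, and reproducible. According to some embodiments of the invention, the PCR tag primers of the invention can be used as the 5' primer of the PCR, as the 3' primer, or as both the 5' primer and the 3' primer, so that the DNA tags can be introduced into the DNA tag library by PCR.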
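Assembling the second-strand primer from the template above amounts to substituting a tag for the XXXXXX placeholder. The sketch below assumes the tag is inserted exactly as written, in the orientation of the template; the function name and the example tag are hypothetical.

```python
# Fixed flanks of the second strand, taken from the template in the text.
PRIMER2_PREFIX = "CAAGCAGAAGACGGCATACGAGAT"
PRIMER2_SUFFIX = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"

def build_tag_primer(tag: str) -> str:
    """Return the second-strand PCR tag primer with `tag` in place of
    the XXXXXX placeholder."""
    tag = tag.upper()
    if not tag or any(base not in "ACGT" for base in tag):
        raise ValueError("tag must be a non-empty string of A, C, G, T")
    return PRIMER2_PREFIX + tag + PRIMER2_SUFFIX

# Hypothetical 6-bp tag, not one of the Table 1 sequences.
primer = build_tag_primer("ACGTAC")
```

Because the gradient tags are 6, 7, or 8 bp long, primers built this way differ in length by the same 1-bp steps as the tags.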
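Further, according to embodiments of the invention, a DNA tag library constructed using the above DNA tag adapters and PCR tag primers is also provided.
According to another aspect of the invention, the invention also provides a method for constructing a DNA tag library using the above DNA tag adapters. Specifically, according to embodiments of the invention and with reference to Figure 1, the method may comprise the following.
First, the DNA sample is fragmented to obtain DNA fragments. According to embodiments of the invention, the method of fragmenting the DNA sample is not particularly limited. According to some embodiments, fragmentation is carried out by at least one method selected from nebulization, enzymatic digestion, and sonication. According to some specific examples, the DNA sample is fragmented by sonication. According to one specific example, the DNA sample is fragmented with a Covaris shearing instrument. According to embodiments of the invention, the DNA fragments obtained may be about 500 bp long, which further improves the efficiency of library construction and of subsequent sequencing. According to embodiments of the invention, the source of the DNA sample is not particularly limited. According to some embodiments, the DNA sample may come from any common biological sample, for example a plant, an animal, or a microorganism. According to other embodiments, the DNA sample may come from at least one of Arabidopsis thaliana, rice, human, mouse, and Escherichia coli. The method according to embodiments of the invention can thus be used to construct DNA tag libraries for a variety of common model organisms. According to specific examples of the invention, the DNA sample may be a human DNA sample, more specifically a human genomic DNA sample.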
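Second, the DNA fragments are end-repaired to obtain end-repaired DNA fragments. According to embodiments of the invention, the method of end repair is not particularly limited. According to one specific example, end repair is carried out with Klenow fragment, T4 DNA polymerase, and T4 polynucleotide kinase, where the Klenow fragment has 5'→3' polymerase activity and 3'→5' exonuclease activity but lacks 5'→3' exonuclease activity.
Next, a base A is added to the 3' ends of the end-repaired DNA fragments to obtain 3' A-tailed DNA fragments. According to embodiments of the invention, the method of adding base A to the 3' ends of the end-repaired DNA fragments is not particularly limited. According to specific examples of the invention, Klenow (3'-5' exo-) may be used to add base A to the 3' ends of the end-repaired DNA fragments. According to embodiments of the invention, base A is added to the 3' ends of both oligonucleotide strands of the end-repaired DNA fragments.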
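Next, the 3' A-tailed DNA fragments are joined to a DNA tag adapter to obtain ligation products, the DNA tag adapter containing one of the set of isolated DNA tags according to embodiments of the invention. According to embodiments of the invention, the method of joining the 3' A-tailed DNA fragments to the DNA tag adapter is not particularly limited. According to some specific examples, the joining is carried out with T4 DNA ligase. According to embodiments of the invention, a DNA tag adapter is ligated to at least one of the 3' end and the 5' end of the 3' A-tailed DNA fragment. According to some specific examples, the DNA tag adapter is ligated to the 3' end of the 3' A-tailed DNA fragment. According to embodiments of the invention, the DNA tag adapter may be one selected from the isolated oligonucleotides according to embodiments of the invention described above.
The ligation products obtained are then amplified to obtain amplification products. According to embodiments of the invention, a fragment selection step is carried out on the ligation products before amplification. According to some embodiments, fragment selection may be carried out by 2% agarose gel electrophoresis. According to one specific example, the ligation products after fragment selection may be about 620 bp long. According to embodiments of the invention, the ligation products are amplified by PCR, using primers having the nucleotide sequences shown in SEQ ID NO: 39 and SEQ ID NO: 40, respectively.
Finally, the amplification products are isolated and recovered; these amplification products constitute the DNA tag library. According to embodiments of the invention, the method of isolating and recovering the amplification products is likewise not particularly limited, and those skilled in the art can choose suitable methods and equipment according to the characteristics of the amplification products, for example electrophoresis followed by recovery of PCR amplification products of a specific length. According to embodiments of the invention, the amplification products can be isolated by at least one method selected from agarose gel electrophoresis, magnetic bead purification, and purification column purification. According to specific examples of the invention, the isolated and recovered amplification products may be about 620 bp long.
With the method for constructing a DNA tag library according to embodiments of the invention, the DNA tags according to embodiments of the invention can be introduced effectively into the DNA tag library constructed for a DNA sample. Sequencing the DNA tag library then yields the sequence information of the DNA sample together with that of the DNA tag, so that the source of the DNA sample can be distinguished. The method makes full use of high-throughput sequencing platforms, meets the demands of high-throughput sequencing, saves sequencing resources and thus lowers sequencing cost, and, compared with conventional multiplex nucleic acid sequencing, effectively reduces bias in data output; the sequencing data obtained are reliable, stable, and reproducible.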
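Specifically, according to embodiments of the invention, the method of the invention for constructing a DNA tag library may comprise:
1) providing n genomic DNA samples, where the genomic DNA samples may come from any eukaryotic sample, including but not limited to human genomic DNA samples;
2) fragmenting the DNA, which may be sheared mechanically to produce DNA fragments with sticky ends, the mechanical methods including but not limited to the use of Bioruptor, Hydroshear, and Covaris;
3) end repair: filling in the sticky ends of the DNA fragments by a ligation reaction to obtain end-repaired DNA fragments;
4) A-tailing: adding one adenine base (A) to the blunt ends of the end-repaired DNA fragments by a ligation reaction to obtain 3' A-tailed DNA fragments;
5) adding the 5' adapter and the 3' adapter: using adapter 1 and the DNA tag adapter as the 5' adapter and the 3' adapter of the DNA tag library, respectively, and joining the 3' A-tailed DNA fragments to adapter 1 and the DNA tag adapter to obtain ligation products, the DNA tag adapter containing one of the set of isolated DNA tags according to embodiments of the invention;
6) amplifying the ligation products by PCR, and isolating and recovering the amplification products by agarose gel electrophoresis and gel excision, the amplification products constituting the DNA tag library;
7) pooling: when n > 1, mixing the DNA tag libraries of the individual samples together to obtain a DNA tag library mixture.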
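Further, according to a specific example of the invention, the method of the invention for constructing a DNA tag library may also comprise:
1) providing n genomic DNA samples, n being an integer with 1 ≤ n ≤ 6, preferably 2 ≤ n ≤ 6, where the genomic DNA samples may come from any eukaryotic sample, including but not limited to human genomic DNA samples;
2) fragmenting the DNA, which may be sheared mechanically to produce DNA fragments with sticky ends, the mechanical methods including but not limited to the use of Bioruptor, Hydroshear, and Covaris;
3) end repair: filling in the sticky ends of the DNA fragments by a ligation reaction to obtain end-repaired DNA fragments;
4) A-tailing: adding one adenine base (A) to the blunt ends of the end-repaired DNA fragments by a ligation reaction to obtain 3' A-tailed DNA fragments;
5) ligating the 5' adapter and the DNA tag adapter: using the DNA tag adapter as the 3' adapter of the DNA tag library, and ligating the 5' adapter and the DNA tag adapter to the 3' A-tailed DNA fragments by a ligation reaction to obtain ligation products, the DNA tag adapter containing one of the set of isolated DNA tags according to embodiments of the invention;
6) amplifying the ligation products by PCR, and isolating and recovering the amplification products by agarose gel electrophoresis and gel excision, the amplification products constituting the DNA tag library;
7) pooling: when n > 1, mixing the DNA tag libraries of the individual samples together to obtain a DNA tag library mixture.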
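Before pooling in step 7), each of the n samples needs its own tag, with n no larger than the six tags available. The bookkeeping can be sketched as follows; the function name and sample names are hypothetical, and tags are simply assigned in the order Index1 to Index6, which is an illustrative choice rather than a rule stated in the text.

```python
INDEX_NAMES = ["Index1", "Index2", "Index3", "Index4", "Index5", "Index6"]

def assign_tags(samples, tag_names=INDEX_NAMES):
    """Give each sample a distinct tag name, in order.

    Raises ValueError if there are no samples, more samples than tags,
    or duplicate sample names.
    """
    if not 1 <= len(samples) <= len(tag_names):
        raise ValueError("number of samples must be between 1 and the number of tags")
    if len(set(samples)) != len(samples):
        raise ValueError("sample names must be unique")
    return dict(zip(samples, tag_names))

plan = assign_tags(["sample_A", "sample_B"])
```

For two samples this yields the Index1 and Index2 pairing, one of the preferred combinations named in the text.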
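According to one embodiment of the invention, the 5' adapter of the Illumina/Solexa sequencing library used in the above method for constructing a DNA tag library is adapter 1, whose sequences are as follows:
5'-TACACTCTTTCCCTACACGACGCTCTTCCGATCTATCACT (SEQ ID NO: 37);
5'-GTGATAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC (SEQ ID NO: 38).
According to one embodiment of the invention, the DNA tag adapters used in the above method for constructing a DNA tag library comprise or consist of at least 2, at least 3, at least 4, at least 5, or all 6 of the six DNA tag adapters shown in Table 1 or of DNA tag adapters whose DNA tag sequence differs from theirs by one base. According to specific examples of the invention, these DNA tag adapters preferably comprise at least Index1 adapter F/R and Index2 adapter F/R, or Index3 adapter F/R and Index4 adapter F/R, or Index5 adapter F/R and Index6 adapter F/R of the six DNA tag adapters shown in Table 1, or any combination of two or more of these. In some specific examples of the invention, differing by one base in the DNA tag adapters described above includes the substitution, addition, or deletion of one base in the sequence of any of the six DNA tags shown in Table 1.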
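According to one embodiment of the invention, the PCR in the above method for constructing a DNA tag library uses the following PCR primers:
PCR Primer 1:
…ATCT (SEQ ID NO: 39), and
PCR Primer 2:
According to one embodiment of the invention, the method of the invention for constructing a DNA tag library may further comprise:
8) sequencing: sequencing the DNA tag library mixture of the individual samples using a sequencing technology, in particular Illumina/Solexa sequencing.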
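According to a specific example of the invention, a set of DNA tags of the invention of different lengths can each be included in the PCR primers used to amplify the target sequence, thereby forming the corresponding PCR tag primers. The method of the invention for constructing a DNA tag library may then comprise: first shearing the total DNA sample into DNA fragments of a certain length by mechanical or enzymatic means; ligating the DNA fragments to adapters; amplifying the ligation products containing the DNA fragments (that is, the target sequence described above) with the PCR tag primers; and finally recovering the amplification products by agarose electrophoresis and gel excision, thereby obtaining the DNA tag library formed by the amplification products.
According to a specific example of the invention, a set of DNA tags of the invention of different lengths can each be embedded in the adapters of an existing library (for example, at the end), thereby forming the corresponding DNA tag adapters. The method of the invention for constructing a DNA tag library may then comprise: first shearing the total DNA sample into DNA fragments of a certain length by mechanical or enzymatic means; end-repairing the DNA fragments and forming random sticky ends; ligating them to the DNA tag adapters to obtain ligation products; amplifying the ligation products containing the DNA fragments (that is, the target sequence described above) with specific PCR primers; and finally recovering the amplification products by agarose electrophoresis and gel excision, thereby obtaining the DNA tag library formed by the amplification products.
According to a specific example of the invention, a set of DNA tags of the invention of different lengths can each be included in the PCR primers used to amplify the target sequence, thereby forming the corresponding PCR tag primers, while a set of DNA tags of the invention of different lengths is also embedded in the adapters of an existing library (for example, at the end), thereby forming the corresponding DNA tag adapters. The method of the invention for constructing a DNA tag library may then comprise: first shearing the total DNA sample into DNA fragments of a certain length by mechanical or enzymatic means; ligating the DNA fragments to the DNA tag adapters to obtain ligation products; amplifying the ligation products containing the DNA fragments (that is, the target sequence described above) with the PCR tag primers; and finally recovering the amplification products by agarose electrophoresis and gel excision, thereby obtaining the DNA tag library formed by the amplification products.
According to embodiments of the invention, the method provided by the invention for constructing a DNA tag library, in particular an Illumina/Solexa sequencing DNA tag library, uses DNA tag adapters of the invention selected from those shown in Table 1 as the 3' adapter of the library.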
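According to yet another aspect of the invention, the invention also provides a kit for constructing a DNA tag library. According to embodiments of the invention, the kit comprises six isolated oligonucleotides, each having a first strand and a second strand, wherein the first strands consist of the nucleotides shown in SEQ ID NO: (3N-1), respectively, and the second strands consist of the nucleotides shown in SEQ ID NO: (3N), respectively, wherein, for a given oligonucleotide, N takes the same value for the first strand and the second strand, N being any integer from 1 to 6, and wherein the six isolated DNA tag adapters are provided in separate containers. With this kit, the DNA tags according to embodiments of the invention can be conveniently introduced into the DNA tag library being constructed. Those skilled in the art will understand that the kit may also contain other conventional components for constructing DNA tag libraries, which are not described further here.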
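DNA Tag Libraries and Sequencing Methods
According to yet another aspect of the invention, the invention also provides a DNA tag library constructed by the method of the invention for constructing a DNA tag library. A DNA tag library carrying the DNA tags according to embodiments of the invention can be applied effectively in high-throughput sequencing technologies such as Solexa, so that the nucleic acid sequence information obtained, for example DNA sequence information, can be assigned precisely to its sample of origin by means of the tag sequences, and the sequencing data obtained are accurate, reliable, stable, and reproducible. In addition, according to some specific examples of the invention, the DNA tag library of the invention uses a DNA tag adapter as the 3' adapter of the DNA tag library, in particular the Illumina/Solexa sequencing tag library.
According to yet another aspect of the invention, the invention also provides a method for determining the sequence information of a DNA sample. According to embodiments of the invention, the method comprises: constructing a DNA tag library by the method for constructing a DNA tag library according to embodiments of the invention; and then sequencing the DNA tag library so constructed to determine the sequence information of the DNA sample. This method yields the sequence information of the DNA sample in the DNA tag library together with that of the DNA tag, so that the source of the DNA sample can be distinguished. The inventors also found, surprisingly, that determining the sequence information of DNA samples by the method according to embodiments of the invention effectively reduces bias in data output and allows multiple DNA tag libraries to be distinguished precisely, and that the sequencing data obtained are accurate, reliable, stable, and reproducible. According to embodiments of the invention, the DNA tag library may be sequenced by any known method, the type of which is not particularly limited. According to some examples of the invention, the DNA tag library may be sequenced using Solexa sequencing. According to embodiments of the invention, suitable sequencing primers may be chosen according to the circumstances.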
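Further, the above method for determining the sequence information of a DNA sample can be applied to multiple samples. For example, according to embodiments of the invention, the invention provides a method for determining the sequence information of multiple DNA samples. According to embodiments of the invention, the method comprises the following steps: for each of the multiple samples, independently constructing a DNA tag library of that DNA sample by the method for constructing a DNA tag library according to embodiments of the invention, wherein different DNA samples use DNA tags that differ from one another and have known sequences; combining the DNA tag libraries of the multiple samples to obtain a DNA tag library mixture; sequencing the DNA tag library mixture to obtain the sequence information of the DNA samples and the sequence information of the tags; and sorting the sequence information of the DNA samples on the basis of the sequence information of the tags to determine the DNA sequence information of the multiple samples. Here, the term "multiple" means at least 2. According to embodiments of the invention, the method of sequencing the DNA tag library mixture is not particularly limited. According to some specific examples of the invention, the DNA tag library mixture may be sequenced using Solexa sequencing to obtain the sequence information of the DNA samples and of the tags. This method according to embodiments of the invention thus makes full use of high-throughput sequencing technologies such as Solexa sequencing to sequence the DNA libraries of multiple samples simultaneously, which raises the efficiency and throughput of DNA library sequencing and effectively reduces bias in data output; the sequencing data obtained are accurate, reliable, stable, and reproducible.
It should be noted that the DNA tags according to embodiments of the invention and their uses were achieved by the inventors of the present application only through arduous creative work and optimization.
The solutions of the invention are explained below with reference to examples. Those skilled in the art will understand that the following examples serve only to illustrate the invention and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated in the examples, the techniques or conditions described in the literature of the field were followed (for example, J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd edition, translated by Huang Peitang et al., Science Press), or the product instructions were followed. Reagents and instruments whose manufacturer is not indicated are conventional, commercially available products, obtainable for example from Illumina.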
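Control Example:
Using the Multiplexing Sample Preparation Oligonucleotide Kit (PE-400-1001) and following the kit's standard protocol, a DNA tag library was constructed with the 6 conventional tags in Table 2 below (PE IndexN, where N is any integer from 1 to 6) and their corresponding conventional tag adapters (PE IndexN adapter, comprising the sense sequence PE IndexN adapter F and the antisense sequence PE IndexN adapter R, where N is any integer from 1 to 6):
Table 2. Sequences of the conventional tags (PE IndexN) and conventional tag adapters (PE IndexN adapter) (all sequences in the table are shown in the 5' to 3' direction)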
Figure imgf000018_0001
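1. DNA sample testing
First, 1-2 μg of human peripheral blood genomic DNA sample was taken for sample testing. Specifically, the concentration, OD260/280 ratio, and OD260/230 ratio of the DNA sample were measured with a NanoDrop 1000, and sample integrity was checked by agarose gel electrophoresis; samples that passed were set aside for use.
The criteria for acceptable DNA sample quality were:
Sample concentration: not lower than 100 ng/μL;
Sample purity: OD260/280 ratio between 1.8 and 2.0, with no protein, polysaccharide, or RNA contamination;
Sample integrity: electrophoresis shows no degradation of the DNA sample.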
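The numeric acceptance criteria above can be expressed as a simple check. The sketch below covers only the concentration threshold, the OD260/280 window, and the degradation result; the function and parameter names are hypothetical, and the contamination criteria, which the text states qualitatively, are not modeled.

```python
def passes_qc(concentration_ng_per_ul: float,
              od260_280: float,
              degraded: bool) -> bool:
    """Apply the stated sample criteria: concentration of at least
    100 ng/uL, OD260/280 between 1.8 and 2.0 inclusive, and no
    degradation seen on the gel."""
    return (concentration_ng_per_ul >= 100
            and 1.8 <= od260_280 <= 2.0
            and not degraded)

# Hypothetical measurements for one sample.
ok = passes_qc(concentration_ng_per_ul=150.0, od260_280=1.85, degraded=False)
```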
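Then, a suitable amount of DNA sample that had passed testing was used to construct the DNA tag library by the following steps; to ensure the quality of library preparation, the total amount of sample was required to be not less than 45 μg.
2. Fragmentation
The DNA sample that had passed testing was fragmented to obtain DNA fragments. Two methods can be used to fragment the DNA sample, nebulization and Covaris shearing; either can break the sample DNA into fragments in the range of 100-800 bp with the main band at about 500 bp (if the sample is DNA that has already been sheared, this step can be skipped). In this Control Example the DNA sample was fragmented by nebulization, giving DNA fragments of 100-800 bp with the main band at 500 bp.
3. End repair
1) The end-repair reaction was prepared in a 1.5 mL centrifuge tube according to the proportions in the table below: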
Figure imgf000019_0001
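2) The end-repair reaction prepared above was incubated in a Thermomixer at 20 °C for 30 min to obtain end-repaired DNA fragments.
3) The end-repaired DNA fragments were then column-purified with the QIAquick PCR Purification Kit (Qiagen) and dissolved in 34 μL of Elution Buffer for later use.
4. Addition of base A to the 3' ends
1) The reaction for adding base A to the 3' ends was prepared in a 1.5 mL centrifuge tube according to the proportions in the table below: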
Figure imgf000019_0002
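2) The reaction for adding base A to the 3' ends prepared above was incubated in a Thermomixer at 37 °C for 30 min to obtain 3' A-tailed DNA fragments.
3) The 3' A-tailed DNA fragments were then column-purified with the QIAquick PCR Purification Kit (Qiagen) and dissolved in 12 μL of elution buffer for later use.
5. Adapter ligation
1) The adapter ligation reaction was prepared in a 1.5 mL centrifuge tube according to the proportions in the table below: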
Figure imgf000020_0001
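The conventional tag adapter mixture was obtained by combining the conventional tag adapters formed by annealing each sense sequence PE IndexN adapter F shown in Table 2 above with its corresponding antisense sequence PE IndexN adapter R.
2) The adapter ligation reaction prepared above was incubated in a Thermomixer at 20 °C for 15 min to obtain ligation products.
3) The ligation products were then column-purified with the QIAquick PCR Purification Kit (Qiagen) and dissolved in 30 μL of elution buffer for later use.
6. Fragment selection
1) The ligation products obtained above were run on a 2% agarose gel at 100 V for 120 min;
2) the gel slice at the 620 bp position was excised (the length of the fragment to excise can be calculated as n + 120 bp, where n is the insert size);
3) the excised gel slice was then recovered with the QIAquick Gel Extraction Kit (Qiagen) to obtain the target fragments, which were dissolved in 40 μL of elution buffer for later use.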
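The excision position in step 2) follows from the n + 120 bp rule given in the text. A minimal worked calculation is shown below; the constant and function names are hypothetical.

```python
# Fixed length added to the insert size, as stated in the text.
ADDED_LENGTH_BP = 120

def gel_cut_position(insert_bp: int) -> int:
    """Band position to excise, in bp: insert size plus 120 bp."""
    return insert_bp + ADDED_LENGTH_BP

# With the 500 bp main band from fragmentation, the band to cut is 620 bp.
cut = gel_cut_position(500)
```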
7. PCR
1) The PCR reaction was prepared in a 0.2 mL PCR tube according to the proportions in the table below:
Figure imgf000020_0002
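The PCR used the following PCR primers:
PCR Primer 1: …ATCT (SEQ ID NO: 39), and
PCR Primer 2:
2) The PCR reaction prepared above was then run in a thermal cycler with the following program (PE library):
98 °C, 30 s
10 cycles of: 98 °C, 10 s; 65 °C, 30 s; 72 °C, 30 s or 50 s
72 °C, 5 min
4 °C, ∞
Amplification products were thus obtained and set aside for use.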
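8. Isolation and recovery of amplification products
The amplification products were run on a 2% agarose gel at 100 V for 120 min; the gel slice at the 620 bp position was then excised (the length of the fragment to excise can be calculated as n + 120 bp, where n is the insert size) and recovered with the QIAquick Gel Extraction Kit (Qiagen) to obtain the DNA tag library, which was dissolved in 40 μL of elution buffer for later use.
Example 1:
Using the Multiplexing Sample Preparation Oligonucleotide Kit (PE-400-1001) and following the kit's standard protocol, a DNA tag library was constructed with the 6 DNA tags according to embodiments of the invention in Table 1 and their corresponding DNA tag adapters. The procedure was the same as in the Control Example, except for the following:
Step 5: the DNA tag adapter mixture (Index Adapter Oligo Mix) was used instead of the conventional tag adapter mixture (PE Index Adapter Oligo Mix), where the DNA tag adapter mixture was obtained by combining the DNA tag adapters formed by annealing each sense sequence IndexN adapter F shown in Table 1 with its antisense sequence IndexN adapter R.
Step 7: the PCR program was:
98 °C, 30 s
12 cycles of: 98 °C, 10 s; 65 °C, 30 s; 72 °C, 30 s or 50 s
72 °C, 5 min
4 °C, ∞
The DNA tag library was thus obtained and set aside for use.
Example 2: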
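The DNA tag libraries constructed in the Control Example and in Example 1 were sequenced on a HiSeq2000 sequencer, strictly following the workflow recommended for the instrument, to obtain sequencing data. The sequencing primer used was:
Sequencing Primer 1: 5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 41)
Figure 2 shows a schematic diagram of the read order of each read, and of the composition of the sequences, when a DNA tag library is sequenced on the Solexa platform. In Figure 2, Read1 denotes the sequence determined in sequencing reaction 1, and Read 1 Seq Primer denotes the sequencing primer used.
The sequencing data obtained for the DNA tag libraries constructed in the Control Example and in Example 1 were then processed, analyzed, and compared with the data processing software HiSeq Control Software; the results are shown in Figures 3 to 6. Data processing software that can be used includes but is not limited to HiSeq Control Software (HCS), Pipeline, CASAVA, SOAP, and ELAND.
Figure 3 shows a comparison of the quality values (Q20) of the first 10 sequencing cycles for the DNA tag library of the invention from Example 1 and for the DNA tag library of the Control Example, where panel A shows the quality values of the first 10 sequencing cycles for the DNA tag library of the invention and panel B shows those for the DNA tag library of the Control Example. In Figure 3, the horizontal axis is the cycle number and the vertical axis is the quality. The quality value (Q-value) reflects sequencing quality and lies between 0 and 40; within this range, a higher value indicates better quality. Q20 is the proportion of all bases whose quality value is greater than 20; it reflects the quality of the sequences obtained, and the closer it is to 1, the better the sequencing quality. As Figure 3 shows, when the 6 DNA tags of the invention with graded lengths were embedded in the adapters to construct the DNA tag library of the invention and the library was sequenced with Illumina/Solexa technology, the Q20 of the first 10 sequencing cycles stayed at 0.9 throughout. When the conventional tags of fixed 6 bp length were embedded in the adapters to construct the DNA tag library of the Control Example and the library was sequenced with Illumina/Solexa technology, a marked drop in quality appeared at the 7th cycle, where the first base of the insert is read, owing to the change in base distribution, and the Q20 of the first 10 sequencing cycles for the Control Example library was 0.8. Figure 3 therefore shows that the quality values of the first 10 sequencing cycles differ markedly between the two DNA tag libraries and that the sequencing quality of the DNA tag library of the invention is better.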
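The Q20 metric as defined above, the proportion of bases with a quality value greater than 20, can be computed per cycle from a table of per-base quality values. The sketch below assumes one list of integer quality values per read and reads of equal length; the function name and the example values are hypothetical and are not data from the Examples.

```python
def q20_per_cycle(quality_rows, threshold=20):
    """For each cycle, the fraction of reads whose base at that cycle
    has a quality value strictly greater than `threshold`.

    `quality_rows` holds one list of quality values per read; all
    reads are assumed to have the same length.
    """
    n_cycles = len(quality_rows[0])
    fractions = []
    for cycle in range(n_cycles):
        column = [row[cycle] for row in quality_rows]
        fractions.append(sum(q > threshold for q in column) / len(column))
    return fractions

# Hypothetical quality values for four reads over two cycles.
example = q20_per_cycle([[30, 10], [25, 35], [21, 20], [40, 40]])
```

Following the wording of the text, a base with a quality value of exactly 20 is not counted.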
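Figures 4 to 6 compare the signal intensity, base distribution, and error rate as a function of cycle number for the DNA tag library of the invention from Example 1 and for the tag library of the Control Example. Figure 4 shows a comparison of the signal intensity in each sequencing cycle, where panel A shows the mean signal intensity in each sequencing cycle for the DNA tag library of the invention from Example 1 and panel B shows that for the DNA tag library of the Control Example. In Figure 4, the horizontal axis (Cycle) is the cycle number and the vertical axis is the mean signal intensity (Signal Mean). Figure 5 shows a comparison of the base distribution along read position in the Solexa sequencing data, where panel A shows the base percentage distribution along read position for the DNA tag library of the invention from Example 1 and panel B shows that for the DNA tag library of the Control Example. In Figure 5, the horizontal axis is the position along the read and the vertical axis is the percentage of each base at that position; the figure shows the proportion of each base detected in each sequencing cycle. Figure 6 shows a comparison of the error rate along read position in the Solexa sequencing data, where panel A shows the error rate along read position for the DNA tag library of the invention from Example 1 and panel B shows that for the DNA tag library of the Control Example. In Figure 6, the horizontal axis is the position along the read and the vertical axis is the error rate (the proportion of sequencing errors occurring at that read position); the solid line is the error rate and the dashed line is the proportion of bases that could not be analyzed. The figure shows how the libraries differ in error rate.
Figures 4 to 6 show that, in terms of signal intensity, base distribution, and error rate as a function of cycle number, the DNA tag library of the invention from Example 1 does not differ appreciably from the tag library of the Control Example. This indicates that a DNA tag library constructed with the DNA tags of the invention with graded lengths and a DNA tag library constructed with conventional tags of fixed length do not differ appreciably overall: using the DNA tags of the invention does not affect the overall sequencing results of the DNA tag library, and when sequencing passes from the gradient tag into the insert fragment, the quality values of the bases at the transition are markedly improved. The DNA tags of the invention are therefore superior to the conventional tags used in conventional multiplex nucleic acid sequencing.
As regards the output of the HiSeq2000 sequencer, assuming a cluster density of 3 million per tile and a PF of 87% (where PF, Pass Filter, is the ratio of Clean Data to Raw Data), the DNA tag library of the invention yields an additional 83.5 M of data per HiSeq run compared with the tag library of the Control Example, and increases the usability of the data.
The above comparison of the parameters of the DNA tag library of the invention and the tag library of the Control Example shows that the DNA tags and DNA tag libraries of the invention involve a significant inventive step.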
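Industrial Applicability
The DNA tags, DNA tag adapters, PCR tag primers, DNA tag libraries and methods for preparing them, methods for determining the sequence information of a DNA sample, methods for determining the sequence information of multiple DNA samples, and kits for constructing DNA tag libraries of the invention can be applied effectively to the construction and sequencing of DNA sequencing libraries from sample DNA, effectively reduce bias in data output, and give sequencing results that are accurate, reliable, stable, and reproducible.
Although specific embodiments of the invention have been described in detail, those skilled in the art will understand that, in light of all the teachings disclosed, various modifications and substitutions may be made to those details, and all such changes fall within the scope of protection of the invention. The full scope of the invention is given by the appended claims and any equivalents thereof.
In the description of this specification, references to the terms "one embodiment", "some embodiments", "illustrative embodiment", "example", "specific example", or "some examples" mean that a particular feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, illustrative uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.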

Claims

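1. A set of isolated DNA tags, consisting of the nucleotides shown in SEQ ID NO: (3N-2), where N is any integer from 1 to 6.
2. A set of isolated oligonucleotides, the isolated oligonucleotides having a first strand and a second strand, wherein the first strands consist of the nucleotides shown in SEQ ID NO: (3N-1), respectively, and the second strands consist of the nucleotides shown in SEQ ID NO: (3N), respectively, wherein, for a given oligonucleotide, N takes the same value for the first strand and the second strand, N being any integer from 1 to 6.
3. A set of isolated PCR tag primers, the isolated PCR tag primers having a first strand and a second strand, wherein the first strand consists of the nucleotides shown in SEQ ID NO: 39 and the second strand has the following nucleotide sequence:
CTTCCGATCT,
wherein XXXXXX is one of the set of isolated DNA tags of claim 1.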
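4. A method for constructing a DNA tag library, characterized by comprising the following steps:
fragmenting a DNA sample to obtain DNA fragments;
end-repairing the DNA fragments to obtain end-repaired DNA fragments;
adding a base A to the 3' ends of the end-repaired DNA fragments to obtain 3' A-tailed DNA fragments;
joining the 3' A-tailed DNA fragments to a DNA tag adapter to obtain ligation products, wherein the DNA tag adapter contains one selected from the set of isolated DNA tags of claim 1;
amplifying the ligation products to obtain amplification products; and
isolating and recovering the amplification products, the amplification products constituting the DNA tag library.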
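5. The method of claim 4, characterized in that the fragmentation is carried out by at least one method selected from nebulization, enzymatic digestion, and sonication.
6. The method of claim 5, characterized in that the DNA sample is fragmented by sonication.
7. The method of claim 6, characterized in that the DNA sample is fragmented with a Covaris shearing instrument.
8. The method of claim 4, characterized in that the DNA fragments are about 500 bp long.
9. The method of claim 4, characterized in that the DNA sample is a human DNA sample.
10. The method of claim 9, characterized in that the DNA sample is a human genomic DNA sample.
11. The method of claim 4, characterized in that the end repair is carried out with Klenow fragment, T4 DNA polymerase, and T4 polynucleotide kinase, wherein the Klenow fragment has 5'→3' polymerase activity and 3'→5' exonuclease activity but lacks 5'→3' exonuclease activity.
12. The method of claim 3, characterized in that Klenow (3'-5' exo-) is used to add base A to the 3' ends of the end-repaired DNA fragments.
13. The method of claim 12, characterized in that base A is added to the 3' ends of both oligonucleotide strands of the end-repaired DNA fragments.
14. The method of claim 4, characterized in that the joining of the 3' A-tailed DNA fragments to the DNA tag adapter is carried out with T4 DNA ligase.
15. The method of claim 4, characterized in that a DNA tag adapter is ligated to at least one of the 3' end and the 5' end of the 3' A-tailed DNA fragments.
16. The method of claim 15, characterized in that the DNA tag adapter is ligated to the 3' end of the 3' A-tailed DNA fragments.
17. The method of claim 4, characterized in that the DNA tag adapter is one selected from the isolated oligonucleotides of claim 2.
18. The method of claim 4, characterized by further comprising, before the ligation products are amplified, a step of fragment selection on the ligation products.
19. The method of claim 18, characterized in that the fragment selection is carried out by 2% agarose gel electrophoresis.
20. The method of claim 19, characterized in that the ligation products are about 620 bp long.
21. The method of claim 4, characterized in that the ligation products are amplified by PCR, the PCR using primers having the nucleotide sequences shown in SEQ ID NO: 39 and SEQ ID NO: 40, respectively.
22. The method of claim 4, characterized in that the amplification products are isolated by at least one method selected from agarose gel electrophoresis, magnetic bead purification, and purification column purification.
23. The method of claim 4, characterized in that the amplification products are about 620 bp long.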
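24. A DNA tag library obtained by the method of any one of claims 3-23.
25. A method for determining the sequence information of a DNA sample, comprising the following steps:
constructing a DNA tag library of the DNA sample by the method of any one of claims 3-23; and
sequencing the DNA tag library to determine the sequence information of the DNA sample.
26. The method of claim 25, characterized in that the DNA tag library is sequenced using Solexa sequencing.
27. A method for determining the sequence information of multiple DNA samples, comprising the following steps:
for each of the multiple DNA samples, independently constructing a DNA tag library of that DNA sample by the method of any one of claims 3-23, wherein different DNA samples use DNA tags that differ from one another and have known sequences, and wherein "multiple" means 2 to 6;
combining the DNA tag libraries of the multiple DNA samples to obtain a DNA tag library mixture;
sequencing the DNA tag library mixture to obtain the sequence information of the DNA samples and the sequence information of the tags; and
sorting the sequence information of the DNA samples on the basis of the sequence information of the tags to determine the DNA sequence information of the multiple samples.
28. The method of claim 27, characterized in that the DNA tag library mixture is sequenced using Solexa sequencing.
29. A kit for constructing a DNA tag library, comprising:
six isolated oligonucleotides, the isolated oligonucleotides having a first strand and a second strand, wherein the first strands consist of the nucleotides shown in SEQ ID NO: (3N-1), respectively, and the second strands consist of the nucleotides shown in SEQ ID NO: (3N), respectively, wherein, for a given oligonucleotide, N takes the same value for the first strand and the second strand, N being any integer from 1 to 6,
wherein the six isolated oligonucleotides are provided in separate containers.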
PCT/CN2012/071893 2011-03-03 2012-03-02 Dna标签及其应用 WO2012116661A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201110050238.1 2011-03-03
CN201110050238.1A CN102653784B (zh) 2011-03-03 2011-03-03 用于多重核酸测序的标签及其使用方法

Publications (1)

Publication Number Publication Date
WO2012116661A1 true WO2012116661A1 (zh) 2012-09-07

Family

ID=46729514

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/071893 WO2012116661A1 (zh) 2011-03-03 2012-03-02 Dna标签及其应用

Country Status (3)

Country Link
CN (1) CN102653784B (zh)
HK (1) HK1175208A1 (zh)
WO (1) WO2012116661A1 (zh)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102978205B (zh) * 2012-11-19 2014-08-20 北京诺禾致源生物信息科技有限公司 一种应用于标记开发的高通量测序的接头及其运用方法
CN111118121A (zh) * 2013-12-05 2020-05-08 生捷科技控股公司 图案化阵列的制备
CN103902852B (zh) * 2014-03-21 2017-03-22 深圳华大基因科技有限公司 基因表达的定量方法及装置
CN105401222A (zh) * 2015-12-30 2016-03-16 安诺优达基因科技(北京)有限公司 一种构建测序用dna文库的方法
CN105483267B (zh) * 2016-01-15 2018-12-04 古博 血浆游离DNA双分子标记、标记和检测血浆cfDNA的方法及其用途
CN105506748B (zh) * 2016-01-18 2018-11-27 北京百迈客生物科技有限公司 一种dna高通量测序建库方法
CN106995836B (zh) * 2016-01-22 2020-08-25 益善生物技术股份有限公司 二代测序样品前处理的引物和方法以及试剂盒
WO2017202389A1 (zh) * 2016-05-27 2017-11-30 深圳市海普洛斯生物科技有限公司 一种适用于超微量dna测序的接头及其应用
CN106047861A (zh) * 2016-06-07 2016-10-26 苏州贝斯派生物科技有限公司 一种快速修复损伤dna的试剂、及其制备方法与应用
CN107475352A (zh) * 2016-06-08 2017-12-15 深圳华大基因股份有限公司 一种用于边合成边测序的测序仪的通用型pcr扩增融合引物
CN109996877A (zh) * 2016-12-16 2019-07-09 深圳华大基因股份有限公司 一种用于核酸样品标识的基因标签、试剂盒及其应用
CN109321984B (zh) * 2017-08-01 2022-08-23 浙江安诺优达生物科技有限公司 一种测序用dna文库
CN110468188B (zh) * 2019-08-22 2023-08-22 广州微远医疗器械有限公司 用于二代测序的标签序列集及其设计方法和应用
CN111235244A (zh) * 2019-11-28 2020-06-05 广州微远基因科技有限公司 测序内标分子及其制备方法和应用

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003052137A2 (de) * 2001-12-19 2003-06-26 Gnothis Holding Sa Evaneszenz-basierendes multiplex-sequenzierungsverfahren
WO2005054504A1 (de) * 2003-12-02 2005-06-16 Universität Zu Köln Verfahren zur multiplex-sequenzierung
US20060263789A1 (en) * 2005-05-19 2006-11-23 Robert Kincaid Unique identifiers for indicating properties associated with entities to which they are attached, and methods for using

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101921840B (zh) * 2010-06-30 2014-06-25 深圳华大基因科技有限公司 一种基于dna分子标签技术和dna不完全打断策略的pcr测序方法

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANDREW M. SMITH ET AL.: "Highly-multiplexed barcode sequencing: an efficient method for parallel analysis of pooled samples, e142", NUCLEIC ACIDS RESEARCH, vol. 28, no. 13, 11 May 2010 (2010-05-11) *
DAVID W. CRAIG ET AL.: "Identification of genetic variants using bar-coded multiplexed sequencing", NATURE METHODS, vol. 5, no. 10, 14 September 2008 (2008-09-14), pages 887 - 893 *
RICHARD CRONN ET AL.: "Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by synthesis technology, e122", NUCLEIC ACIDS RESEARCH, vol. 36, no. 19, 27 August 2008 (2008-08-27) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104357918A (zh) * 2014-11-25 2015-02-18 北京阅微基因技术有限公司 血浆游离dna文库的构建方法
CN104357918B (zh) * 2014-11-25 2016-08-17 北京阅微基因技术有限公司 血浆游离dna文库的构建方法
WO2018217625A1 (en) * 2017-05-23 2018-11-29 Bio-Rad Laboratories, Inc. Molecular barcoding
CN110678547A (zh) * 2017-05-23 2020-01-10 生物辐射实验室股份有限公司 分子条码化
US10752894B2 (en) 2017-05-23 2020-08-25 Bio-Rad Laboratories, Inc. Molecular barcoding
US11248227B2 (en) 2017-05-23 2022-02-15 Bio-Rad Laboratories, Inc. Molecular barcoding
CN110678547B (zh) * 2017-05-23 2023-10-31 生物辐射实验室股份有限公司 分子条码化
US11834655B2 (en) 2017-05-23 2023-12-05 Bio-Rad Laboratories, Inc. Molecular barcoding

Also Published As

Publication number Publication date
HK1175208A1 (zh) 2013-06-28
CN102653784B (zh) 2015-01-21
CN102653784A (zh) 2012-09-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12752955

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12752955

Country of ref document: EP

Kind code of ref document: A1